visualization - 'int' object is not iterable error in Python while running pyLDAvis
- I am running code in IPython (Python 2.7). I get an error while calling the pyLDAvis function to visualize the LDA-generated topics. Following is the error:
topics_df = pd.DataFrame([dict((y, x) for x, y in tuples) for tuples in topics])
TypeError: 'int' object is not iterable
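For context, the TypeError itself just means Python tried to iterate over an int. A minimal reproduction of the same failure mode, independent of pyLDAvis (the `tuples` values here are made up):

```python
# The failing line expects `tuples` to be an iterable of (x, y) pairs.
tuples = [(0, 'billing'), (1, 'issue')]
print(dict((y, x) for x, y in tuples))  # {'billing': 0, 'issue': 1}

# If `tuples` is an int instead, iterating it raises the same TypeError
# seen in the pyLDAvis traceback.
tuples = 5
try:
    dict((y, x) for x, y in tuples)
except TypeError as e:
    print(e)  # 'int' object is not iterable
```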
- My input data is a table with 2 columns, "complaint_id" and "complaint_txt". I am trying to run a topic model on each complaint_txt value.
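The per-id grouping step can be sketched with a toy frame (the rows are made up from the sample data at the end of the question; note that `''.join` concatenates rows with no separator, which is why the split "ne"/"ed" rows rejoin into "need"):

```python
import pandas as pd

# Toy version of the complaints table (made-up rows)
df = pd.DataFrame({
    'complaint_id': [4545, 4545, 6878, 6878],
    'complaint_txt': ['cust has billing issue ', '$480',
                      'connct issue day ne', 'ed immediate resoltn'],
})

# Combine all complaint_txt rows per id, as in the question's code
merged = df.groupby('complaint_id')['complaint_txt'].agg(lambda x: ''.join(x))
print(merged.loc[4545])  # cust has billing issue $480
print(merged.loc[6878])  # connct issue day need immediate resoltn
```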
Is the issue with the Python version or with the arguments I am passing to the function? Below is my code.
```python
from stop_words import get_stop_words
import pandas as pd
import numpy as np
from nltk import bigrams
from lib.lda import lda, visualizeLDA
from nltk.tokenize import RegexpTokenizer
from gensim import corpora, models
import gensim
import pyLDAvis.gensim

# provide path name here
mypath = " "
allcomplaints = pd.read_csv(mypath)

# combining complaints for each id
myremarks = allcomplaints.groupby(['complaint_id'])['complaint_txt'].agg(lambda x: ''.join(x)).values

# create English stop words list
en_stop = get_stop_words('en')

# including domain-specific stop words
my_stopwords = ["xx", "xxxx"]
my_stopwords = [i.decode('utf-8') for i in my_stopwords]
en_stop = en_stop + my_stopwords

# NOTE: `tokenizer` was never defined in the original snippet; a
# RegexpTokenizer over word characters is the likely intent
tokenizer = RegexpTokenizer(r'\w+')

texts = []
for doc in myremarks:
    raw = doc.lower()
    tokens = bigrams(i for i in tokenizer.tokenize(raw) if i not in en_stop and len(i) > 1)
    mergedtokens = [i[0] + " " + i[1] for i in tokens]
    stopped_tokens = [i for i in mergedtokens if i not in en_stop]
    texts.append(stopped_tokens)

dictionary = corpora.Dictionary(texts)
print dictionary

# convert tokenized documents to a document-term matrix
corpus = [dictionary.doc2bow(text) for text in texts]

# generate LDA model
ldamodel = gensim.models.ldamodel.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=1)
print(ldamodel.print_topics(num_topics=5))

# visualize LDA model
vis = pyLDAvis.gensim.prepare(ldamodel, corpus, dictionary)
pyLDAvis.display(vis)
```
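The bigram-merging step inside the loop can be sketched with plain `zip`, which for a token list yields the same adjacent pairs as `nltk.bigrams` (the tokens here are hypothetical):

```python
tokens = ['cust', 'billing', 'issue']  # hypothetical tokenized complaint

# Adjacent token pairs, as nltk.bigrams would produce for this list
pairs = zip(tokens, tokens[1:])
merged = [a + " " + b for a, b in pairs]
print(merged)  # ['cust billing', 'billing issue']
```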
Following is a sample of the data I am using to run LDA:
complaint_id | complaint_txt
------------ | --------------
4545         | cust has billing issue
4545         | $480
6878         | connct issue day ne
6878         | ed immediate resoltn
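For reference, the `doc2bow` call in the code maps each tokenized document to (token_id, count) pairs. A pure-Python sketch of that structure, mirroring what `corpora.Dictionary` builds (the bigram documents are made up):

```python
from collections import Counter

texts = [['billing issue', 'cust has'],
         ['connct issue', 'billing issue']]  # made-up bigram documents

# Assign an integer id to each distinct token, like corpora.Dictionary
token2id = {}
for doc in texts:
    for tok in doc:
        token2id.setdefault(tok, len(token2id))

# Each document becomes a bag of (token_id, count) pairs, like doc2bow
corpus = [sorted(Counter(token2id[t] for t in doc).items()) for doc in texts]
print(corpus)  # [[(0, 1), (1, 1)], [(0, 1), (2, 1)]]
```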