machine learning - Python NLTK Classifier.train(trainfeats)... ValueError: need more than 1 value to unpack
def word_feats(words):
    return dict([(word, True) for word in words])

for tweet in negtweets:
    words = re.findall(r"[\w']+|[.,!?;]", tweet)  # splits the tweet into words
    negwords = [(word_feats(words), 'neg')]  # tag the words with the feature
    negfeats.append(negwords)  # add the words to the feature list
for tweet in postweets:
    words = re.findall(r"[\w']+|[.,!?;]", tweet)
    poswords = [(word_feats(words), 'pos')]
    posfeats.append(poswords)

negcutoff = len(negfeats)*3/4  # take 3/4ths of the words
poscutoff = len(posfeats)*3/4

trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff]  # assemble the train set
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]

classifier = NaiveBayesClassifier.train(trainfeats)
print 'accuracy:', nltk.classify.util.accuracy(classifier, testfeats)
classifier.show_most_informative_features()
I am getting the following error when running this code:
File "C:\Python27\lib\nltk\classify\naivebayes.py", line 191, in train
    for featureset, label in labeled_featuresets:
ValueError: need more than 1 value to unpack
The error is coming from the classifier = NaiveBayesClassifier.train(trainfeats) line, and I'm not sure why. I have done this before, and trainfeats seems to be in the same format as it was then. A sample of the format is listed below:
[[({'me': True, 'af': True, 'this': True, 'joy': True, 'high': True, 'hookah': True, 'got': True}, 'pos')]]
What other value does trainfeats need in order to create the classifier?
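The comment by @prune is right: labeled_featuresets should be a sequence of pairs (two-element lists or tuples), holding the feature dict and the category for each data point. Instead, each element in your trainfeats is a list containing one element: a tuple of those two things. So the loop in train() tries to unpack a one-item list into featureset and label, and fails. Lose the square brackets in both feature-building loops and this part should work correctly. E.g.,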
negwords = (word_feats(words), 'neg')
negfeats.append(negwords)
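To see why the extra brackets trigger the error, the unpacking loop from the traceback can be imitated without NLTK. This is only a minimal sketch, assuming Python 3 (where the message reads "not enough values to unpack" rather than "need more than 1 value to unpack"). count_labels is a hypothetical stand-in for the loop shown in the traceback, not NLTK's actual code, and the word list is made up for illustration.

```python
def word_feats(words):
    # Same idea as the feature extractor in the question: every word maps to True.
    return dict((word, True) for word in words)

def count_labels(labeled_featuresets):
    # Hypothetical stand-in for the loop in the traceback: each element
    # must unpack into exactly two values, (featureset, label).
    counts = {}
    for featureset, label in labeled_featuresets:
        counts[label] = counts.get(label, 0) + 1
    return counts

words = ['got', 'this', 'joy']  # made-up sample tokens

# Shape from the question: each element is a one-item list wrapping the tuple.
wrapped = [[(word_feats(words), 'pos')]]
try:
    count_labels(wrapped)
    wrapped_error = None
except ValueError as exc:
    wrapped_error = str(exc)

# Shape the loop expects: each element is the (featureset, label) tuple itself.
flat = [(word_feats(words), 'pos')]
flat_counts = count_labels(flat)

print(wrapped_error)
print(flat_counts)
```

The wrapped version fails because the loop receives a list of length one where it expects a pair. The flat version unpacks cleanly.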
Two more things: consider using nltk.word_tokenize() instead of doing your own tokenization. And you should randomize the order of your training data, e.g. with random.shuffle(trainfeats), which shuffles the list in place.
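As a rough illustration of the shuffling advice, the sketch below shuffles each class before taking the 3/4 cut, so the test portion is not simply the last quarter of the input. It assumes Python 3, where the cutoff has to be an integer (len(x)*3/4 yields a float there and cannot be used as a slice index). split_train_test, its parameters, and the toy featuresets are hypothetical names introduced for this example only.

```python
import random

def split_train_test(feats, fraction=0.75, seed=0):
    # Hypothetical helper: shuffle a copy of the list, then cut it at `fraction`.
    shuffled = list(feats)
    random.Random(seed).shuffle(shuffled)  # seeded only so the run is repeatable
    cutoff = int(len(shuffled) * fraction)  # integer index, required for slicing
    return shuffled[:cutoff], shuffled[cutoff:]

# Toy labeled featuresets in the (featureset, label) shape.
negfeats = [({'bad%d' % i: True}, 'neg') for i in range(8)]
posfeats = [({'good%d' % i: True}, 'pos') for i in range(8)]

train_neg, test_neg = split_train_test(negfeats)
train_pos, test_pos = split_train_test(posfeats)

trainfeats = train_neg + train_pos
testfeats = test_neg + test_pos

# random.shuffle works in place and returns None, so do not assign its result.
random.shuffle(trainfeats)

print(len(trainfeats), len(testfeats))
```

With real data, trainfeats and testfeats built this way could be passed to NaiveBayesClassifier.train() and nltk.classify.util.accuracy() exactly as in the question.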