Error:
Visualizing the results of an LDA topic model with the gensim and pyLDAvis libraries fails with the following error:
TypeError: doc2bow expects an array of unicode tokens on input, not a single string
Source code:
import pyLDAvis
import pyLDAvis.gensim_models

vis = pyLDAvis.gensim_models.prepare(lda_model, corpus, dictionary)
pyLDAvis.enable_notebook()
pyLDAvis.display(vis)
Solution:
Dictionary expects tokenized documents as input, that is, a list of token lists rather than raw strings:
from gensim.corpora import Dictionary

data = ['have fun ',
        'drive car carefully',
        'lily and janny']
# Split each sentence into tokens before passing it to Dictionary
data = [d.split() for d in data]
dictionary = Dictionary(data)
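
If the input mixes raw strings with documents that are already tokenized, the splitting step can be wrapped in a small guard. The following is only a minimal sketch: `ensure_tokenized` is a hypothetical helper (not part of gensim or pyLDAvis), and it assumes plain whitespace splitting is good enough for the data at hand.

```python
def ensure_tokenized(docs):
    """Return docs as a list of token lists.

    Hypothetical helper: raw strings are split on whitespace,
    documents that are already sequences of tokens are copied as lists.
    """
    tokenized = []
    for doc in docs:
        if isinstance(doc, str):
            # A single string is what triggers the doc2bow TypeError
            tokenized.append(doc.split())
        else:
            tokenized.append(list(doc))
    return tokenized


data = ['have fun ', 'drive car carefully', ['lily', 'and', 'janny']]
texts = ensure_tokenized(data)
print(texts)

# With gensim installed, the tokenized documents would then feed both
# the dictionary and the bag-of-words corpus:
# dictionary = Dictionary(texts)
# corpus = [dictionary.doc2bow(text) for text in texts]
```

The corpus passed to pyLDAvis needs the same treatment as the dictionary, since calling doc2bow on an unsplit string raises the same TypeError.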