Using gensim to implement LDA and compute perplexity (gensim Perplexity Estimates in LDA Model)
Neither. The values coming out of bound()
depend on the number of topics (as well as number of words), so they’re not comparable across different num_topics (or different test corpora).
No, the opposite: a smaller bound value implies deterioration. For example, a bound of -6000 is "better" than -7000 (bigger is better).
====================================================
You can use the log_perplexity method to evaluate your LdaModel.

Small code example:
from gensim.models import LdaModel
from gensim.corpora import Dictionary

docs = [["a", "a", "b"],
        ["a", "c", "g"],
        ["c"],
        ["a", "c", "g"]]
dct = Dictionary(docs)
corpus = [dct.doc2bow(doc) for doc in docs]
c_train, c_test = corpus[:2], corpus[2:]

ldamodel = LdaModel(corpus=c_train, num_topics=2, id2word=dct)

# Per-word likelihood bound on the held-out documents.
per_word_bound = ldamodel.log_perplexity(c_test)
print(per_word_bound)
I am attempting to estimate an LDA topic model for a corpus of ~59,000 documents and ~500,000 unique tokens. I would prefer to estimate the final model in R to utilize its visualization tools for interpreting my results; however, firs