Fixing a nan coherence score for an LDA topic model

Reference: https://www.codenong.com/60246570/

Error output:

    D:\software\Anaconda\envs\LDA\lib\site-packages\gensim\topic_coherence\direct_confirmation_measure.py:204: RuntimeWarning: divide by zero encountered in double_scalars
      m_lr_i = np.log(numerator / denominator)
    D:\software\Anaconda\envs\LDA\lib\site-packages\gensim\topic_coherence\indirect_confirmation_measure.py:323: RuntimeWarning: invalid value encountered in double_scalars
      return cv1.T.dot(cv2)[0, 0] / (_magnitude(cv1) * _magnitude(cv2))
    nan

Data preparation

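    from gensim import corpora, models

    import data_dispose  # the author's own data-loading module (import path assumed)

    dataAll, data = data_dispose.loaddata()
    # content_cutted holds one space-separated, pre-segmented string per document
    train = []
    for line in dataAll.content_cutted:
        line = [word.strip() for word in line.split(' ')]
        train.append(line)
    # train: list of token lists (the tokenized texts)
    dictionary = corpora.Dictionary(train)
    # corpus: bag-of-words vectors, i.e. lists of (token_id, count) pairs
    corpus = [dictionary.doc2bow(doc) for doc in train]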

Before the fix

    def coherence(num_topics):
        lda = models.LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, passes=60, alpha=5, eta=0.01,
                              random_state=1)
        print(lda.print_topics(num_topics=num_topics, num_words=10))
        # Bug: texts receives the bag-of-words corpus instead of the tokenized documents
        ldacm = models.CoherenceModel(model=lda, texts=corpus, dictionary=dictionary, coherence='c_v')
        print(ldacm.get_coherence())
        return ldacm.get_coherence()

After the fix (the texts argument of CoherenceModel is changed from corpus to train)

    def coherence(num_topics):
        lda = models.LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, passes=60, alpha=5, eta=0.01,
                              random_state=1)
        print(lda.print_topics(num_topics=num_topics, num_words=10))
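        # Fix: pass the tokenized documents (train), not the bag-of-words corpus
        ldacm = models.CoherenceModel(model=lda, texts=train, dictionary=dictionary, coherence='c_v')
        score = ldacm.get_coherence()  # compute once and reuse the value
        print(score)
        return score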
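
A function like coherence(num_topics) is typically called for several topic counts to compare scores. The sketch below shows one way to run such a sweep while skipping any nan result, so a misconfigured run does not silently win or break the comparison. The helper name pick_num_topics and the stand-in scores are hypothetical and exist only to exercise the logic; in practice score_fn would be the coherence function above.

```python
import math


def pick_num_topics(score_fn, candidates):
    """Return (best_k, best_score), ignoring candidates whose score is nan."""
    best_k, best_score = None, -math.inf
    for k in candidates:
        score = score_fn(k)
        if math.isnan(score):
            # A nan here usually means the coherence inputs are wrong
            print(f"num_topics={k}: coherence is nan, skipping")
            continue
        if score > best_score:
            best_k, best_score = k, score
    if best_k is None:
        raise ValueError("every candidate produced nan; check the texts argument")
    return best_k, best_score


# Stand-in scorer with made-up values, used only to exercise the helper.
fake_scores = {2: 0.41, 3: float("nan"), 4: 0.47, 5: 0.44}
best_k, best_score = pick_num_topics(fake_scores.get, [2, 3, 4, 5])
print(best_k, best_score)
```

Raising when every candidate is nan is a deliberate choice: it surfaces the input problem instead of returning an arbitrary topic count.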

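Cause: the texts argument of CoherenceModel expects the original tokenized documents (lists of string tokens), not the bag-of-words corpus that was used to train LdaModel. With coherence='c_v', word co-occurrence is estimated from texts. The (token_id, count) pairs in corpus never match a dictionary token, so the occurrence counts come out as zero, the log ratio divides by zero, and the final score is nan.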
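
To catch this mistake before the coherence computation runs, the input can be checked up front. The following is a minimal sketch, not part of gensim: check_tokenized_texts is a hypothetical helper, and the two small lists only illustrate the shapes of train (token lists) and corpus (lists of (token_id, count) pairs).

```python
def check_tokenized_texts(texts):
    """Raise TypeError unless every document consists of string tokens."""
    for doc_index, doc in enumerate(texts):
        for token in doc:
            if not isinstance(token, str):
                raise TypeError(
                    f"document {doc_index} contains a non-string token {token!r}; "
                    "this looks like a bag-of-words corpus, not tokenized text"
                )
    return texts


# Illustrative data: the same two documents in both representations.
train = [["topic", "model", "coherence"], ["topic", "model"]]
corpus = [[(0, 1), (1, 1), (2, 1)], [(1, 1), (2, 1)]]

check_tokenized_texts(train)  # passes silently
try:
    check_tokenized_texts(corpus)
except TypeError as err:
    print(err)
```

Because the helper returns its argument, it can wrap the value passed to CoherenceModel directly, for example texts=check_tokenized_texts(train).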
