Please credit the source when reposting: http://blog.csdn.net/yihucha166/article/details/9046835
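Latent Dirichlet Allocation (LDA) is one of the most popular machine learning methods in industry today. This is a C++ implementation, as-lda, that uses an asymmetric prior. As the number of topics grows, its topic distribution stays more stable than that of the traditional model, and it reduces the large number of niche topics that a high topic count otherwise produces. Reference: "Rethinking LDA: Why Priors Matter". The code directory includes Chinese test data.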
Code: https://code.google.com/p/as-lda/
Asymmetric-prior Latent Dirichlet Allocation (LDA) in C++
LDA implementations usually use a symmetric Dirichlet prior. "Rethinking LDA: Why Priors Matter" shows that an asymmetric prior gives better results and a more stable topic distribution as the number of topics increases, so this project adopts that approach.
Other features:
#easy to use, easy to understa