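Introduction: One of a forum moderator's duties is to maintain the quality of posts by deleting ads, filler posts, spam, and the like. This system is a demo that automatically identifies spam posts, built to lighten the moderator's workload. It is based on the naive Bayes principle. To use it:
1. Register an account on 多么乐 and log in to the system.
2. Enter the raw training data for the system, in two classes: spam posts and non-spam posts.
3. Enter the post to be checked and view the probability, shown as a percentage, that it is spam.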
You are welcome to join the discussion and help improve this program.
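
The train-then-check workflow described above can be sketched roughly as follows. This is a minimal illustration of a naive Bayes spam score, not the demo system's actual code: the class name `NaiveBayesSpamFilter`, the labels `"spam"` and `"ham"`, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example. Chinese posts would first need a word segmenter in place of `tokenize`.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Hypothetical multinomial naive Bayes filter with add-one smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.post_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Assumption: whitespace tokens. Chinese text needs word segmentation.
        return text.lower().split()

    def train(self, text, label):
        """Step 2: record one training post under 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokenize(text))
        self.post_counts[label] += 1

    def spam_probability(self, text):
        """Step 3: return P(spam | post) as a value between 0 and 1."""
        if min(self.post_counts.values()) == 0:
            raise ValueError("train at least one post of each class first")
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_posts = sum(self.post_counts.values())
        tokens = self.tokenize(text)
        log_scores = {}
        for label in self.LABELS:
            # Log prior: fraction of training posts with this label.
            score = math.log(self.post_counts[label] / total_posts)
            total_words = sum(self.word_counts[label].values())
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            denom = total_words + len(vocab) + 1
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long posts.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


nb = NaiveBayesSpamFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches now", "spam")
nb.train("how do I install the forum plugin", "ham")
nb.train("thanks for the plugin help", "ham")
print(f"spam probability: {nb.spam_probability('buy cheap pills'):.1%}")
```

With this toy training data, a post such as "buy cheap pills" scores well above 50%, while "plugin help" scores below it. A real deployment would need far more training posts and a proper tokenizer.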
Microsoft Research Asia, Natural Language Computing Group
- Dependence language model for information retrieval
Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu and Guihong Cao, "Dependence language model for information retrieval", In SIGIR-2004. Sheffield, UK, July 25-29, 2004.
- A new approach for English-Chinese named entity alignment
Dong-Hui Feng, Ya-Juan Lv, Ming Zhou, "A New Approach for English-Chinese Named Entity Alignment", 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, Jul. 2004.
- Automatic collocation translation acquisition using monolingual corpora
Ya-Juan Lv, Ming Zhou, "Collocation Translation Acquisition Using Monolingual Corpora", 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, Jul. 2004.
- Adaptive Chinese word segmentation
Jianfeng Gao, Andi Wu, Mu Li, Chang-Ning Huang, Hongqiao Li, Xinsong Xia and Haowei Qin, "Adaptive Chinese word segmentation", 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, Jul. 2004.
- Using SVM to identify Chinese new words
Hongqiao Li, Chang-Ning Huang, Jianfeng Gao and Xiaozhong Fan, "The use of SVM for Chinese new word identification", In IJCNLP-04. Sanya City, Hainan Island, China, March 22-24, 2004.
- An empirical study of capturing long-distance dependency in language modeling
Jianfeng Gao and Hisami Suzuki, "Capturing long distance dependency for language modeling: an empirical study", In IJCNLP-04. Sanya City, Hainan Island, China, March 22-24, 2004.
- Word Translation Disambiguation Using Bilingual Bootstrapping
Hang Li and Cong Li, "Word Translation Disambiguation Using Bilingual Bootstrapping", Computational Linguistics 30(1), 1-22, 2004.
- Text Classification Using Stochastic Keyword Generation
Cong Li, Ji-Rong Wen, and Hang Li, "Text Classification Using Stochastic Keyword Generation", Proc. of ICML'03, 464-471.
- Uncertainty Reduction in Collaborative Bootstrapping: Measure and Algorithm
Yunbo Cao, Hang Li, and Li Lian, "Uncertainty Reduction in Collaborative Bootstrapping: Measure and Algorithm", Proc. of ACL'03, 327-334.
- Improved source-channel models applied to Chinese word segmentation
Jianfeng Gao, Mu Li and Chang-Ning Huang, "Improved Source-Channel Models for Chinese Word Segmentation", 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, July 7-12, 2003.
- Topic Analysis Using a Finite Mixture Model
Hang Li and Kenji Yamanishi, "Topic Analysis Using a Finite Mixture Model", Information Processing & Management, 39(4), 521-541, 2003.
- Using Bilingual Web Data to Mine and Rank Translations
Hang Li, Yunbo Cao, and Cong Li, "Using Bilingual Web Data to Mine and Rank Translations", IEEE Intelligent Systems, Vol. 18(4), 54-59, 2003.
Posted on 2005-03-14 23:44:00