MALLET is an open-source library for statistical natural language processing, developed by researchers at UMass.
http://mallet.cs.umass.edu/index.php
1.topic modeling
LDA
2.sequence tagging
hidden Markov models (HMMs), linear-chain conditional random fields (CRFs), GRMM, etc.
3.document classification
Naïve Bayes is supported by default.
The following algorithms are also available (class names):
AdaBoostM2Trainer, AdaBoostTrainer, BaggingTrainer, BalancedWinnowTrainer,
C45Trainer, ClassifierEnsembleTrainer, ConfidencePredictingClassifierTrainer,
DecisionTreeTrainer, FeatureSelectingClassifierTrainer, MaxEntGERangeTrainer,
MaxEntGETrainer, MaxEntPRTrainer, MaxEntTrainer, MCMaxEntTrainer,
NaiveBayesEMTrainer, NaiveBayesTrainer, WinnowTrainer
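
To give a rough idea of what a Naïve Bayes document classifier computes, here is a minimal standalone sketch in Python. It does not use MALLET's API: the function names train_naive_bayes and classify, the toy data, and the choice of add-one (Laplace) smoothing are all assumptions made for illustration, and MALLET's own NaiveBayesTrainer may differ in its details.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, alpha=1.0):
    """Fit a multinomial Naive Bayes model (hypothetical helper, not MALLET).

    docs: list of (tokens, label) pairs.
    alpha: additive smoothing constant (assumed here; 1.0 is Laplace smoothing).
    """
    label_counts = Counter()            # number of training documents per label
    word_counts = defaultdict(Counter)  # per-label word frequencies
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def classify(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n / model["n_docs"])  # log prior P(label)
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # skip words never seen during training
            # smoothed log likelihood P(tok | label)
            score += math.log((counts[tok] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data for illustration only.
training = [
    ("cheap pills buy now".split(), "spam"),
    ("buy cheap watches".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
model = train_naive_bayes(training)
print(classify(model, "buy cheap pills".split()))         # spam
print(classify(model, "monday project meeting".split()))  # ham
```

The sketch works in log space so that multiplying many small probabilities does not underflow, and the smoothing keeps a word that is absent from one class from driving that class's probability to zero.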