Open-Source NLP Software

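I. Word Segmentation

1. ICTCLAS

http://www.ictclas.org/ Provides word segmentation and part-of-speech tagging. Written in C++ with a Java interface; well known in the industry.

2. Ansj Chinese Word Segmentation

http://www.ansj.org/ Word segmentation, part-of-speech tagging, and more. Written in Java; a reimplementation of ICTCLAS.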

 

The following three are Chinese word segmentation modules for Lucene:

3. IKAnalyzer

http://code.google.com/p/ik-analyzer/ Written in Java.

4. paoding

http://code.google.com/p/paoding/ Written in Java.

5. imdict-chinese-analyzer

http://code.google.com/p/imdict-chinese-analyzer/ Written in Java; uses an HHMM segmentation model.

6. Stanford Word Segmenter

http://nlp.stanford.edu/software/segmenter.shtml
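
To make the segmentation task concrete, here is a minimal sketch of dictionary-based forward maximum matching, one of the simplest ways to split unspaced Chinese text into words. It is only an illustration of the input and output of a segmenter: the function name, the `max_len` parameter, and the toy dictionary are all made up for this example, and none of the tools listed above is claimed to work this way (imdict-chinese-analyzer, for instance, is listed as using an HHMM model).

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    found in vocab; if nothing matches, emit a single character.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until there is a hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Toy dictionary, invented for this example only.
TOY_VOCAB = {"自然", "语言", "自然语言", "处理", "开源", "软件"}

if __name__ == "__main__":
    print(forward_max_match("自然语言处理开源软件", TOY_VOCAB))
```

Greedy matching fails on ambiguous strings, since it never reconsiders an earlier choice. That limitation is one reason statistical segmenters exist.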

 

II. Part-of-Speech Tagging

1. Stanford POS Tagger

http://nlp.stanford.edu/software/tagger.shtml

2. TreeTagger

http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

3. TnT

http://www.coli.uni-saarland.de/~thorsten/tnt/

4. ICTCLAS (supports Chinese part-of-speech tagging)

 

III. Syntactic Parsing

Stanford Parser http://nlp.stanford.edu/software/lex-parser.shtml

Berkeley Parser http://nlp.cs.berkeley.edu/Main.html#Parsing

Charniak Parser http://www.cs.brown.edu/~ec/

 

Dependency Parsing

Stanford Parser http://nlp.stanford.edu/software/lex-parser.shtml

MSTParser http://www.ryanmcd.com/MSTParser/MSTParser.html

MaltParser http://www.maltparser.org/

IV. Named Entity Recognition

Stanford NER http://nlp.stanford.edu/software/CRF-NER.shtml

V. Semantic Role Labeling

Illinois Semantic Role Labeler (SRL) http://cogcomp.cs.illinois.edu/page/software_view/SRL

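VI. Integrated Toolkits

1. LTP http://ir.hit.edu.cn/ltp/

The Language Technology Platform (LTP) from Harbin Institute of Technology. LTP defines an XML-based representation for language processing results and, on top of it, provides a full set of rich and efficient bottom-up Chinese language processing modules (six core Chinese processing technologies covering lexical, syntactic, and semantic analysis), along with application programming interfaces based on dynamic link libraries (DLLs) and visualization tools. It can also be used as a web service.

Modules include word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic role labeling. Written in C++.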
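
LTP is described here as storing its results in an XML-based representation. The sketch below shows what reading such a file might look like with Python's standard `xml.etree.ElementTree` module. The XML fragment is hypothetical: the element names (`doc`, `sent`, `word`) and attribute names (`cont`, `pos`, `ne`, `parent`, `relate`) are assumptions made for illustration and should be checked against the LTP documentation before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical result file: element and attribute names are assumptions,
# not a confirmed LTP schema.
SAMPLE = """<doc>
  <sent id="0">
    <word id="0" cont="我" pos="r" parent="1" relate="SBV"/>
    <word id="1" cont="爱" pos="v" parent="-1" relate="HED"/>
    <word id="2" cont="北京" pos="ns" ne="S-Ns" parent="1" relate="VOB"/>
  </sent>
</doc>"""


def read_words(xml_text):
    """Flatten every <word> element into a dict of its annotations."""
    root = ET.fromstring(xml_text)
    rows = []
    for sent in root.iter("sent"):
        for word in sent.iter("word"):
            rows.append({
                "form": word.get("cont"),
                "pos": word.get("pos"),
                # Treat a missing entity label as "O" (outside any entity).
                "ne": word.get("ne", "O"),
                "head": int(word.get("parent")),
                "relation": word.get("relate"),
            })
    return rows


if __name__ == "__main__":
    for row in read_words(SAMPLE):
        print(row)
```

Keeping every layer of annotation as attributes on one word element means a later module can add its output without disturbing what earlier modules wrote.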

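2. FudanNLP http://code.google.com/p/fudannlp/

Written in Java.

Information retrieval: text classification, news clustering

Chinese processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, keyword extraction, dependency parsing, temporal expression recognition

Structured learning: online learning, hierarchical classification, clustering, exact inference

3. Stanford CoreNLP http://nlp.stanford.edu/software/corenlp.shtml

Includes part-of-speech tagging, named entity recognition, syntactic parsing, and coreference resolution.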

4. ClearNLP https://code.google.com/p/clearnlp/

This project provides several NLP tools such as a dependency parser, a semantic role labeler, a penn-to-dependency converter, a prop-to-dependency converter, and a morphological analyzer.

All tools are written in Java and developed by the Computational Language and EducAtion Research (CLEAR) group at the University of Colorado at Boulder.

 

cleartk    http://code.google.com/p/cleartk/

 

  

