A Part-Of-Speech Tagger (POS Tagger)
is a piece of software that reads text in some language and assigns a
part of speech, such as noun, verb, or adjective, to each word (and
other token). Computational applications generally use more
fine-grained POS tags such as 'noun-plural'. This software is a
Java implementation of the log-linear part-of-speech taggers
described in the papers listed below (if citing just one paper, cite
the 2003 one).
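To make the input and output of tagging concrete, here is a minimal sketch in Java. It is not the log-linear tagger described here: it is a toy dictionary lookup, and the `ToyTagger` class, the hand-made `LEXICON`, the Penn Treebank-style tag names, and the fallback to `NN` for unknown words are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ToyTagger {
    // Hypothetical hand-made lexicon; a real tagger learns its model from annotated text.
    static final Map<String, String> LEXICON = Map.of(
        "the", "DT",      // determiner
        "dogs", "NNS",    // fine-grained tag: noun, plural
        "dog", "NN",      // noun, singular
        "bark", "VBP",    // verb, non-3rd-person present
        "loudly", "RB",   // adverb
        ".", "."          // punctuation tokens get a tag too
    );

    // Assigns one tag to every whitespace-separated token (words and punctuation alike).
    static List<String> tag(String sentence) {
        List<String> tagged = new ArrayList<>();
        for (String token : sentence.trim().split("\\s+")) {
            // Assumption for this sketch: unknown tokens default to NN.
            String pos = LEXICON.getOrDefault(token.toLowerCase(), "NN");
            tagged.add(token + "/" + pos);
        }
        return tagged;
    }

    public static void main(String[] args) {
        // prints: The/DT dogs/NNS bark/VBP loudly/RB ./.
        System.out.println(String.join(" ", tag("The dogs bark loudly .")));
    }
}
```

A lookup table like this cannot resolve ambiguous words (for example "bark" as a noun versus a verb); a statistical tagger chooses each tag using the surrounding context instead.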
Kristina Toutanova and Christopher D. Manning. 2000. Enriching
the Knowledge Sources Used in a Maximum Entropy Part-of-Speech
Tagger. In Proceedings of the Joint SIGDAT
Conference on Empirical Methods in Natural Language Processing and
Very Large Corpora (EMNLP/VLC-20