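  • The Beauty of Mathematics (《数学之美》), Wu Jun
  • Statistical Natural Language Processing, 2nd ed. (《统计自然语言处理(第2版)》), Zong Chengqing; the blue-cover edition
  • Statistical Learning Methods (《统计学习方法》), Li Hang
  • A Concise Course in Natural Language Processing (《自然语言处理简明教程》), Feng Zhiwei
  • Speech and Language Processing (Chinese edition: 《自然语言处理综论》), Daniel Jurafsky
  • Formal Models of Natural Language Processing (《自然语言处理的形式模型》), Feng Zhiwei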






  • Journal of Machine Learning Research
  • Computational Linguistics
  • Machine Learning
  • AAAI
  • Artificial Intelligence
  • Journal of Artificial Intelligence Research


  1. pattern - simpler to get started than NLTK
  2. chardet - character encoding detection
  3. pyenchant - easy access to dictionaries
  4. scikit-learn - has support for text classification
  5. unidecode - because ascii is much easier to deal with
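
To make item 4 a bit more concrete, here is a minimal sketch of text classification with scikit-learn. It assumes a bag-of-words representation (`CountVectorizer`) fed into a multinomial Naive Bayes classifier (`MultinomialNB`). The four training sentences, the two labels, and the helper name `train_classifier` are invented for illustration and are not from the original list; a real task would need far more data and a held-out evaluation set.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented purely for illustration.
train_texts = [
    "the match ended with a late goal",
    "the team won the cup final",
    "the stock market fell sharply",
    "investors sold shares and bonds",
]
train_labels = ["sports", "sports", "finance", "finance"]


def train_classifier(texts, labels):
    """Fit a bag-of-words + multinomial Naive Bayes pipeline (hypothetical helper)."""
    # CountVectorizer turns each text into word counts;
    # MultinomialNB learns per-class word probabilities with add-one smoothing.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(texts, labels)
    return model


model = train_classifier(train_texts, train_labels)

# Classify two unseen sentences; unknown words are simply ignored.
predictions = model.predict(["the team scored a goal", "shares fell on the market"])
print(list(predictions))
```

Swapping `CountVectorizer` for `TfidfVectorizer`, or `MultinomialNB` for a linear model such as `LogisticRegression`, only changes the arguments to `make_pipeline`, which is one reason the pipeline form is convenient for quick experiments.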


  • Parsing (syntactic structure analysis; heavy on linguistics, so it can be rather dry)
  1. Klein & Manning: "Accurate Unlexicalized Parsing"
  2. Klein & Manning: "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency" (a groundbreaking use of unsupervised learning to build a parser)
  3. Nivre "Deterministic Dependency Parsing of English Text" (shows that deterministic parsing actually works quite well)
  4. McDonald et al. "Non-Projective Dependency Parsing using Spanning-Tree Algorithms" (the other main method of dependency parsing, MST parsing)
  • Machine Translation (skip this if you do not work on machine translation, although translation models are also used in other areas)
  1. Knight "A statistical MT tutorial workbook" (easy to understand, use instead of the original Brown paper)
  2. Och "The Alignment-Template Approach to Statistical Machine Translation" (foundations of phrase based systems)
  3. Wu "Inversion Transduction Grammars and the Bilingual Parsing of Parallel Corpora" (arguably the first realistic method for biparsing, which is used in many systems)
  4. Chiang "Hierarchical Phrase-Based Translation" (significantly improves accuracy by allowing for gappy phrases)
  • Language Modeling
  1. Goodman "A bit of progress in language modeling" (a survey that describes just about everything related to n-gram language models, including smoothing and clustering)
  2. Teh "A Bayesian interpretation of Interpolated Kneser-Ney" (shows how to get state-of-the-art accuracy in a Bayesian framework, opening the path for other applications)
  • Machine Learning for NLP
  1. Sutton & McCallum "An introduction to conditional random fields for relational learning" (CRFs are extremely useful in NLP, and many ready-made tools implement them; this is a fairly simple paper that introduces CRFs, though it is still quite mathematical)
  2. Knight "Bayesian Inference with Tears" (explains the general idea of Bayesian techniques quite well)
  3. Berg-Kirkpatrick et al. "Painless Unsupervised Learning with Features" (this is from this year and thus a bit of a gamble, but this has the potential to bring the power of discriminative methods to unsupervised learning)
  • Information Extraction
  1. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora. COLING 1992. (The very first paper for all the bootstrapping methods in NLP. It is a hypothetical work in the sense that it doesn't give experimental results, but it influenced its followers a lot.)
  2. Collins and Singer. Unsupervised Models for Named Entity Classification. EMNLP 1999. (It applies several variants of co-training-like IE methods to the NER task and explains the motivation for doing so. Students can learn from this work the logic of writing a good research paper in NLP.)
  • Computational Semantics
  1. Gildea and Jurafsky. Automatic Labeling of Semantic Roles. Computational Linguistics 2002. (It started the trend toward semantic role labeling in NLP, followed by several CoNLL shared tasks dedicated to SRL. It shows how linguistics and engineering can collaborate with each other. A shorter version appeared at ACL 2000.)
  2. Pantel and Lin. Discovering Word Senses from Text. KDD 2002. (Supervised WSD was explored a lot in the early 2000s thanks to the Senseval workshop, but few systems actually benefit from WSD because manually crafted sense mappings are hard to obtain. These days we see a lot of evidence that unsupervised clustering improves NLP tasks such as NER, parsing, SRL, etc,



