1 The Evolution of RNNs
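Language modeling is the task of predicting the next word: given a sequence of
words, compute the probability distribution of the next word.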
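One common way to write this distribution is sketched below. The notation is an assumption rather than something taken from these notes: x^{(1)}, ..., x^{(t)} are the words seen so far, and the next word x^{(t+1)} can be any word in a fixed vocabulary V.

```latex
P\left(x^{(t+1)} \mid x^{(t)}, \dots, x^{(1)}\right),
\qquad x^{(t+1)} \in V
```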
1.1 n-gram language models
the students opened their ______
• unigrams: “the”, “students”, “opened”, “their”
• bigrams: “the students”, “students opened”, “opened their”
• trigrams: “the students opened”, “students opened their”
• 4-grams: “the students opened their”
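The list above can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `ngrams` and the whitespace tokenization are illustrative assumptions, not part of the notes.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    # If n is larger than the number of tokens, the range is empty
    # and the function returns an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Whitespace tokenization is an assumption made for this example.
tokens = "the students opened their".split()

for n in range(1, 5):
    # Print the unigrams, bigrams, trigrams and 4-grams in turn.
    print(n, ngrams(tokens, n))
```

A sentence of k tokens yields k - n + 1 n-grams, so this four-word phrase gives four unigrams, three bigrams, two trigrams and one 4-gram.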
Idea: collect statistics about the different n-grams and use them to predict the next word.
Example:
Suppose we are learning a 4-gram language model.
as the proctor started the clock, the students opened their _____
In the corpus:
• "students opened their" occurred 1000 times
• "students opened their books" occurred 400 times
• P(books | students opened their) = 0.4
• "students opened their exams" occurred 100 times
• P(exams | students opened their) = 0.1
Should we have discarded the “proctor” context?
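The arithmetic in this example can be sketched in Python as a ratio of counts. The counts are the ones quoted above. The function name `next_word_prob`, the `counts` table, and the choice to return `None` for an unseen context are assumptions made for illustration. The sketch keeps only the last n - 1 = 3 words of the context, which is exactly why “proctor” plays no role in the estimate.

```python
from collections import Counter

# Counts quoted in the example; in practice they would be gathered
# by scanning a corpus.
counts = Counter({
    ("students", "opened", "their"): 1000,
    ("students", "opened", "their", "books"): 400,
    ("students", "opened", "their", "exams"): 100,
})


def next_word_prob(context, word, n=4):
    """Estimate P(word | context) as count(history + word) / count(history).

    Only the last n - 1 words of the context are used as the history.
    """
    history = tuple(context[-(n - 1):])
    denom = counts[history]  # Counter returns 0 for unseen keys
    if denom == 0:
        return None  # history never seen: the ratio is undefined (assumed handling)
    return counts[history + (word,)] / denom


# Punctuation is dropped and the text split on whitespace (an assumption).
context = "as the proctor started the clock the students opened their".split()

print(next_word_prob(context, "books"))  # 400 / 1000
print(next_word_prob(context, "exams"))  # 100 / 1000
```

A word that never follows the history in the corpus gets probability 0 under this estimate, which is the sparsity issue raised next.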
Problems:
- Sparsity
“students opened their ” occurs 0 times -----> add a small