Source: Columbia University's Natural Language Processing course on Coursera
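First-order Markov process
$$P(X_1=x_1, X_2=x_2, \ldots, X_n=x_n) = P(X_1=x_1)\prod_{i=2}^{n} P(X_i=x_i \mid X_1=x_1, \ldots, X_{i-1}=x_{i-1}) = P(X_1=x_1)\prod_{i=2}^{n} P(X_i=x_i \mid X_{i-1}=x_{i-1})$$
The first equality is the chain rule and is exact; the second is the first-order Markov assumption, under which each variable depends only on its immediate predecessor.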
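Second-order Markov process
$$P(X_1=x_1, X_2=x_2, \ldots, X_n=x_n) = P(X_1=x_1)\,P(X_2=x_2 \mid X_1=x_1)\prod_{i=3}^{n} P(X_i=x_i \mid X_{i-2}=x_{i-2}, X_{i-1}=x_{i-1}) = \prod_{i=1}^{n} P(X_i=x_i \mid X_{i-2}=x_{i-2}, X_{i-1}=x_{i-1})$$
Here we set $x_{-1} = x_0 = *$, where $*$ is a special start symbol, so that the first two factors fit the same form as the rest of the product.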
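The second-order factorization can be sketched in a few lines of Python. This is only an illustration: the function name `sequence_log_prob`, the callback `cond_prob`, and the toy uniform distribution are assumptions made for the example and do not come from the course. Log-probabilities are summed rather than multiplying raw probabilities, which avoids underflow on long sequences.

```python
import math

START = "*"  # start symbol, playing the role of x_{-1} = x_0 = *


def sequence_log_prob(seq, cond_prob):
    """Log-probability of seq under a second-order Markov model.

    cond_prob(x, u, v) is assumed to return
    P(X_i = x | X_{i-2} = u, X_{i-1} = v).
    """
    padded = [START, START] + list(seq)
    total = 0.0
    for i in range(2, len(padded)):
        # One factor of the product per position, conditioned on the
        # two preceding symbols.
        total += math.log(cond_prob(padded[i], padded[i - 2], padded[i - 1]))
    return total


def uniform(x, u, v):
    # Hypothetical toy distribution: 4 equally likely symbols,
    # regardless of context.
    return 0.25


lp = sequence_log_prob(["a", "b", "c"], uniform)
print(math.exp(lp))  # product of three factors of 0.25
```

Padding with two start symbols is what lets a single loop cover every position, including the first two.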
Trigram language model
The Trigram Estimation Problem
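One common answer to the estimation problem is the maximum-likelihood estimate q(w | u, v) = Count(u, v, w) / Count(u, v). The sketch below assumes that estimator; the `STOP` end-of-sentence symbol, the function name `train_trigram_mle`, and the three-sentence corpus are made up for illustration.

```python
from collections import Counter

START, STOP = "*", "STOP"  # assumed boundary symbols


def train_trigram_mle(sentences):
    """Return q(w, u, v) = Count(u, v, w) / Count(u, v) from raw counts."""
    trigram_counts = Counter()
    context_counts = Counter()
    for sent in sentences:
        tokens = [START, START] + list(sent) + [STOP]
        for i in range(2, len(tokens)):
            u, v, w = tokens[i - 2], tokens[i - 1], tokens[i]
            trigram_counts[(u, v, w)] += 1
            context_counts[(u, v)] += 1

    def q(w, u, v):
        # An unseen context has no estimate; return 0.0 by convention here.
        if context_counts[(u, v)] == 0:
            return 0.0
        return trigram_counts[(u, v, w)] / context_counts[(u, v)]

    return q


# Toy corpus, invented for this example.
corpus = [
    ["the", "dog", "barks"],
    ["the", "dog", "runs"],
    ["the", "cat", "runs"],
]
q = train_trigram_mle(corpus)
print(q("dog", START, "the"))     # 2 of the 3 sentences continue "* the" with "dog"
print(q("sleeps", "the", "dog"))  # never observed, so the estimate is 0.0
```

The second call shows the weakness of the raw estimate: any trigram absent from the training data gets probability zero.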
Perplexity
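Perplexity measures the effective “branching factor” of a model: a model that assigns uniform probability 1/N to each of N possible words has perplexity exactly N.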
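As a rough check of the branching-factor reading, the sketch below assumes the usual definition: perplexity = 2^(-l), where l is the total log2-probability of the test sentences divided by the total number of tokens M. Counting one end-of-sentence token per sentence, and the names `perplexity` and `uniform_log2_prob`, are assumptions of this example.

```python
import math


def perplexity(sentences, sentence_log2_prob):
    """Perplexity = 2 ** (-l), with l the average log2-probability per token.

    Each sentence is assumed to contribute len(sentence) + 1 tokens,
    the extra one being an end-of-sentence symbol.
    """
    M = sum(len(s) + 1 for s in sentences)
    l = sum(sentence_log2_prob(s) for s in sentences) / M
    return 2 ** (-l)


N = 8  # hypothetical vocabulary size, end-of-sentence symbol included


def uniform_log2_prob(sentence):
    # Every token, including the end-of-sentence symbol, gets probability 1/N.
    return (len(sentence) + 1) * math.log2(1.0 / N)


pp = perplexity([["a", "b"], ["c"]], uniform_log2_prob)
print(pp)  # equals N for the uniform model
```

A model that concentrates probability on the words that actually occur would score below N on the same data.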
Bias-variance tradeoff
Why
- Unigram and bigram estimates converge quickly to their true underlying values.
- Trigram estimates have low bias, but need a large dataset to be accurate and to avoid zero counts (“ZERO”).
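The sparsity behind the “ZERO” problem can be seen even on a tiny example. The sketch below counts the fraction of test n-grams that never occur in training; the helper names and both toy corpora are invented for illustration, and no start or stop padding is used, to keep it short.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unseen_fraction(train, test, n):
    """Fraction of test n-grams that never appear in the training data."""
    seen = {g for sent in train for g in ngrams(sent, n)}
    test_grams = [g for sent in test for g in ngrams(sent, n)]
    return sum(g not in seen for g in test_grams) / len(test_grams)


# Toy data, invented for this example.
train = [["the", "dog", "barks"], ["the", "cat", "runs"]]
test = [["the", "dog", "runs"]]

for n in (1, 2, 3):
    print(n, unseen_fraction(train, test, n))
```

Every word of the test sentence was seen in training, yet half of its bigrams and its only trigram were not, so an unsmoothed trigram estimate would assign the sentence probability zero.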