2.5 所有路径的总得分
上节中,我们介绍了如何计算一个路径的标签得分 e S i e^{S_i} eSi,那么,还有一个问题需要解决,即怎么计算所有路径的总得分: P t o t a l = P 1 + P 2 + … + P N = e S 1 + e S 2 + … + e S N P_{total} = P_1 + P_2 + … + P_N = e^{S_1} + e^{S_2} + … + e^{S_N} Ptotal=P1+P2+…+PN=eS1+eS2+…+eSN
计算总得分最简单的方法就是:枚举所有可能的路径,然后将每条路径的得分加起来,这样虽然能实现目标,但是该方式是非常低效的,训练可能需要很久很久…
在学习下边的内容前,强烈建议你拿出草稿纸和笔,跟着示例步骤,一步一步往下走,用这种方式肯定能帮助你更好地理解算法细节,从而,你可以使用你喜欢的编程语言进行实现。
Step 1:回顾CRF损失函数
在2.3节(上一片博客)中,CRF损失函数定义如下:
L
o
s
s
F
u
n
c
t
i
o
n
=
P
R
e
a
l
P
a
t
h
P
1
+
P
2
+
.
.
.
+
P
N
LossFunction=\frac{P_{RealPath}}{P_1+P_2+...+P_N}
LossFunction=P1+P2+...+PNPRealPath
为了便于计算,我们将该损失函数改写为log损失函数,如下:
L
o
g
L
o
s
s
F
u
n
c
t
i
o
n
=
l
o
g
P
R
e
a
l
P
a
t
h
P
1
+
P
2
+
.
.
.
+
P
N
LogLossFunction=log\frac{P_{RealPath}}{P_1+P_2+...+P_N}
LogLossFunction=logP1+P2+...+PNPRealPath
因为在训练模型时,我们通常是最小化损失函数,因此我们在上式乘以-1:
L
o
g
L
o
s
s
F
u
n
c
t
i
o
n
LogLossFunction
LogLossFunction
=
−
l
o
g
P
R
e
a
l
P
a
t
h
P
1
+
P
2
+
.
.
.
+
P
N
=-log\frac{P_{RealPath}}{P_1+P_2+...+P_N}
=−logP1+P2+...+PNPRealPath
=
−
l
o
g
e
S
R
e
a
l
P
a
t
h
e
S
1
+
e
S
2
+
.
.
.
+
e
S
N
=-log\frac{e^{S_{RealPath}}}{e^{S_1}+e^{S_2}+...+e^{S_N}}
=−logeS1+eS2+...+eSNeSRealPath
=
−
(
l
o
g
(
e
S
R
e
a
l
P
a
t
h
)
−
l
o
g
(
e
S
1
+
e
S
2
+
.
.
.
+
e
S
N
)
=-(log(e^{S_{RealPath}})-log(e^{S_1}+e^{S_2}+...+e^{S_N})
=−(log(eSRealPath)−log(eS1+eS2+...+eSN)
=
−
(
S
R
e
a
l
P
a
t
h
−
l
o
g
(
e
S
1
+
e
S
2
+
.
.
.
+
e
S
N
)
=-(S_{RealPath}-log(e^{S_1}+e^{S_2}+...+e^{S_N})
=−(SRealPath−log(eS1+eS2+...+eSN)
=
−
(
∑
i
=
1
N
x
i
y
i
+
∑
i
=
1
N
−
1
t
y
i
y
i
+
1
−
l
o
g
(
e
S
1
+
e
S
2
+
.
.
.
+
e
S
N
)
=-(\sum_{i=1}^{N}x_{iy_i}+\sum_{i=1}^{N-1}t_{y_iy_{i+1}}-log(e^{S_1}+e^{S_2}+...+e^{S_N})
=−(∑i=1Nxiyi+∑i=1N−1tyiyi+1−log(eS1+eS2+...+eSN)
在之前的章节中,我们已经知道了如何计算真实路径得分,现在我们需要找到一种有效的方法来计算 l o g ( e S 1 + e S 2 + . . . + e S N ) log(e^{S_1}+e^{S_2}+...+e^{S_N}) log(eS1+eS2+...+eSN)。
Step2:回顾发射和转移得分
假设我们正在基于一个长度为3的句子训练模型:
x
=
[
w
0
,
w
1
,
w
2
]
\mathbf{x} = [w_0, w_1, w_2]
x=[w0,w1,w2]
训练数据中,只有两个标签:
L
a
b
e
l
S
e
t
=
{
l
1
,
l
2
}
LabelSet = \{l_1,l_2\}
LabelSet={l1,l2}
我们从BiLSTM层获取的发射得分结果如下表所示:
l 1 \mathbf{l_1} l1 | l 2 \mathbf{l_2} l2 | |
---|---|---|
w 0 \mathbf{w_0} w0 | x 01 x_{01} x01 | x 02 x_{02} x02 |
w 1 \mathbf{w_1} w1 | x 11 x_{11} x11 | x 12 x_{12} x12 |
w 2 \mathbf{w_2} w2 | x 21 x_{21} x21 | x 22 x_{22} x22 |
x
i
j
x_{ij}
xij是
w
i
w_i
wi被标记为
l
j
l_j
lj的得分。
CRF层的转移得分如下表所示:
l 1 \mathbf{l_1} l1 | l 2 \mathbf{l_2} l2 | |
---|---|---|
l 1 \mathbf{l_1} l1 | t 11 t_{11} t11 | t 12 t_{12} t12 |
l 2 \mathbf{l_2} l2 | t 21 t_{21} t21 | t 22 t_{22} t22 |
t i j t_{ij} tij是标签从 i i i到 j j j的转移得分。
Step3:计算(备好纸和笔 again)
提示: 我们的目标是计算
log
(
e
S
1
+
e
S
2
+
…
+
e
S
N
)
\log(e^{S_1} + e^{S_2} + … + e^{S_N})
log(eS1+eS2+…+eSN)
上式的计算是一个累加过程,其思想和动态编程相似(即使你不了解动态编程,你也可以继续,我会一步一步解释示例的,但是还是强烈推荐您去学习一下动态编程算法),
w
0
w_0
w0所有路径的总得分计算后,要计算
w
0
→
w
1
w_0 → w_1
w0→w1的总得分,然后我们利用上述总得分计算
w
0
→
w
1
→
w
2
w_0 → w_1 → w_2
w0→w1→w2的得分,该得分就是我们所需要的最终得分。
在下述步骤中,你将会看到两个变量:obs和previous,previous是先前步骤的最终结果,obs是当前单词的信息。
w 0 w_0 w0:
o
b
s
=
[
x
01
,
x
02
]
obs = [x_{01}, x_{02}]
obs=[x01,x02]
p
r
e
v
i
o
u
s
=
N
o
n
e
previous = None
previous=None
如果该语句中仅有一个单词
w
0
w_0
w0,我们没有之前的词的结果,所以 previous 为 None。此外,我们也只能获取到第一个单词,其信息
o
b
s
=
[
x
01
,
x
02
]
obs = [x_{01}, x_{02}]
obs=[x01,x02],其中
x
01
x_{01}
x01 和
x
02
x_{02}
x02即上述所说的发射得分。
那么
w
0
w_0
w0所有可能路径的总得分即为:
T
o
t
a
l
S
c
o
r
e
(
w
0
)
=
log
(
e
x
01
+
e
x
02
)
TotalScore(w_0)=\log (e^{x_{01}} + e^{x_{02}})
TotalScore(w0)=log(ex01+ex02)
w 0 w_0 w0 → w 1 w_1 w1:
o
b
s
=
[
x
11
,
x
12
]
obs = [x_{11}, x_{12}]
obs=[x11,x12]
p
r
e
v
i
o
u
s
=
[
x
01
,
x
02
]
previous = [x_{01}, x_{02}]
previous=[x01,x02]
1)将 previous扩展为:
p
r
e
v
i
o
u
s
=
(
x
01
x
01
x
02
x
02
)
previous=(\begin{matrix} x_{01} & x_{01} \\ x_{02} & x_{02}\end{matrix})
previous=(x01x02x01x02)
2)将 obs 扩展为:
o
b
s
=
(
x
11
x
12
x
11
x
12
)
obs=(\begin{matrix} x_{11} & x_{12} \\ x_{11} & x_{12}\end{matrix})
obs=(x11x11x12x12)
我们之所以这样做的原始是:矩阵可以使得所有得分计算的效率更高。
3)将 previous obs和转移得分进行相加:
s
c
o
r
e
s
=
(
x
01
x
01
x
02
x
02
)
+
(
x
11
x
12
x
11
x
12
)
+
(
t
11
t
12
t
21
t
22
)
scores=(\begin{matrix} x_{01} & x_{01} \\ x_{02} & x_{02}\end{matrix})+(\begin{matrix} x_{11} & x_{12} \\ x_{11} & x_{12}\end{matrix})+(\begin{matrix} t_{11} & t_{12} \\ t_{21} & t_{22}\end{matrix})
scores=(x01x02x01x02)+(x11x11x12x12)+(t11t21t12t22)
得到
s
c
o
r
e
s
=
(
x
01
+
x
11
+
t
11
x
01
+
x
12
+
t
12
x
02
+
x
11
+
t
21
x
02
+
x
12
+
t
22
)
scores=(\begin{matrix} x_{01}+ x_{11}+t_{11} & x_{01} +x_{12}+t_{12}\\ x_{02}+x_{11}+ t_{21} & x_{02}+ x_{12}+ t_{22}\end{matrix})
scores=(x01+x11+t11x02+x11+t21x01+x12+t12x02+x12+t22)
更新 previous:
p
r
e
v
i
o
u
s
=
[
l
o
g
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
,
l
o
g
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
]
previous=[log(e^{x_{01}+x_{11}+t_{11}}+e^{x_{02}+x_{11}+t_{21}}),log(e^{x_{01}+x_{12}+t_{12}}+e^{x_{02}+x_{12}+t_{22}})]
previous=[log(ex01+x11+t11+ex02+x11+t21),log(ex01+x12+t12+ex02+x12+t22)]
这样,第二轮迭代就完成了,从
w
0
→
w
1
w_0 → w_1
w0→w1的所有路径: (
l
a
b
e
l
1
label_1
label1 →
l
a
b
e
l
1
label_1
label1,
l
a
b
e
l
1
label_1
label1 →
l
a
b
e
l
2
label_2
label2,
l
a
b
e
l
2
label_2
label2 →
l
a
b
e
l
1
label_1
label1,
l
a
b
e
l
2
label_2
label2 →
l
a
b
e
l
2
label_2
label2),其总得分可以用下式计算:
T
o
t
a
l
S
c
o
r
e
(
w
0
→
w
1
)
TotalScore(w_0 → w_1)
TotalScore(w0→w1)
=
log
(
e
p
r
e
v
i
o
u
s
[
0
]
+
e
p
r
e
v
i
o
u
s
[
1
]
)
=\log (e^{previous[0]} + e^{previous[1]})
=log(eprevious[0]+eprevious[1])
=
log
(
e
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
+
e
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
)
=\log (e^{\log(e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})}+ e^{\log(e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})} )
=log(elog(ex01+x11+t11+ex02+x11+t21)+elog(ex01+x12+t12+ex02+x12+t22))
=
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
+
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
=\log(e^{x_{01}+x_{11}+t_{11}}+e^{x_{02}+x_{11}+t_{21}}+e^{x_{01}+x_{12}+t_{12}}+e^{x_{02}+x_{12}+t_{22}})
=log(ex01+x11+t11+ex02+x11+t21+ex01+x12+t12+ex02+x12+t22)
这就是我们的目标
log
(
e
S
1
+
e
S
2
+
…
+
e
S
N
)
\log(e^{S_1} + e^{S_2} + … + e^{S_N})
log(eS1+eS2+…+eSN)。
从上式中我们可以发现
- S 1 = x 01 + x 11 + t 11 S_1 = x_{01}+x_{11}+t_{11} S1=x01+x11+t11 ( l a b e l 1 label_1 label1 → l a b e l 1 label_1 label1)
- S 2 = x 02 + x 11 + t 21 S_2 = x_{02}+x_{11}+t_{21} S2=x02+x11+t21 ( l a b e l 2 label_2 label2 → l a b e l 1 label_1 label1)
- S 3 = x 01 + x 12 + t 12 S_3 = x_{01}+x_{12}+t_{12} S3=x01+x12+t12 ( l a b e l 1 label_1 label1 → l a b e l 2 label_2 label2)
- S 4 = x 02 + x 12 + t 22 S_4 = x_{02}+x_{12}+t_{22} S4=x02+x12+t22 ( l a b e l 2 label_2 label2 → l a b e l 2 label_2 label2)
w
0
w_0
w0 →
w
1
w_1
w1 →
w
2
w_2
w2:
当你读到这里的时候,其实你已经学会了,在这次迭代中,我们知道按照上述的方式进行相同操作即可
o
b
s
=
[
x
21
,
x
22
]
obs = [x_{21}, x_{22}]
obs=[x21,x22]
p
r
e
v
i
o
u
s
=
[
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
,
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
]
previous=[\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}}), \log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})]
previous=[log(ex01+x11+t11+ex02+x11+t21),log(ex01+x12+t12+ex02+x12+t22)]
1)将previous扩展为:
p
r
e
v
i
o
u
s
=
(
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
)
previous=\left( \begin{matrix}\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}}) & \log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})\\\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})&\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})\end{matrix}\right)
previous=(log(ex01+x11+t11+ex02+x11+t21)log(ex01+x12+t12+ex02+x12+t22)log(ex01+x11+t11+ex02+x11+t21)log(ex01+x12+t12+ex02+x12+t22))
2)将obs扩展为:
o
b
s
=
(
x
21
x
22
x
21
x
22
)
obs = \left( \begin{matrix} x_{21} & x_{22}\\ x_{21} & x_{22} \end{matrix}\right)
obs=(x21x21x22x22)
3)将 previous,obs和转移得分加起来:
s
c
o
r
e
s
=
(
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
)
+
(
x
21
x
22
x
21
x
22
)
+
(
t
11
t
12
t
21
t
22
)
scores=\left( \begin{matrix}\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}}) & \log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})\\\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})&\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})\end{matrix}\right)+\left( \begin{matrix} x_{21} & x_{22}\\ x_{21} & x_{22} \end{matrix}\right)+(\begin{matrix} t_{11} & t_{12} \\ t_{21} & t_{22}\end{matrix})
scores=(log(ex01+x11+t11+ex02+x11+t21)log(ex01+x12+t12+ex02+x12+t22)log(ex01+x11+t11+ex02+x11+t21)log(ex01+x12+t12+ex02+x12+t22))+(x21x21x22x22)+(t11t21t12t22)
然后:
s
c
o
r
e
s
=
(
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
+
x
21
+
t
11
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
+
x
22
+
t
12
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
+
x
21
+
t
21
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
+
x
22
+
t
22
)
scores=\left( \begin{matrix}\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})+x_{21}+ t_{11} & \log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})+x_{22}+t_{12}\\\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})+x_{21}+t_{21}&\log (e^{x_{01}+x_{12}+t_{12}} +e^{x_{02}+x_{12}+t_{22}})+ x_{22}+ t_{22}\end{matrix}\right)
scores=(log(ex01+x11+t11+ex02+x11+t21)+x21+t11log(ex01+x12+t12+ex02+x12+t22)+x21+t21log(ex01+x11+t11+ex02+x11+t21)+x22+t12log(ex01+x12+t12+ex02+x12+t22)+x22+t22)
更新previous:
p
r
e
v
i
o
u
s
=
[
previous =[
previous=[
log
e
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
+
x
21
+
t
11
+
e
log
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
+
x
22
+
t
12
)
,
log
(
e
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
+
x
21
+
t
21
+
e
log
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
+
x
22
+
t
22
)
\begin{matrix} \log e^{\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})+x_{21}+ t_{11}}+e^{\log (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})+x_{22}+t_{12}}),\\\log(e^{\log (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})+x_{21}+t_{21}}+e^{\log (e^{x_{01}+x_{12}+t_{12}} +e^{x_{02}+x_{12}+t_{22}})+ x_{22}+ t_{22}})\end{matrix}
logelog(ex01+x11+t11+ex02+x11+t21)+x21+t11+elog(ex01+x11+t11+ex02+x11+t21)+x22+t12),log(elog(ex01+x12+t12+ex02+x12+t22)+x21+t21+elog(ex01+x12+t12+ex02+x12+t22)+x22+t22)
]
]
]
=
[
log
(
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
e
x
21
+
t
11
+
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
e
x
21
+
t
21
)
,
=[\log( (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})e^{x_{21} + t_{11}} + (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})e^{x_{21} + t_{21}} ),
=[log((ex01+x11+t11+ex02+x11+t21)ex21+t11+(ex01+x12+t12+ex02+x12+t22)ex21+t21),
log
(
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
e
x
22
+
t
12
+
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
e
x
22
+
t
22
)
]
\log( (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})e^{x_{22} + t_{12}} + (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})e^{x_{22} + t_{22}})]
log((ex01+x11+t11+ex02+x11+t21)ex22+t12+(ex01+x12+t12+ex02+x12+t22)ex22+t22)]
如上所述,我们使用新更新的previous来计算总得分:
T
o
t
a
l
S
c
o
r
e
(
w
0
→
w
1
→
w
2
)
TotalScore(w_0 → w_1 → w_2)
TotalScore(w0→w1→w2)
=
log
(
e
p
r
e
v
i
o
u
s
[
0
]
+
e
p
r
e
v
i
o
u
s
[
1
]
)
=\log (e^{previous[0]} + e^{previous[1]})
=log(eprevious[0]+eprevious[1])
=
log
(
e
log
(
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
e
x
21
+
t
11
+
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
e
x
21
+
t
21
)
=\log (e^{\log( (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})e^{x_{21} + t_{11}} + (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})e^{x_{21} + t_{21}} )}
=log(elog((ex01+x11+t11+ex02+x11+t21)ex21+t11+(ex01+x12+t12+ex02+x12+t22)ex21+t21)
+
e
log
(
(
e
x
01
+
x
11
+
t
11
+
e
x
02
+
x
11
+
t
21
)
e
x
22
+
t
12
+
(
e
x
01
+
x
12
+
t
12
+
e
x
02
+
x
12
+
t
22
)
e
x
22
+
t
22
)
)
+e^{\log( (e^{x_{01}+x_{11}+t_{11}} + e^{x_{02}+x_{11}+t_{21}})e^{x_{22} + t_{12}} + (e^{x_{01}+x_{12}+t_{12}} + e^{x_{02}+x_{12}+t_{22}})e^{x_{22} + t_{22}})} )
+elog((ex01+x11+t11+ex02+x11+t21)ex22+t12+(ex01+x12+t12+ex02+x12+t22)ex22+t22))
=
log
(
e
x
01
+
x
11
+
t
11
+
x
21
+
t
11
+
e
x
02
+
x
11
+
t
21
+
x
21
+
t
11
=\log (e^{x_{01}+x_{11}+t_{11}+x_{21}+t_{11}}+e^{x_{02}+x_{11}+t_{21}+x_{21}+t_{11}}
=log(ex01+x11+t11+x21+t11+ex02+x11+t21+x21+t11
+
e
x
01
+
x
12
+
t
12
+
x
21
+
t
21
+
e
x
02
+
x
12
+
t
22
+
x
21
+
t
21
+e^{x_{01}+x_{12}+t_{12}+x_{21}+t_{21}}+e^{x_{02}+x_{12}+t_{22}+x_{21}+t_{21}}
+ex01+x12+t12+x21+t21+ex02+x12+t22+x21+t21
+
e
x
01
+
x
11
+
t
11
+
x
22
+
t
12
+
e
x
02
+
x
11
+
t
21
+
x
22
+
t
12
+e^{x_{01}+x_{11}+t_{11}+x_{22}+t_{12}}+e^{x_{02}+x_{11}+t_{21}+x_{22}+t_{12}}
+ex01+x11+t11+x22+t12+ex02+x11+t21+x22+t12
+
e
x
01
+
x
12
+
t
12
+
x
22
+
t
22
+
e
x
02
+
x
12
+
t
22
+
x
22
+
t
22
)
+e^{x_{01}+x_{12}+t_{12}+x_{22}+t_{22}}+e^{x_{02}+x_{12}+t_{22}+x_{22}+t_{22}})
+ex01+x12+t12+x22+t22+ex02+x12+t22+x22+t22)
撒花
我们已经得到了目标:
log
(
e
S
1
+
e
S
2
+
…
+
e
S
N
)
\log(e^{S_1} + e^{S_2} + … + e^{S_N})
log(eS1+eS2+…+eSN),因为该示例中,这一句话中有3个词,有两个标签,因此,总共有8个可能的路径。
虽然该过程看起来比较复杂,但是用编程语言实现该算法是很简单的,因为计算机比较擅长处理重复性工作。
下一篇博文
下篇博客将介绍如何预测一句话的标签。