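The problem is as follows:
Using the example in the figure below, compute the BLEU-4 score of the machine translation below (list the 1- to 4-gram statistics).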
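Solution: From the problem statement, the 1- to 4-gram statistics are shown in the tables below. Matching ignores case here, so "today" in the machine translation counts as a match for "Today" in the human translation.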
$$
\begin{array}{|c|c|c|c|}
\hline
\text{1-gram} & \text{count} & \text{ref} & \text{count-clip} \\ \hline
\text{it} & 1 & 0 & 0 \\ \hline
\text{is} & 1 & 1 & 1 \\ \hline
\text{a} & 1 & 1 & 1 \\ \hline
\text{nice} & 1 & 1 & 1 \\ \hline
\text{day} & 1 & 1 & 1 \\ \hline
\text{today} & 1 & 1 & 1 \\ \hline
\end{array}
$$
$$
\begin{array}{|c|c|c|c|}
\hline
\text{2-gram} & \text{count} & \text{ref} & \text{count-clip} \\ \hline
\text{it is} & 1 & 0 & 0 \\ \hline
\text{is a} & 1 & 1 & 1 \\ \hline
\text{a nice} & 1 & 1 & 1 \\ \hline
\text{nice day} & 1 & 1 & 1 \\ \hline
\text{day today} & 1 & 0 & 0 \\ \hline
\end{array}
$$
$$
\begin{array}{|c|c|c|c|}
\hline
\text{3-gram} & \text{count} & \text{ref} & \text{count-clip} \\ \hline
\text{it is a} & 1 & 0 & 0 \\ \hline
\text{is a nice} & 1 & 1 & 1 \\ \hline
\text{a nice day} & 1 & 1 & 1 \\ \hline
\text{nice day today} & 1 & 0 & 0 \\ \hline
\end{array}
$$
$$
\begin{array}{|c|c|c|c|}
\hline
\text{4-gram} & \text{count} & \text{ref} & \text{count-clip} \\ \hline
\text{it is a nice} & 1 & 0 & 0 \\ \hline
\text{is a nice day} & 1 & 1 & 1 \\ \hline
\text{a nice day today} & 1 & 0 & 0 \\ \hline
\end{array}
$$
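The count, ref, and count-clip columns can be cross-checked with a short script. This is only a minimal sketch: the helper names `ngrams` and `clipped_counts` are made up for illustration, and lowercasing both sentences is an assumption made so that "today" matches "Today".

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_counts(candidate, reference, n):
    """Map each candidate n-gram to (count, ref, count-clip)."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    # count-clip is the candidate count capped at the reference count.
    return {g: (c, ref[g], min(c, ref[g])) for g, c in cand.items()}

# Assumption: case-insensitive matching, hence .lower().
reference = "Today is a nice day".lower().split()
candidate = "It is a nice day today".lower().split()

for n in range(1, 5):
    table = clipped_counts(candidate, reference, n)
    matched = sum(clip for _, _, clip in table.values())
    total = len(ngrams(candidate, n))
    print(f"{n}-gram: clipped matches {matched} out of {total}")
```

The printed ratios are the numerators and denominators of the modified n-gram precisions used in the next step.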
From the tables, dividing the sum of count-clip by the sum of count for each n gives

$$
p_1 = \frac56,\quad p_2 = \frac35,\quad p_3 = \frac24,\quad p_4 = \frac13
$$
The machine translation has length $c = 6$ and the human translation has length $r = 5$, so $c > r$ and therefore $\mathrm{BP} = 1$.
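As a quick check of the brevity penalty, here is a small sketch using the standard BLEU definition, where BP is 1 when $c > r$ and $\exp(1 - r/c)$ otherwise. The function name `brevity_penalty` and the guard for an empty candidate are illustrative choices, not part of the original solution.

```python
import math

def brevity_penalty(c, r):
    """BP = 1 if c > r, otherwise exp(1 - r / c)."""
    if c == 0:
        # Guard added for illustration: an empty candidate scores 0.
        return 0.0
    if c > r:
        return 1.0
    return math.exp(1 - r / c)

c = len("It is a nice day today".split())  # candidate (machine) length
r = len("Today is a nice day".split())     # reference (human) length
print(c, r, brevity_penalty(c, r))         # 6 5 1.0
```

Because the candidate is longer than the reference, no penalty applies in this example.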
Summarized in a table:

$$
\begin{array}{lllllll}
\hline
 & \text{Translation} & p_1 & p_2 & p_3 & p_4 & \text{BP} \\ \hline
\text{Human translation:} & \text{Today is a nice day} & & & & & \\
\text{Machine translation:} & \text{It is a nice day today} & \frac56 & \frac35 & \frac24 & \frac13 & 1
\end{array}
$$
Since $N = 4$, the uniform weights are $w_n = \frac14$.
Therefore

$$
\begin{aligned}
\text{BLEU-4} &= \mathrm{BP} \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right) \\
&= 1 \times \exp\left(\frac14 \times \left(\log p_1 + \log p_2 + \log p_3 + \log p_4\right)\right) \\
&= \exp\left(\frac14 \times \log\left(p_1 p_2 p_3 p_4\right)\right) \\
&= \exp\left(\frac14 \times \log\left(\frac56 \times \frac35 \times \frac24 \times \frac13\right)\right) \\
&= \exp\left(\frac14 \times \log\frac{1}{12}\right) \\
&\approx 0.537
\end{aligned}
$$
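The arithmetic in the last steps can be verified numerically. The sketch below hard-codes the precisions and BP derived in this solution; the variable names are chosen for illustration only.

```python
import math
from fractions import Fraction

# Modified n-gram precisions p_1..p_4 and BP as derived above.
precisions = [Fraction(5, 6), Fraction(3, 5), Fraction(2, 4), Fraction(1, 3)]
bp = 1.0
w = 1 / len(precisions)  # uniform weight w_n = 1/4

# Exact product p_1 * p_2 * p_3 * p_4.
product = Fraction(1)
for p in precisions:
    product *= p

# BLEU-4 = BP * exp(sum_n w_n * log p_n), using the natural logarithm.
bleu4 = bp * math.exp(sum(w * math.log(float(p)) for p in precisions))

print(product)          # 1/12
print(round(bleu4, 3))  # 0.537
```

Equivalently, the score is the geometric mean of the four precisions, $(1/12)^{1/4}$.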
If anyone spots a mistake, corrections are very welcome. Many thanks!