python – NLTK: corpus-level BLEU vs. sentence-level BLEU score...

TL;DR:

>>> import nltk
>>> hypothesis = ['This', 'is', 'cat']
>>> reference = ['This', 'is', 'a', 'cat']
>>> references = [reference]  # list of references for 1 sentence.
>>> list_of_references = [references]  # list of references for all sentences in corpus.
>>> list_of_hypotheses = [hypothesis]  # list of hypotheses that corresponds to list of references.
>>> nltk.translate.bleu_score.corpus_bleu(list_of_references, list_of_hypotheses)
0.6025286104785453
>>> nltk.translate.bleu_score.sentence_bleu(references, hypothesis)
0.6025286104785453

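(Note: you have to pull the latest version of NLTK from the develop branch to get a stable version of the BLEU score implementation.)

In long:

Actually, if there is only one reference and one hypothesis in your entire corpus, both corpus_bleu() and sentence_bleu() should return the same value, as shown in the example above. In the code, sentence_bleu() is just a thin wrapper that passes a one-sentence corpus to corpus_bleu():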

def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    return corpus_bleu([references], [hypothesis], weights, smoothing_function)

And if we look at the parameters of sentence_bleu:

def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    """
    :param references: reference sentences
    :type references: list(list(str))
    :param hypothesis: a hypothesis sentence
    :type hypothesis: list(str)
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The sentence-level BLEU score.
    :rtype: float
    """

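The input for sentence_bleu's references is a list(list(str)).

So if you have a sentence string, e.g. "This is a cat", you have to tokenize it to get a list of strings, ["This", "is", "a", "cat"]. Because multiple references are allowed, the input has to be a list of lists of strings. For example, if you have a second reference, "This is a feline", your input to sentence_bleu() would be: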

from nltk.translate.bleu_score import sentence_bleu

references = [["This", "is", "a", "cat"], ["This", "is", "a", "feline"]]
hypothesis = ["This", "is", "cat"]
sentence_bleu(references, hypothesis)
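If the sentences start out as plain strings, the nested input described above has to be built first. The following is only a minimal sketch, assuming that whitespace splitting is an acceptable tokenizer for the data at hand; the helper name tokenize is hypothetical and not part of NLTK, and a real pipeline would probably use a proper tokenizer instead.

```python
def tokenize(sentence):
    """Split a sentence string into a list of tokens on whitespace (assumed tokenizer)."""
    return sentence.split()


# Raw strings: two references for the same sentence, and one hypothesis.
reference_strings = ["This is a cat", "This is a feline"]
hypothesis_string = "This is cat"

# list(list(str)): one token list per reference.
references = [tokenize(s) for s in reference_strings]
# list(str): the tokens of the single hypothesis.
hypothesis = tokenize(hypothesis_string)

print(references)
print(hypothesis)
```

The resulting references and hypothesis have the shapes that sentence_bleu() expects, list(list(str)) and list(str) respectively.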

When it comes to corpus_bleu(), the list_of_references parameter is one level deeper: it is a list of whatever sentence_bleu() takes as references, one entry per hypothesis:

def corpus_bleu(list_of_references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25),
                smoothing_function=None):
    """
    :param list_of_references: a corpus of lists of reference sentences, w.r.t. hypotheses
    :type list_of_references: list(list(list(str)))
    :param hypotheses: a list of hypothesis sentences
    :type hypotheses: list(list(str))
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The corpus-level BLEU score.
    :rtype: float
    """

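To make the extra level of nesting concrete, here is a small sketch of a corpus_bleu() call on a two-sentence corpus. The sentences are invented purely for illustration, the example assumes NLTK is installed, and the exact score may depend on the NLTK version, so no value is quoted here.

```python
from nltk.translate.bleu_score import corpus_bleu

# One entry per hypothesis; each entry is the list of tokenized
# references for that hypothesis (illustrative data, not from NLTK).
list_of_references = [
    [["the", "cat", "sat", "on", "the", "mat"],
     ["a", "cat", "sat", "on", "the", "mat"]],            # 2 references for sentence 1
    [["there", "is", "a", "dog", "in", "the", "garden"]],  # 1 reference for sentence 2
]

# One tokenized hypothesis per sentence, aligned by position with the list above.
hypotheses = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["there", "is", "a", "dog", "in", "the", "yard"],
]

# The two lists must have the same length for the alignment to make sense.
assert len(list_of_references) == len(hypotheses)

score = corpus_bleu(list_of_references, hypotheses)
print(score)
```

Each hypothesis is matched with its own list of references by position, which is why the outer lists must be the same length.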
By the way, since sentence_bleu is imported as bleu in [nltk.translate.__init__.py](https://github.com/nltk/nltk/blob/develop/nltk/translate/__init__.py#L21), using

from nltk.translate import bleu

would be the same as:

from nltk.translate.bleu_score import sentence_bleu

and in code:

>>> from nltk.translate import bleu
>>> from nltk.translate.bleu_score import sentence_bleu
>>> from nltk.translate.bleu_score import corpus_bleu
>>> bleu == sentence_bleu
True
>>> bleu == corpus_bleu
False
