TL;DR:
>>> import nltk
>>> hypothesis = ['This', 'is', 'cat']
>>> reference = ['This', 'is', 'a', 'cat']
>>> references = [reference] # list of references for 1 sentence.
>>> list_of_references = [references] # list of references for all sentences in corpus.
>>> list_of_hypotheses = [hypothesis] # list of hypotheses that corresponds to list of references.
>>> nltk.translate.bleu_score.corpus_bleu(list_of_references, list_of_hypotheses)
0.6025286104785453
>>> nltk.translate.bleu_score.sentence_bleu(references, hypothesis)
0.6025286104785453
(Note: You have to pull the latest version of NLTK from the develop branch to get a stable version of the BLEU score implementation.)
In long:
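Actually, if there is only one reference and one hypothesis in your whole corpus, both corpus_bleu() and sentence_bleu() should return the same value, as shown in the example above.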
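To see why the two scores coincide for a one-sentence corpus, here is a minimal standard-library sketch of BLEU. It is not NLTK's code: `toy_corpus_bleu`, `toy_sentence_bleu` and `ngram_counts` are hypothetical names made up for illustration, there is no smoothing, and `max_n` defaults to 2 rather than the usual 4 because the three-token hypothesis contains no 4-grams and an unsmoothed score would collapse to zero. The value it produces is therefore not expected to match the NLTK output above.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def toy_corpus_bleu(list_of_references, list_of_hypotheses, max_n=2):
    """Simplified corpus-level BLEU, uniform weights over orders 1..max_n.

    Hypothetical helper for illustration only: no smoothing, not NLTK's code.
    """
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, summed over the corpus
    hyp_len = ref_len = 0
    for references, hypothesis in zip(list_of_references, list_of_hypotheses):
        hyp_len += len(hypothesis)
        # Use the reference length closest to the hypothesis length
        # (the shorter reference wins ties).
        ref_len += min((abs(len(r) - len(hypothesis)), len(r)) for r in references)[1]
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hypothesis, n)
            max_ref = Counter()
            for r in references:
                max_ref |= ngram_counts(r, n)  # per-n-gram maximum over references
            # Counter intersection clips each hypothesis count to the reference maximum.
            matches[n - 1] += sum((hyp_counts & max_ref).values())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # without smoothing, one empty order zeroes the geometric mean
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def toy_sentence_bleu(references, hypothesis, max_n=2):
    """Sentence-level score: the corpus-level score of a one-sentence corpus."""
    return toy_corpus_bleu([references], [hypothesis], max_n)


hypothesis = ['This', 'is', 'cat']
references = [['This', 'is', 'a', 'cat']]
sentence_score = toy_sentence_bleu(references, hypothesis)
corpus_score = toy_corpus_bleu([references], [hypothesis])
print(sentence_score == corpus_score)
```

In this sketch the sentence-level function only wraps its arguments in one-element lists, so with a single hypothesis the n-gram counts and lengths summed over the "corpus" are exactly those of that one sentence, and the two scores cannot differ.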
def