BLEU and ROUGE

Latest recommended article published on 2024-09-26 23:51:44

砰！

Category: Coursera NLP Specialization. Tags: python, natural language processing

Article link: https://blog.csdn.net/Harder_14/article/details/109314995

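This article introduces BLEU and ROUGE, two metrics for evaluating text-generation tasks in natural language processing. BLEU is based mainly on exact n-gram matching, while ROUGE considers unigram and bigram overlap and is widely used for evaluating text summarization.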

1.BLEU

import nltk  # word_tokenize requires the NLTK Punkt tokenizer data

reference = "The NASA Opportunity rover is battling a massive dust storm on planet Mars."
candidate_1 = "The Opportunity rover is combating a big sandstorm on planet Mars."
candidate_2 = "A NASA rover is fighting a massive storm on planet Mars."

# Lowercase before tokenizing so that n-gram matching is case-insensitive
tokenized_ref = nltk.word_tokenize(reference.lower())
tokenized_cand_1 = nltk.word_tokenize(candidate_1.lower())
tokenized_cand_2 = nltk.word_tokenize(candidate_2.lower())
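Before looking at the brevity penalty, it may help to see the n-gram matching that BLEU is built on. The following is a minimal sketch, not the article's own implementation: the helper names `ngrams` and `clipped_precision` are hypothetical, and the sentences are pre-split on whitespace (with the final period written as its own token) as an assumption, so that the block runs without NLTK.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Clipped (modified) n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference, so repeating a matching word does not help.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        # Candidate is shorter than n tokens: no n-grams to score.
        return 0.0
    overlap = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return overlap / total


# Assumed pre-tokenized, lowercased versions of the reference and candidate_2.
ref = "the nasa opportunity rover is battling a massive dust storm on planet mars .".split()
cand = "a nasa rover is fighting a massive storm on planet mars .".split()

p1 = clipped_precision(cand, ref, 1)  # unigram precision
p2 = clipped_precision(cand, ref, 2)  # bigram precision
print(f"unigram precision: {p1:.4f}, bigram precision: {p2:.4f}")
```

Under these assumptions, 10 of the candidate's 12 unigrams and 6 of its 11 bigrams are matched. The second "a" in the candidate earns no credit because the reference contains "a" only once, which is the effect of clipping.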

def brevity_penalty(candidate, reference):
    ref_length = len(reference)
    can_length = len(candidate)

    # Brevity Penalty
    if ref_length < can_length: # if reference length is less than candidate length
        BP = 1 # set BP = 1
    else:
        penalty = 1 - (ref_length / can_length) # else set BP=exp(1-(ref_length/can_length))