LLM Quality Model

Latest recommended article published on 2024-07-17 08:51:23

Deep_My

Category: NLP. Tags: chatgpt

Article link: https://blog.csdn.net/weixin_46934960/article/details/136219608

Using an LLM to evaluate text quality

Paper: Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study

We compared $\text{\color{blue}three}$ kinds of reference-free evaluation methods. The experimental $\text{\color{blue}results prove that}$ ChatGPT is capable of evaluating text quality effectively from various perspectives without reference and demonstrates superior performance than most existing automatic metrics.
In particular, the $\text{\color{blue}Explicit Score}$ (i.e., asking the model to output a score directly), which utilizes ChatGPT to generate a numeric score measuring text quality, $\text{\color{blue}is the most effective and reliable method}$ among the three exploited approaches. However, directly comparing the quality of two texts may lead to sub-optimal results. We believe this paper will provide valuable insights for evaluating text quality with LLMs and have released the used data.

How accurately can ChatGPT assess text quality without references?

It is feasible for ChatGPT to evaluate text quality without reference, and it outperforms commonly used metrics even with a simple prompt design.
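A claim such as "outperforms commonly used metrics" is usually checked by measuring how well each metric's scores agree with human ratings. The sketch below assumes Spearman rank correlation as the agreement measure. The excerpt above does not say which statistic was used, and the function names are illustrative only.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(metric_scores) != len(human_scores) or len(metric_scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))
```

On this assumption, a metric with a higher `spearman` value against the same human ratings counts as the better evaluator.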

What is the most effective approach to evaluate text quality using ChatGPT?

Generally, using ChatGPT to generate an explicit score for text quality is the best and most stable method among the three we compared. We suggest using greedy decoding for more reliable results.
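The explicit-score approach can be sketched as a prompt builder plus a parser for the reply. This is a minimal illustration rather than the paper's prompt: the wording, the 1 to 5 scale, and the `generate` callable are all assumptions. `generate` stands in for whatever model client is used, and it would be configured for greedy decoding (typically temperature 0) to follow the recommendation above.

```python
import re
from typing import Callable, Optional


def build_score_prompt(text: str, aspect: str = "overall quality",
                       low: int = 1, high: int = 5) -> str:
    """Build a reference-free scoring prompt (hypothetical wording)."""
    return (
        f"Score the following text for {aspect} on a scale from {low} to {high}, "
        f"where {low} is the worst and {high} is the best. "
        "Reply with the number only.\n\n"
        f"Text: {text}\n\nScore:"
    )


def parse_score(reply: str, low: int = 1, high: int = 5) -> Optional[float]:
    """Extract the first number in the reply; return None if absent or out of range."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    value = float(match.group())
    return value if low <= value <= high else None


def explicit_score(text: str, generate: Callable[[str], str],
                   aspect: str = "overall quality") -> Optional[float]:
    """Ask the model for a numeric score and parse it.

    `generate` is a hypothetical callable mapping a prompt to the model's reply;
    it should decode greedily so repeated calls give the same score.
    """
    return parse_score(generate(build_score_prompt(text, aspect)))
```

Because the parser takes the first number in the reply, the prompt asks for the number only. A reply that restates the scale before giving its score would be misread.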