C-EVAL: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
https://arxiv.org/pdf/2305.08322v1.pdf
https://github.com/SJTU-LIT/ceval
https://cevalbenchmark.com/static/leaderboard.html
Part 1 Introduction
How should we evaluate a large language model?
- On a broad range of NLP tasks