How Good Are Those Free LLM APIs, Really? The CLiB LLM Leaderboard

Article link: https://blog.csdn.net/easyllm/article/details/146514938

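As AI technology spreads, more and more platforms offer free LLM APIs, and many developers and companies are trying them. But does free mean lower quality? Can these models meet the needs of real applications? We ran an evaluation to find out.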

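We selected 15 models that all offer a free API on a long-term basis; models that are free only for a limited time, or only as a short trial for new users, are excluded. Some of these 15 free APIs are provided through the vendor's official site, and others through SiliconFlow (qwen2.5-7b-instruct, internlm2_5-7b-chat, glm-4-9b-chat, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B). Google's free models are all experimental versions with limited concurrency.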

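Evaluation domains: medical, education, law, public administration, mental health, reasoning and mathematical computation, language and instruction following.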

The overall leaderboard:

Rank	Model	Organization	Price (CNY)	Total
1	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	74.20
2	gemini-2.0-pro-exp-02-05	Google	0.0	74.00
3	gemini-2.0-flash-exp	Google	0.0	70.10
4	GLM-4-Flash	Zhipu AI	0.0	67.40
5	qwen2.5-7b-instruct	Alibaba	0.0	66.70
6	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	64.70
7	glm-4-9b-chat	Zhipu AI	0.0	63.50
8	ERNIE-Speed-8K	Baidu	0.0	60.20
9	ERNIE-Lite-8K	Baidu	0.0	56.70
10	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	49.20
11	qwen2.5-1.5b-instruct	Alibaba	0.0	47.30
12	xunfei-spark-lite	iFLYTEK	0.0	45.40
13	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	37.80
14	qwen2.5-0.5b-instruct	Alibaba	0.0	35.50
15	ERNIE-Tiny-8K	Baidu	0.0	33.20

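Medical domain leaderboard:

The medical domain currently covers 11 dimensions: physician exam (residency training completion), physician exam (assistant practicing physician), physician exam (practicing physician), physician exam (intermediate professional title), physician exam (senior professional title), nursing exam, pharmacist exam, medical technologist exam, professional knowledge exam (basic medicine), professional knowledge exam (clinical medicine), and professional knowledge exam (preventive medicine and public health). Of these, residency training completion spans 18 specialties such as surgery and dermatology; assistant practicing physician spans 5 tracks such as clinical and dental assistant practicing physician; practicing physician spans 5 tracks such as integrated Chinese and Western medicine and public health; and so on.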

Rank	Model	Organization	Price (CNY)	Medical
1	gemini-2.0-pro-exp-02-05	Google	0.0	73.70
2	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	68.90
3	gemini-2.0-flash-exp	Google	0.0	67.80
4	GLM-4-Flash	Zhipu AI	0.0	67.20
5	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	65.60
6	qwen2.5-7b-instruct	Alibaba	0.0	65.40
7	ERNIE-Speed-8K	Baidu	0.0	60.10
8	glm-4-9b-chat	Zhipu AI	0.0	57.70
9	ERNIE-Lite-8K	Baidu	0.0	49.90
10	qwen2.5-1.5b-instruct	Alibaba	0.0	45.00
11	xunfei-spark-lite	iFLYTEK	0.0	42.60
12	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	30.90
13	qwen2.5-0.5b-instruct	Alibaba	0.0	30.90
14	ERNIE-Tiny-8K	Baidu	0.0	26.50
15	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	25.50

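Education domain leaderboard:

The education domain currently covers 4 dimensions: the Gaokao (national college entrance exam), senior high school subjects, junior high school subjects, and primary school subjects.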

Rank	Model	Organization	Price (CNY)	Education
1	gemini-2.0-pro-exp-02-05	Google	0.0	86.60
2	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	83.80
3	gemini-2.0-flash-exp	Google	0.0	81.90
4	GLM-4-Flash	Zhipu AI	0.0	80.50
5	glm-4-9b-chat	Zhipu AI	0.0	79.90
6	qwen2.5-7b-instruct	Alibaba	0.0	79.70
7	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	72.20
8	ERNIE-Speed-8K	Baidu	0.0	71.30
9	ERNIE-Lite-8K	Baidu	0.0	70.60
10	qwen2.5-1.5b-instruct	Alibaba	0.0	62.30
11	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	62.30
12	xunfei-spark-lite	iFLYTEK	0.0	50.50
13	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	46.80
14	qwen2.5-0.5b-instruct	Alibaba	0.0	45.40
15	ERNIE-Tiny-8K	Baidu	0.0	43.50

Law domain leaderboard:

The law domain currently covers 1 dimension: the JEC-QA lawyer qualification exam.

Rank	Model	Organization	Price (CNY)	Law
1	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	47.40
2	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	43.80
3	gemini-2.0-pro-exp-02-05	Google	0.0	43.60
4	qwen2.5-7b-instruct	Alibaba	0.0	42.50
5	GLM-4-Flash	Zhipu AI	0.0	39.20
6	glm-4-9b-chat	Zhipu AI	0.0	38.40
7	gemini-2.0-flash-exp	Google	0.0	37.70
8	ERNIE-Speed-8K	Baidu	0.0	30.80
9	ERNIE-Lite-8K	Baidu	0.0	29.50
10	qwen2.5-1.5b-instruct	Alibaba	0.0	28.10
11	xunfei-spark-lite	iFLYTEK	0.0	27.40
12	qwen2.5-0.5b-instruct	Alibaba	0.0	21.70
13	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	19.50
14	ERNIE-Tiny-8K	Baidu	0.0	18.10
15	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	12.90

Public administration domain leaderboard:

The public administration domain currently covers 1 dimension: the civil service exam.

Rank	Model	Organization	Price (CNY)	Public Administration
1	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	85.10
2	gemini-2.0-pro-exp-02-05	Google	0.0	73.70
3	gemini-2.0-flash-exp	Google	0.0	69.30
4	GLM-4-Flash	Zhipu AI	0.0	64.50
5	glm-4-9b-chat	Zhipu AI	0.0	64.10
6	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	62.40
7	qwen2.5-7b-instruct	Alibaba	0.0	59.60
8	ERNIE-Speed-8K	Baidu	0.0	54.50
9	ERNIE-Lite-8K	Baidu	0.0	52.20
10	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	48.80
11	qwen2.5-1.5b-instruct	Alibaba	0.0	40.50
12	xunfei-spark-lite	iFLYTEK	0.0	37.50
13	ERNIE-Tiny-8K	Baidu	0.0	31.00
14	qwen2.5-0.5b-instruct	Alibaba	0.0	30.70
15	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	26.40

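Mental health domain leaderboard:

The mental health domain currently covers 4 dimensions: MMCU psychology, psychotherapy attending physician, psychological counselor, and medical psychology.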

Rank	Model	Organization	Price (CNY)	Mental Health
1	GLM-4-Flash	Zhipu AI	0.0	62.90
2	gemini-2.0-pro-exp-02-05	Google	0.0	60.60
3	ERNIE-Speed-8K	Baidu	0.0	57.30
4	qwen2.5-7b-instruct	Alibaba	0.0	56.00
5	gemini-2.0-flash-exp	Google	0.0	54.00
6	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	53.50
7	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	51.00
8	glm-4-9b-chat	Zhipu AI	0.0	47.10
9	xunfei-spark-lite	iFLYTEK	0.0	43.40
10	ERNIE-Lite-8K	Baidu	0.0	43.00
11	qwen2.5-1.5b-instruct	Alibaba	0.0	39.60
12	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	30.40
13	qwen2.5-0.5b-instruct	Alibaba	0.0	24.50
14	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	23.90
15	ERNIE-Tiny-8K	Baidu	0.0	23.00

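Reasoning and mathematical computation domain leaderboard:

The reasoning and mathematical computation domain currently covers 6 dimensions: deductive reasoning, commonsense reasoning, symbolic reasoning (BBH), arithmetic, grade 7 to 9 mathematics, and table question answering.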

Rank	Model	Organization	Price (CNY)	Reasoning and Math
1	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	93.90
2	gemini-2.0-flash-exp	Google	0.0	92.80
3	gemini-2.0-pro-exp-02-05	Google	0.0	92.00
4	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	81.30
5	qwen2.5-7b-instruct	Alibaba	0.0	80.20
6	GLM-4-Flash	Zhipu AI	0.0	75.10
7	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	74.40
8	glm-4-9b-chat	Zhipu AI	0.0	74.00
9	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	72.00
10	ERNIE-Lite-8K	Baidu	0.0	70.90
11	ERNIE-Speed-8K	Baidu	0.0	66.40
12	qwen2.5-1.5b-instruct	Alibaba	0.0	49.60
13	xunfei-spark-lite	iFLYTEK	0.0	48.00
14	qwen2.5-0.5b-instruct	Alibaba	0.0	46.00
15	ERNIE-Tiny-8K	Baidu	0.0	34.70

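Language and instruction following domain leaderboard:

The language and instruction following domain currently covers 10 dimensions, including idiom comprehension, sentiment analysis, classification, information extraction, reading comprehension, C3 Chinese reading comprehension, pronoun resolution (CLUEWSC), poetry matching (CCPM), and Chinese instruction following.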

Rank	Model	Organization	Price (CNY)	Language and Instruction Following
1	gemini-2.0-pro-exp-02-05	Google	0.0	87.50
2	gemini-2.0-flash-exp	Google	0.0	87.00
3	gemini-2.0-flash-thinking-exp-01-21	Google	0.0	86.90
4	internlm2_5-7b-chat	Shanghai AI Laboratory	0.0	83.70
5	qwen2.5-7b-instruct	Alibaba	0.0	83.40
6	glm-4-9b-chat	Zhipu AI	0.0	83.00
7	GLM-4-Flash	Zhipu AI	0.0	82.70
8	ERNIE-Lite-8K	Baidu	0.0	80.70
9	ERNIE-Speed-8K	Baidu	0.0	80.70
10	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.0	71.00
11	xunfei-spark-lite	iFLYTEK	0.0	68.20
12	qwen2.5-1.5b-instruct	Alibaba	0.0	65.90
13	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.0	57.10
14	ERNIE-Tiny-8K	Baidu	0.0	55.40
15	qwen2.5-0.5b-instruct	Alibaba	0.0	49.00
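
The article does not say how the total score is computed. From the published numbers, though, each total appears consistent with a plain unweighted mean of the seven domain scores, rounded to one decimal place. The Python sketch below checks this for four of the models; it is an observation about the tables, not a documented scoring rule, and the names `DOMAIN_SCORES`, `REPORTED_TOTAL`, and `mean_score` are illustrative.

```python
from statistics import mean

# Domain scores copied from the leaderboards, in this order:
# medical, education, law, public administration, mental health,
# reasoning and math, language and instruction following.
DOMAIN_SCORES = {
    "gemini-2.0-flash-thinking-exp-01-21": [68.9, 83.8, 47.4, 85.1, 53.5, 93.9, 86.9],
    "gemini-2.0-pro-exp-02-05": [73.7, 86.6, 43.6, 73.7, 60.6, 92.0, 87.5],
    "GLM-4-Flash": [67.2, 80.5, 39.2, 64.5, 62.9, 75.1, 82.7],
    "DeepSeek-R1-Distill-Qwen-7B": [30.9, 62.3, 19.5, 48.8, 30.4, 81.3, 71.0],
}

# Total scores as reported in the overall leaderboard.
REPORTED_TOTAL = {
    "gemini-2.0-flash-thinking-exp-01-21": 74.2,
    "gemini-2.0-pro-exp-02-05": 74.0,
    "GLM-4-Flash": 67.4,
    "DeepSeek-R1-Distill-Qwen-7B": 49.2,
}


def mean_score(scores):
    """Unweighted mean of the domain scores, rounded to one decimal place.

    Assumption: equal weighting across domains. The article does not state this.
    """
    return round(mean(scores), 1)


if __name__ == "__main__":
    for model, scores in DOMAIN_SCORES.items():
        computed = mean_score(scores)
        reported = REPORTED_TOTAL[model]
        status = "matches" if computed == reported else "differs from"
        print(f"{model}: mean {computed} {status} reported total {reported}")
```

If the equal-weight reading is right, a model's total can hide large per-domain swings: DeepSeek-R1-Distill-Qwen-7B ranks 4th in reasoning and math but 10th overall, so the per-domain tables are the better guide when choosing a model for a specific task.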