Data source: SuperCLUE
The figures below come from the general-capability leaderboard.
SuperCLUE Overall Leaderboard (March 2025)
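Rank | Model | Organization | Overall | Math Reasoning | Science Reasoning | Code Generation | Agent | Precise Instruction Following | Text Understanding & Creation | Release Date | |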
---|---|---|---|---|---|---|---|---|---|---|---|
- | o3-mini(high) | OpenAI | 76.01 | 94.74 | 70.00 | 88.78 | 57.14 | 66.4 | 79.01 | 2025.03.18 | |
🏅️ | DeepSeek-R1 | DeepSeek | 70.33 | 85.96 | 64.00 | 86.94 | 65.18 | 39.52 | 80.41 | 2025.03.18 | |
- | Claude 3.7 Sonnet | Anthropic | 68.02 | 78.07 | 59.00 | 86.73 | 56.62 | 48.92 | 78.77 | 2025.03.18 | |
- | GPT-4.5-Preview | OpenAI | 67.46 | 67.54 | 70.00 | 79.18 | 71.88 | 35.75 | 80.4 | 2025.03.18 | |
🥈 | QwQ-32B | Alibaba | 66.38 | 88.6 | 67.00 | 81.84 | 48.66 | 29.92 | 82.27 | 2025.03.18 | |
- | Gemini-2.0-Pro-Exp-02-05 | Google | 65.35 | 65.79 | 70.71 | 77.76 | 64.88 | 33.6 | 79.34 | 2025.03.18 | |
🥉 | Doubao-1.5-pro-32k-250115 | ByteDance | 64.68 | 62.28 | 70.00 | 76.94 | 54.46 | 46.77 | 77.66 | 2025.03.18 | |
4 | hunyuan-turbos-20250226 | Tencent | 62.49 | 47.37 | 63.00 | 74.49 | 70.09 | 41.13 | 78.88 | 2025.03.18 | |
5 | DeepSeek-R1-Distill-Qwen-32B | DeepSeek | 59.94 | 85.85 | 62.89 | 73.43 | 36.77 | 23.18 | 77.53 | 2025.03.18 | |
5 | Qwen-max-latest | Alibaba | 59.34 | 42.98 | 68.00 | 76.33 | 58.48 | 29.38 | 80.88 | 2025.03.18 | |
- | Gemini-2.0-Flash-Thinking-Exp-01-21 | Google | 59.26 | 83.33 | 63.00 | 68.16 | 26.34 | 33.6 | 81.16 | 2025.03.18 | |
5 | 360 Zhinao o1.5 | 360 | 59.08 | 83.33 | 57.00 | 71.43 | 36.61 | 26.34 | 79.78 | 2025.03.18 | |
6 | DeepSeek-V3 | DeepSeek | 57.63 | 48.25 | 63.00 | 68.78 | 63.39 | 23.39 | 78.99 | 2025.03.18 | |
- | ChatGPT-4o-latest | OpenAI | 57.57 | 35.96 | 66.00 | 73.06 | 56.7 | 32.8 | 80.89 | 2025.03.18 | |
7 | YAYI-Ultra | Zhongke Wenge | 55.81 | 42.11 | 62.00 | 69.39 | 59.38 | 23.39 | 78.57 | 2025.03.18 | |
8 | Qwen2.5-72B-Instruct | Alibaba | 51.9 | 33.33 | 58.00 | 62.86 | 55.8 | 22.91 | 78.52 | 2025.03.18 | |
8 | kimi-latest | Moonshot AI | 51.47 | 27.19 | 54.00 | 70.61 | 62.05 | 19.89 | 75.1 | 2025.03.18 | |
9 | Step-2-16k | StepFun | 50.81 | 26.32 | 58.00 | 62.45 | 59.38 | 18.55 | 80.17 | 2025.03.18 | |
10 | DeepSeek-R1-Distill-Qwen-14B | DeepSeek | 49.67 | 79.46 | 63.27 | 55.79 | 7.14 | 16.85 | 75.51 | 2025.03.18 | |
10 | Sky-Chat-3.0 | Kunlun Tech | 49.17 | 38.6 | 63.00 | 55.1 | 38.84 | 21.83 | 77.66 | 2025.03.18 | |
11 | GLM-4-Plus | Zhipu AI | 48.61 | 26.32 | 53.00 | 61.84 | 49.55 | 21.77 | 79.17 | 2025.03.18 | |
12 | ERNIE-4.0-Turbo-8K-Latest | Baidu | 47.56 | 29.82 | 48.00 | 61.22 | 50.45 | 19.35 | 76.54 | 2025.03.18 | |
13 | GLM-Zero-Preview | Zhipu AI | 46.11 | 74.56 | 64.00 | 41.02 | 8.48 | 16.94 | 71.64 | 2025.03.18 | |
- | Llama-3.3-70B-Instruct | Meta | 45.53 | 21.05 | 52.00 | 62.86 | 39.29 | 26.08 | 71.92 | 2025.03.18 | |
- | Phi-4 | Microsoft | 45.26 | 35.09 | 61.00 | 60.2 | 23.83 | 15.05 | 76.37 | 2025.03.18 | |
- | GPT-4o mini | OpenAI | 43.8 | 21.05 | 53.00 | 63.06 | 29.02 | 20.43 | 76.22 | 2025.03.18 | |
14 | iFLYTEK Spark V4.0 | iFLYTEK | 40.76 | 39.82 | 49.00 | 51.22 | 16.52 | 12.63 | 75.36 | 2025.03.18 | |
14 | Qwen2.5-14b-Instruct | Alibaba | 40.7 | 21.05 | 48.00 | 50.61 | 32.59 | 15.09 | 76.87 | 2025.03.18 | |
15 | DeepSeek-R1-Distill-Qwen-7B | DeepSeek | 39.06 | 77.23 | 58.06 | 34.5 | 2.68 | 6.47 | 55.45 | 2025.03.18 | |
16 | Qwen2.5-7B-Instruct | Alibaba | 34.01 | 21.05 | 39.00 | 40 | 17.41 | 10.51 | 76.11 | 2025.03.18 | |
17 | InternLM3-8B-Instruct | Shanghai AI Laboratory | 32.02 | 32.74 | 43.00 | 25.31 | 8.93 | 8.6 | 73.53 | 2025.03.18 | |
18 | GLM-4-9B-Chat | Zhipu AI | 29.34 | 7.02 | 21.00 | 33.88 | 30.36 | 9.14 | 74.66 | 2025.03.18 | |
- | Gemma-2-9b-it | Google | 28.3 | 2.63 | 31.00 | 37.35 | 10.27 | 16.67 | 71.88 | 2025.03.18 | |
- | Llama-3.1-8B-Instruct | Meta | 25.42 | 1.75 | 19.00 | 31.02 | 23.66 | 10.48 | 66.63 | 2025.03.18 | |
19 | Yi-1.5-34B-Chat-16K | 01.AI | 23.29 | 6.14 | 22.00 | 23.27 | 7.14 | 7.8 | 73.41 | 2025.03.18 | |
20 | Qwen2.5-3b-Instruct | Alibaba | 22.18 | 13.16 | 20.00 | 12.65 | 7.59 | 6.2 | 73.49 | 2025.03.18 | |
20 | Yi-1.5-9B-Chat-16K | 01.AI | 21.94 | 4.42 | 19.00 | 14.49 | 14.75 | 7.53 | 71.42 | 2025.03.18 | |
21 | DeepSeek-R1-Distill-Qwen-1.5B | DeepSeek | 17.98 | 37.72 | 35.00 | 3.88 | 0 | 1.62 | 29.64 | 2025.03.18 | |
- | Llama-3.2-3B-Instruct | Meta | 17.15 | 7.89 | 5.00 | 18.78 | 3.57 | 5.48 | 62.17 | 2025.03.18 | |
- | Mistral-7B-Instruct-v0.3 | Mistral AI | 11.78 | 1.75 | 5.00 | 2.86 | 1.34 | 4.3 | 55.43 | 2025.03.18 |
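
The overall scores in the table appear consistent with an unweighted mean of the six sub-scores. The page does not state how SuperCLUE aggregates them, so that formula is an assumption, checked here on three rows only. A minimal Python sketch of the check, with `ROWS`, `mean_score`, and `gap` as illustrative names and values copied from the table:

```python
# (listed overall score, six sub-scores in table column order), copied from the table.
ROWS = {
    "o3-mini(high)": (76.01, [94.74, 70.00, 88.78, 57.14, 66.4, 79.01]),
    "DeepSeek-R1": (70.33, [85.96, 64.00, 86.94, 65.18, 39.52, 80.41]),
    "QwQ-32B": (66.38, [88.6, 67.00, 81.84, 48.66, 29.92, 82.27]),
}


def mean_score(sub_scores):
    """Unweighted arithmetic mean of the sub-scores (assumed formula, not confirmed by the source)."""
    return sum(sub_scores) / len(sub_scores)


def gap(listed, sub_scores):
    """Absolute difference between the listed overall score and the computed mean."""
    return abs(listed - mean_score(sub_scores))


if __name__ == "__main__":
    for model, (listed, subs) in ROWS.items():
        print(f"{model}: listed={listed:.2f} mean={mean_score(subs):.2f} gap={gap(listed, subs):.3f}")
```

For these three rows the gap stays below 0.01, which is within the rounding of the published two-decimal figures. Other rows may need the same check before relying on the assumption.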