InternLM (书生·浦语) Large Model Course: Lesson 6 Homework

Basic Assignment

  • Use OpenCompass to evaluate the InternLM2-Chat-7B model on the C-Eval dataset

First, activate the opencompass virtual environment and locate the InternLM2 model.

Start the evaluation with the following command:

python run.py --datasets ceval_gen --hf-path /share/model_repos/internlm2-chat-7b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
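
The command is long enough that a typo is easy to miss. As a rough sketch only (not part of the original assignment), the same invocation can be assembled as an argument list in Python and printed for inspection. The flags and values are copied from the command above; the variable names `MODEL_PATH`, `argv`, and `command` are illustrative assumptions.

```python
import shlex

# Path copied from the command above; adjust it for your own machine.
MODEL_PATH = "/share/model_repos/internlm2-chat-7b"

# One list element per argument. The shell removes the quotes in
# padding_side='left', so run.py receives plain key=value strings.
argv = [
    "python", "run.py",
    "--datasets", "ceval_gen",
    "--hf-path", MODEL_PATH,
    "--tokenizer-kwargs",
    "padding_side=left", "truncation=left", "trust_remote_code=True",
    "--model-kwargs",
    "trust_remote_code=True", "device_map=auto",
    "--max-seq-len", "2048",
    "--max-out-len", "16",
    "--batch-size", "4",
    "--num-gpus", "1",
    "--debug",
]

# shlex.join quotes arguments only where the shell would need it.
command = shlex.join(argv)
print(command)
```

The printed string can be pasted into the terminal, or `argv` can be passed to `subprocess.run(argv, check=True)` from the OpenCompass directory.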

These are the final evaluation results:

dataset                                         version  metric         mode  opencompass.models.huggingface.HuggingFace_model_repos_internlm2-chat-7b
ceval-computer_network                          db9ce2   accuracy       gen   47.37
ceval-operating_system                          1c2571   accuracy       gen   57.89
ceval-computer_architecture                     a74dad   accuracy       gen   42.86
ceval-college_programming                       4ca32a   accuracy       gen   18.92
ceval-college_physics                           963fa8   accuracy       gen   10.53
ceval-college_chemistry                         e78857   accuracy       gen   4.17
ceval-advanced_mathematics                      ce03e2   accuracy       gen   0
ceval-probability_and_statistics                65e812   accuracy       gen   16.67
ceval-discrete_mathematics                      e894ae   accuracy       gen   18.75
ceval-electrical_engineer                       ae42b9   accuracy       gen   24.32
ceval-metrology_engineer                        ee34ea   accuracy       gen   50
ceval-high_school_mathematics                   1dc5bf   accuracy       gen   0
ceval-high_school_physics                       adf25f   accuracy       gen   31.58
ceval-high_school_chemistry                     2ed27f   accuracy       gen   26.32
ceval-high_school_biology                       8e2b9a   accuracy       gen   26.32
ceval-middle_school_mathematics                 bee8d5   accuracy       gen   21.05
ceval-middle_school_biology                     86817c   accuracy       gen   66.67
ceval-middle_school_physics                     8accf6   accuracy       gen   57.89
ceval-middle_school_chemistry                   167a15   accuracy       gen   80
ceval-veterinary_medicine                       b4e08d   accuracy       gen   39.13
ceval-college_economics                         f3f4e6   accuracy       gen   29.09
ceval-business_administration                   c1614e   accuracy       gen   33.33
ceval-marxism                                   cf874c   accuracy       gen   84.21
ceval-mao_zedong_thought                        51c7a4   accuracy       gen   70.83
ceval-education_science                         591fee   accuracy       gen   62.07
ceval-teacher_qualification                     4e4ced   accuracy       gen   75
ceval-high_school_politics                      5c0de2   accuracy       gen   21.05
ceval-high_school_geography                     865461   accuracy       gen   42.11
ceval-middle_school_politics                    5be3e7   accuracy       gen   38.1
ceval-middle_school_geography                   8a63be   accuracy       gen   50
ceval-modern_chinese_history                    fc01af   accuracy       gen   65.22
ceval-ideological_and_moral_cultivation         a2aa4a   accuracy       gen   89.47
ceval-logic                                     f5b022   accuracy       gen   9.09
ceval-law                                       a110a1   accuracy       gen   37.5
ceval-chinese_language_and_literature           0f8b68   accuracy       gen   47.83
ceval-art_studies                               2a1300   accuracy       gen   66.67
ceval-professional_tour_guide                   4e673e   accuracy       gen   82.76
ceval-legal_professional                        ce8787   accuracy       gen   21.74
ceval-high_school_chinese                       315705   accuracy       gen   21.05
ceval-high_school_history                       7eb30a   accuracy       gen   70
ceval-middle_school_history                     48ab4a   accuracy       gen   63.64
ceval-civil_servant                             87d061   accuracy       gen   40.43
ceval-sports_science                            70f27b   accuracy       gen   68.42
ceval-plant_protection                          8941f9   accuracy       gen   72.73
ceval-basic_medicine                            c409d6   accuracy       gen   57.89
ceval-clinical_medicine                         49e82d   accuracy       gen   45.45
ceval-urban_and_rural_planner                   95b885   accuracy       gen   58.7
ceval-accountant                                2837     accuracy       gen   34.69
ceval-fire_engineer                             bc23f5   accuracy       gen   12.9
ceval-environmental_impact_assessment_engineer  c64e2d   accuracy       gen   45.16
ceval-tax_accountant                            3a5e3c   accuracy       gen   42.86
ceval-physician                                 6e277d   accuracy       gen   51.02
ceval-stem                                      -        naive_average  gen   32.02
ceval-social-science                            -        naive_average  gen   50.58
ceval-humanities                                -        naive_average  gen   52.27
ceval-other                                     -        naive_average  gen   48.2
ceval-hard                                      -        naive_average  gen   13.5
ceval                                           -        naive_average  gen   43.3
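
The naive_average rows appear to be plain unweighted means of the per-subject accuracies. The sketch below re-derives them from the 52 subject scores listed above. The split into four groups by row order (the first 20 rows as STEM, then 10, 11, and 11 rows) is an assumption, not something the output states; the variable names are illustrative.

```python
from statistics import mean

# Per-subject accuracies copied from the results above, in row order.
# The grouping by row position is an assumption (see the lead-in).
groups = {
    "stem": [47.37, 57.89, 42.86, 18.92, 10.53, 4.17, 0, 16.67, 18.75, 24.32,
             50, 0, 31.58, 26.32, 26.32, 21.05, 66.67, 57.89, 80, 39.13],
    "social-science": [29.09, 33.33, 84.21, 70.83, 62.07, 75, 21.05, 42.11,
                       38.1, 50],
    "humanities": [65.22, 89.47, 9.09, 37.5, 47.83, 66.67, 82.76, 21.74,
                   21.05, 70, 63.64],
    "other": [40.43, 68.42, 72.73, 57.89, 45.45, 58.7, 34.69, 12.9, 45.16,
              42.86, 51.02],
}

# Unweighted mean per group, rounded to two decimals like the report.
averages = {name: round(mean(scores), 2) for name, scores in groups.items()}

# Overall score: unweighted mean over all subjects, ignoring group sizes.
all_scores = [s for scores in groups.values() for s in scores]
overall = round(mean(all_scores), 2)

for name, value in averages.items():
    print(f"ceval-{name}: {value}")
print(f"ceval: {overall}")
```

Because every subject counts equally regardless of how many questions it contains, the overall score is pulled down by the many low-scoring STEM subjects.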