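Graduate Admission Data Analysis A
Description
The attachment for this problem contains records for 500 graduate applicants to international universities, together with each applicant's predicted chance of admission.
The table below lists the fields in the file and their meanings: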
Serial No | GRE Score | TOEFL Score | University Rating | SOP | LOR | CGPA | Research | Chance of Admit |
---|---|---|---|---|---|---|---|---|
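No. 1-500 | GRE score | TOEFL score | Undergraduate university rating | Statement of Purpose score | Letter of Recommendation score | Undergraduate GPA | Research experience (1/0) | Chance of admission (between 0 and 1) |
Research: 1 means the applicant has research experience, 0 means none.
Chance of Admit: a decimal between 0 and 1; for example, 0.73 means 73%.
Compute the statistics described below from the data in the file and print the results in exactly the format shown.
(The examples in this description illustrate the format only; their values are unrelated to the test cases.)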
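As a minimal sketch of how one record might be turned into typed values, the snippet below parses a two-line in-memory sample with `csv.DictReader`. The sample row and the `parse_row` helper are made up for illustration, and the header names are copied from the table above; the header of the real file may differ in spelling, spacing, or column order.

```python
import csv
import io

# Hypothetical two-line sample; the values are invented, not taken from the real file.
SAMPLE = (
    "Serial No,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit\n"
    "1,320,110,4,4.5,4.0,9.10,1,0.73\n"
)

def parse_row(row):
    """Convert the string fields of one record into numbers and a bool."""
    return {
        "toefl": int(row["TOEFL Score"]),
        "rating": int(row["University Rating"]),
        "cgpa": float(row["CGPA"]),
        "research": row["Research"] == "1",       # 1 = has research experience
        "chance": float(row["Chance of Admit"]),  # 0.73 means 73%
    }

records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(records[0]["research"], f"{records[0]['chance']:.0%}")
```

Looking fields up by name rather than by position keeps the code readable even if the column order in the file changes.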
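Read one input value n:
- If n is 1, select the records whose chance of admission is at least 80%, compute the percentage of them whose university rating is at least 4, and end the program.
Input:
1
Output:
Top University in >=80%:11.11%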
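- If n is Research, compute and print the share of students with research experience among those whose chance of admission is at least 90% and among those whose chance of admission is at most 70%, then end the program. (Keep two decimal places in the percentages.)
Input:
Research
Output:
Research in >=90%:91.03%
Research in <=70%:22.10%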
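- If n is 2, print the average, highest, and lowest TOEFL score among the students whose chance of admission is at least 80%, then end the program. (Keep two decimal places.)
Input:
2
Output:
TOEFL Average Score:300.12
TOEFL Max Score:323.00
TOEFL Min Score:299.00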
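- If n is 3, print the average, highest, and lowest GPA among the students whose chance of admission is at least 80%, then end the program. (Keep three decimal places.)
Input:
3
Output:
CGPA Average Score:4.333
CGPA Max Score:4.910
CGPA Min Score:4.134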
- For any other input, print ERROR and end the program.
Reference code
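The four cases above share one pattern: filter by chance of admission, then report either a share or an average, maximum, and minimum. A minimal sketch of that pattern on invented data follows. `share_percent`, `summary_lines`, and the sample tuples are hypothetical, and returning 0.0 for an empty selection is an assumption the problem statement does not cover.

```python
# Invented records: (university_rating, toefl, cgpa, research, chance_of_admit)
ROWS = [
    (5, 118, 9.6, 1, 0.92),
    (4, 110, 9.0, 1, 0.85),
    (3, 105, 8.7, 0, 0.80),
    (2, 100, 8.0, 0, 0.65),
]

def share_percent(rows, condition):
    """Percentage of rows satisfying condition.

    Returns 0.0 for an empty selection (an assumption, to avoid dividing by zero).
    """
    if not rows:
        return 0.0
    return sum(1 for r in rows if condition(r)) / len(rows) * 100

def summary_lines(label, values, digits):
    """Build the average, max, and min lines in the 'Label Xxx Score:value' layout."""
    stats = [
        ("Average", sum(values) / len(values)),
        ("Max", max(values)),
        ("Min", min(values)),
    ]
    return [f"{label} {name} Score:{value:.{digits}f}" for name, value in stats]

# Keep only the records with a chance of admission of at least 80%.
admit_80 = [r for r in ROWS if r[4] >= 0.8]
print(f"Top University in >=80%:{share_percent(admit_80, lambda r: r[0] >= 4):.2f}%")
for line in summary_lines("TOEFL", [r[1] for r in admit_80], 2):
    print(line)
```

Passing the number of decimals as a parameter lets the same helper serve both the TOEFL case (two decimals) and the CGPA case (three decimals).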
def read_file(filename):
    """Read the file, split each line on commas, and return the data rows
    (header dropped) as a 2-D list."""
    with open(filename, 'r', encoding='utf-8') as fr:
        data_ls = [i.strip().split(',') for i in fr]
    return data_ls[1:]
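def rank_four(data_ls):
    """Take the 2-D list of all records, keep those with a chance of
    admission of at least 0.8, and print the percentage of them whose
    university rating is at least 4."""
    admit_80 = [x for x in data_ls if float(x[-1]) >= 0.8]
    # Index 1 is the column this code treats as the university rating; check it
    # against the header of admit2.csv, since the table above lists another order.
    top_four = [x for x in admit_80 if float(x[1]) >= 4]
    percent = len(top_four) / len(admit_80) * 100
    # :.2f always prints two decimals; round(..., 4) * 100 does not guarantee that.
    print(f'Top University in >=80%:{percent:.2f}%')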
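def toefl(data_ls):
    """Print the average, highest, and lowest TOEFL score among the records
    with a chance of admission of at least 0.8."""
    admit_80 = [x for x in data_ls if float(x[-1]) >= 0.8]
    toefl_score = [int(x[3]) for x in admit_80]  # index 3: TOEFL column as used here
    avg_toefl = sum(toefl_score) / len(toefl_score)
    print(f'TOEFL Average Score:{avg_toefl:.2f}')
    print(f'TOEFL Max Score:{max(toefl_score):.2f}')
    print(f'TOEFL Min Score:{min(toefl_score):.2f}')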
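def gpa(data_ls):
    """Print the average, highest, and lowest CGPA among the records
    with a chance of admission of at least 0.8."""
    admit_80 = [x for x in data_ls if float(x[-1]) >= 0.8]
    gpa_score = [float(x[4]) for x in admit_80]  # index 4: CGPA column as used here
    avg_gpa = sum(gpa_score) / len(gpa_score)
    print(f'CGPA Average Score:{avg_gpa:.3f}')
    print(f'CGPA Max Score:{max(gpa_score):.3f}')
    print(f'CGPA Min Score:{min(gpa_score):.3f}')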
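def graduate_student(data_ls):
    """Print the share of students with research experience among those with
    a chance of admission >= 0.9 and among those with a chance <= 0.7."""
    admit_90 = [x for x in data_ls if float(x[-1]) >= 0.9]
    research_90 = [x for x in admit_90 if x[5] == '1']  # index 5: Research flag
    percent = len(research_90) / len(admit_90) * 100
    # The label must read 'Research' to match the required output.
    print(f'Research in >=90%:{percent:.2f}%')
    admit_70 = [x for x in data_ls if float(x[-1]) <= 0.7]
    research_70 = [x for x in admit_70 if x[5] == '1']
    percent = len(research_70) / len(admit_70) * 100
    print(f'Research in <=70%:{percent:.2f}%')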
def type_judge(input_str):
    """Take the input string and call the function that matches it.
    Uses the module-level list data read in the __main__ block."""
    if input_str == '1':
        rank_four(data)
    elif input_str == '2':
        toefl(data)
    elif input_str == '3':
        gpa(data)
    elif input_str == 'Research':
        graduate_student(data)
    else:
        print('ERROR')
if __name__ == '__main__':
    file = 'admit2.csv'
    data = read_file(file)
    question = input()
    type_judge(question)
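When printing percentages such as the one in `rank_four`, a format specification like `:.2f` fixes the number of printed decimals, which `round()` on its own does not. The sketch below compares the two on made-up ratios; `as_percent` is a hypothetical helper, not part of the reference code.

```python
def as_percent(part, whole):
    """Format part/whole as a percentage with exactly two decimals."""
    return f"{part / whole * 100:.2f}%"

# round() changes the value, not how many digits are printed: a whole-number
# percentage prints with one trailing zero instead of two.
print(round(1 / 4, 4) * 100)   # 25.0
print(as_percent(1, 4))        # 25.00%
print(as_percent(1, 9))        # 11.11%
```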
My code:
with open('admit2.csv', 'r', encoding='utf-8') as f:
    data = [person.strip().split(',') for person in f.readlines()[1:]]
n = input()
if n == '1':
    data = [person for person in data if eval(person[-1]) >= 0.8]
    data_tar = [person for person in data if eval(person[1]) >= 4]
    print('Top University in >=80%:{:.2f}%'.format(len(data_tar)/len(data)*100))
elif n == 'Research':
    data1 = [person for person in data if eval(person[-1]) >= 0.9]
    data1_tar = [person for person in data1 if eval(person[5]) == 1]
    print('Research in >=90%:{:.2f}%'.format(len(data1_tar)/len(data1)*100))
    data2 = [person for person in data if eval(person[-1]) <= 0.7]
    data2_tar = [person for person in data2 if eval(person[5]) == 1]
    print('Research in <=70%:{:.2f}%'.format(len(data2_tar)/len(data2)*100))
elif n == '2':
    data = [eval(person[3]) for person in data if eval(person[-1]) >= 0.8]
    print('TOEFL Average Score:{:.2f}'.format(sum(data)/len(data)))
    print('TOEFL Max Score:{:.2f}'.format(max(data)))
    print('TOEFL Min Score:{:.2f}'.format(min(data)))
elif n == '3':
    data = [eval(person[4]) for person in data if eval(person[-1]) >= 0.8]
    print('CGPA Average Score:{:.3f}'.format(sum(data)/len(data)))
    print('CGPA Max Score:{:.3f}'.format(max(data)))
    print('CGPA Min Score:{:.3f}'.format(min(data)))
else:
    print('ERROR')
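The script converts each field with `eval`, which evaluates whatever text the file contains as a Python expression. As a hedged alternative, `float()` accepts only numeric text. The sketch below uses invented field values in place of a real record, with the same column positions the script indexes.

```python
# Invented field values standing in for one record's columns.
person = ["12", "4", "320", "110", "9.10", "1", "0.85"]

# float() parses numeric text and nothing else.
chance = float(person[-1])
has_research = float(person[5]) == 1
print(chance >= 0.8, has_research)

# Non-numeric text raises ValueError instead of being executed as code.
rejected = False
try:
    float("__import__('os').getcwd()")
except ValueError:
    rejected = True
print("rejected:", rejected)
```

With this change a malformed or malicious cell stops the program with an error rather than running as code.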