   School  Class    ID  Gender   Address  Height  Weight  Math  Physics
0     S_1    C_1  1101       M  street_1     173      63  34.0       A+
1     S_1    C_1  1102       F  street_2     192      73  32.5       B+
2     S_1    C_1  1103       M  street_2     186      82  87.2       B+
3     S_1    C_1  1104       F  street_2     167      81  80.4       B-
4     S_1    C_1  1105       F  street_4     159      64  84.8       B+

      School  Class    ID  Gender   Address  Height  Weight  Physics
Math
34.0     S_1    C_1  1101       M  street_1     173      63       A+
32.5     S_1    C_1  1102       F  street_2     192      73       B+
87.2     S_1    C_1  1103       M  street_2     186      82       B+
80.4     S_1    C_1  1104       F  street_2     167      81       B-
84.8     S_1    C_1  1105       F  street_4     159      64       B+

      School  Class    ID  Gender   Address  Height  Weight  Physics
Math
31.5     S_1    C_3  1301       M  street_4     161      68       B+
32.5     S_1    C_1  1102       F  street_2     192      73       B+
32.7     S_2    C_3  2302       M  street_5     171      88        A
33.8     S_1    C_2  1204       F  street_5     162      63        B
34.0     S_1    C_1  1101       M  street_1     173      63       A+

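The two Math-indexed tables above look like the result of `set_index('Math')` followed by `sort_index()`. The code that produced them is not shown, so both calls are an inference. A minimal sketch, assuming a hand-built `df` that holds only the five rows of the first table:

```python
import pandas as pd

# Assumption: df is rebuilt by hand from the five rows of the first table;
# the original DataFrame is larger and its loading code is not shown.
df = pd.DataFrame({
    "School": ["S_1"] * 5,
    "Class": ["C_1"] * 5,
    "ID": [1101, 1102, 1103, 1104, 1105],
    "Gender": ["M", "F", "M", "F", "F"],
    "Address": ["street_1", "street_2", "street_2", "street_2", "street_4"],
    "Height": [173, 192, 186, 167, 159],
    "Weight": [63, 73, 82, 81, 64],
    "Math": [34.0, 32.5, 87.2, 80.4, 84.8],
    "Physics": ["A+", "B+", "B+", "B-", "B+"],
})

# Move the Math column into the index; the row order is unchanged.
by_math = df.set_index("Math")

# Sort the rows by the new index in ascending order.
sorted_by_math = by_math.sort_index()

print(by_math.head())
print(sorted_by_math.head())
```

With only these five rows the sorted output starts at 32.5. The original sorted output starts at 31.5 because the full table contains more rows.
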
    School  Class    ID  Gender   Address  Height  Weight  Math  Physics
 0     S_1    C_1  1101       M  street_1     173      63  34.0       A+
19     S_2    C_1  2105       M  street_4     170      81  34.2        A
18     S_2    C_1  2104       F  street_5     159      97  72.2       B+
16     S_2    C_1  2102       F  street_6     161      61  50.6       B+
15     S_2    C_1  2101       M  street_7     174      84  83.3        C
====================================
    School  Class    ID  Gender   Address  Height  Weight  Math  Physics
 0     S_1    C_1  1101       M  street_1     173      63  34.0       A+
11     S_1    C_3  1302       F  street_1     175      57  87.7       A-
23     S_2    C_2  2204       M  street_1     175      74  47.2       B-
33     S_2    C_4  2404       F  street_2     160      84  67.7        B
 3     S_1    C_1  1104       F  street_2     167      81  80.4       B-

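The rows in the last table are ordered by Address and, within each address, by Height, which is what `sort_values` with a list of keys would produce. The original call is not included in the text, so this is an inference. A minimal sketch under that assumption, using three columns of the five rows shown:

```python
import pandas as pd

# Assumption: a hand-built, shuffled subset of the rows shown above,
# keeping their original index labels.
df = pd.DataFrame(
    {
        "ID": [1104, 2404, 1101, 2204, 1302],
        "Address": ["street_2", "street_2", "street_1", "street_1", "street_1"],
        "Height": [167, 160, 173, 175, 175],
    },
    index=[3, 33, 0, 23, 11],
)

# Sort by Address first, then by Height within each address.
result = df.sort_values(by=["Address", "Height"])
print(result)
```

Passing a list to `by` sorts on the first key and uses the later keys only to break ties, so the two street_2 rows come out with the shorter student first.
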
import pandas as pd
import numpy as np
import operator

game_throne = pd.read_csv(r'data\Game_of_Thrones_Script.csv')
print(game_throne.head(), '\n')
print(game_throne.columns, '\n')
print('gt_shape:', game_throne.shape, '\n')

print("Total number of characters:")
print(game_throne['Name'].nunique(), '\n')

print("Person with the most lines:")
print(game_throne['Name'].value_counts().index[0], '\n')

# apply is convenient here: first add a column holding the word count of each row
game_throne['new_nwords'] = game_throne['Sentence'].apply(lambda x: len(x.split(' ')))

# Extract the two columns we need
name_words = list(zip(game_throne['Name'], game_throne['new_nwords']))

# Walk the list above and accumulate each person's word count in a dictionary
words_count = {}
for x in name_words:
    words_count[x[0]] = words_count.get(x[0], 0) + x[1]

# Sort the dictionary by total word count, descending, and take the top name
words_man = sorted(words_count.items(), key=operator.itemgetter(1), reverse=True)[0][0]
print("Person who speaks the most words:")
print(words_man)

0  2011/4/17  ...  What do you expect? They're savages. One lot s...
1  2011/4/17  ...  I've never seen wildlings do a thing like this...
2  2011/4/17  ...  How close did you get?
3  2011/4/17  ...  Close as any man would.
4  2011/4/17  ...  We should head back to the wall.

[5 rows x 6 columns]

Index(['Release Date', 'Season', 'Episode', 'Episode Title', 'Name', 'Sentence'],
      dtype='object')

gt_shape: (23911, 6)

Total number of characters:
564

Person with the most lines:
tyrion lannister

Person who speaks the most words:
tyrion lannister
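
The per-speaker word total that the script accumulates in a dictionary can also be expressed with `groupby`. A minimal sketch, assuming a tiny stand-in DataFrame with the same `Name` and `Sentence` columns. The sentences come from the preview output; the speaker names `speaker_a` and `speaker_b` are placeholders, not values from the dataset:

```python
import pandas as pd

# Assumption: stand-in data; only the column names match the real CSV.
sample = pd.DataFrame({
    "Name": ["speaker_a", "speaker_b", "speaker_a"],
    "Sentence": [
        "How close did you get?",
        "Close as any man would.",
        "We should head back to the wall.",
    ],
})

# Count words per line by splitting on single spaces, as in the script above.
sample["new_nwords"] = sample["Sentence"].apply(lambda s: len(s.split(" ")))

# Sum the per-line counts for each speaker, then take the label of the largest total.
words_per_name = sample.groupby("Name")["new_nwords"].sum()
top_speaker = words_per_name.idxmax()

print(words_per_name)
print(top_speaker)
```

This skips the intermediate list and dictionary, and `idxmax` returns the name directly without sorting every speaker.
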