pandas: group, count, and take the top n records per group
In short, two lines of code are enough.
test_data.groupby('release_year')['genre'].value_counts()
# output: the result is a Series indexed by (release_year, genre)
release_year genre
1960 Drama 13
Action 8
Comedy 8
Horror 7
Romance 6
Thriller 6
Western 6
Adventure 5
History 5
Family 3
Science Fiction 3
Crime 2
Fantasy 2
War 2
Foreign 1
Music 1
1961 Drama 16
Comedy 10
Action 7
Romance 7
Adventure 6
Family 5
Science Fiction 4
History 3
Horror 3
Western 3
Crime 2
...
Name: genre, Length: 1049, dtype: int64
# Convert the result above to a DataFrame
my_data = test_data.groupby('release_year')['genre'].value_counts().rename('count').reset_index()
# output
release_year genre count
0 1960 Drama 13
1 1960 Action 8
2 1960 Comedy 8
3 1960 Horror 7
4 1960 Romance 6
5 1960 Thriller 6
6 1960 Western 6
7 1960 Adventure 5
8 1960 History 5
9 1960 Family 3
10 1960 Science Fiction 3
11 1960 Crime 2
12 1960 Fantasy 2
13 1960 War 2
14 1960 Foreign 1
15 1960 Music 1
16 1961 Drama 16
17 1961 Comedy 10
18 1961 Action 7
19 1961 Romance 7
20 1961 Adventure 6
21 1961 Family 5
22 1961 Science Fiction 4
23 1961 History 3
24 1961 Horror 3
25 1961 Western 3
26 1961 Crime 2
27 1961 Fantasy 2
28 1961 Music 2
29 1961 War 2
... ... ... ...
1019 2014 Crime 65
1020 2014 Science Fiction 62
1021 2014 Family 43
1022 2014 Animation 36
1023 2014 Fantasy 36
1024 2014 Mystery 36
1025 2014 Music 28
1026 2014 War 23
1027 2014 History 15
1028 2014 TV Movie 14
1029 2014 Western 6
1030 2015 Drama 260
1031 2015 Thriller 171
1032 2015 Comedy 162
1033 2015 Horror 125
1034 2015 Action 107
1035 2015 Science Fiction 86
1036 2015 Adventure 69
1037 2015 Documentary 57
1038 2015 Romance 57
1039 2015 Crime 51
1040 2015 Family 44
1041 2015 Mystery 42
1042 2015 Animation 39
1043 2015 Fantasy 33
1044 2015 Music 33
1045 2015 TV Movie 20
1046 2015 History 15
1047 2015 War 9
1048 2015 Western 6
1049 rows × 3 columns
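To try the counting and conversion steps without the original dataset, here is a minimal sketch on a tiny made-up table. The toy rows are an assumption for illustration only; just the column names release_year and genre come from the example above. The rename('count') call matters because, in the pandas version that produced the output above, the counted Series is still named genre, and reset_index() cannot add a second column with that name.

```python
import pandas as pd

# Hypothetical toy data standing in for test_data: one row per (movie, genre) pair.
test_data = pd.DataFrame({
    "release_year": [1960, 1960, 1960, 1960, 1961, 1961, 1961],
    "genre": ["Drama", "Drama", "Action", "Comedy", "Comedy", "Comedy", "Drama"],
})

# Step 1: count genres within each year; the result is a Series with a
# two-level index (release_year, genre).
counts = test_data.groupby("release_year")["genre"].value_counts()

# Step 2: name the values 'count' and move both index levels into columns.
my_data = counts.rename("count").reset_index()

print(my_data)
```

The resulting frame has the three columns release_year, genre and count, one row per year/genre combination.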
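# Finally, call groupby() again and take the first n rows of each group; the result is a DataFrame.
# value_counts() already sorted each group by count in descending order, so the first n rows are the top n.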
my_data.groupby('release_year').head(3)
# output
release_year genre count
0 1960 Drama 13
1 1960 Action 8
2 1960 Comedy 8
16 1961 Drama 16
17 1961 Comedy 10
18 1961 Action 7
33 1962 Drama 21
34 1962 Action 8
35 1962 Adventure 7
50 1963 Comedy 13
51 1963 Drama 13
52 1963 Thriller 10
67 1964 Drama 20
68 1964 Comedy 16
69 1964 Crime 10
85 1965 Drama 20
86 1965 Thriller 11
87 1965 Action 9
103 1966 Comedy 16
104 1966 Drama 16
105 1966 Action 14
121 1967 Comedy 17
122 1967 Drama 16
123 1967 Romance 11
138 1968 Drama 20
139 1968 Comedy 9
140 1968 Action 6
155 1969 Drama 13
156 1969 Comedy 12
157 1969 Action 10
... ... ... ...
853 2006 Drama 197
854 2006 Comedy 155
855 2006 Thriller 114
873 2007 Drama 197
874 2007 Comedy 151
875 2007 Thriller 125
893 2008 Drama 233
894 2008 Comedy 169
895 2008 Thriller 127
913 2009 Drama 224
914 2009 Comedy 198
915 2009 Thriller 157
932 2010 Drama 211
933 2010 Comedy 169
934 2010 Thriller 135
952 2011 Drama 214
953 2011 Comedy 172
954 2011 Thriller 146
972 2012 Drama 232
973 2012 Comedy 176
974 2012 Thriller 160
992 2013 Drama 253
993 2013 Comedy 175
994 2013 Thriller 175
1011 2014 Drama 284
1012 2014 Comedy 185
1013 2014 Thriller 179
1030 2015 Drama 260
1031 2015 Thriller 171
1032 2015 Comedy 162
168 rows × 3 columns
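The head(3) step works because the rows are already sorted by count within each year. If the frame might have been reordered in between, one option is to sort explicitly before taking the head. The sketch below is one possible way to do that, not the only one; the helper name top_n_per_group and the small sample frame are hypothetical and chosen just for illustration.

```python
import pandas as pd


def top_n_per_group(df, group_col, count_col, n):
    """Return the n rows with the largest count_col inside each group_col value."""
    # Sort by group ascending and by count descending, so that head(n)
    # does not depend on the incoming row order.
    ordered = df.sort_values([group_col, count_col], ascending=[True, False])
    return ordered.groupby(group_col).head(n)


# Hypothetical sample, deliberately out of order.
my_data = pd.DataFrame({
    "release_year": [1960, 1960, 1960, 1961, 1961, 1961],
    "genre": ["Horror", "Drama", "Action", "Action", "Drama", "Comedy"],
    "count": [7, 13, 8, 7, 16, 10],
})

top2 = top_n_per_group(my_data, "release_year", "count", 2)
print(top2)
```

Note that head(n) keeps exactly n rows per group, so when several genres tie at the cutoff only some of them are kept.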