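Data Mining 4. Advanced Pandas Processing

Advanced Pandas Processing

1. Handling missing values

2. Data discretization

3. Merging

4. Crosstabs and pivot tables

5. Grouping and aggregation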

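# 1. Handling missing values
#   1.1 If the sample is large, simply drop the rows that contain missing values
#   1.2 Replace/impute: fill with the mean or the median
# How to handle NaN
# Check whether the data contains NaN
# pd.isnull(df)   returns True at every position that holds a missing value
# pd.notnull(df)  returns True at every position that is not missing

# Further processing
    # Drop: df.dropna(inplace=True)
        # Drops the rows that contain missing values. With inplace=True the DataFrame itself is modified; with inplace=False a new DataFrame without those rows is returned. The default is False.
    # Replace/impute: df.fillna({col: value}, inplace=...)   value is the number used to fill the gaps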
import pandas as pd
movie = pd.read_csv("./IMDB-Movie-Data.csv")
# Check whether there are missing values
movie.head()
  | Rank | Title | Genre | Description | Director | Actors | Year | Runtime (Minutes) | Rating | Votes | Revenue (Millions) | Metascore
0 | 1 | Guardians of the Galaxy | Action,Adventure,Sci-Fi | A group of intergalactic criminals are forced ... | James Gunn | Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S... | 2014 | 121 | 8.1 | 757074 | 333.13 | 76.0
1 | 2 | Prometheus | Adventure,Mystery,Sci-Fi | Following clues to the origin of mankind, a te... | Ridley Scott | Noomi Rapace, Logan Marshall-Green, Michael Fa... | 2012 | 124 | 7.0 | 485820 | 126.46 | 65.0
2 | 3 | Split | Horror,Thriller | Three girls are kidnapped by a man with a diag... | M. Night Shyamalan | James McAvoy, Anya Taylor-Joy, Haley Lu Richar... | 2016 | 117 | 7.3 | 157606 | 138.12 | 62.0
3 | 4 | Sing | Animation,Comedy,Family | In a city of humanoid animals, a hustling thea... | Christophe Lourdelet | Matthew McConaughey,Reese Witherspoon, Seth Ma... | 2016 | 108 | 7.2 | 60545 | 270.32 | 59.0
4 | 5 | Suicide Squad | Action,Adventure,Fantasy | A secret government agency recruits some of th... | David Ayer | Will Smith, Jared Leto, Margot Robbie, Viola D... | 2016 | 123 | 6.2 | 393727 | 325.02 | 40.0
import numpy as np
np.any(pd.isnull(movie))  # any() is True if at least one element is True, i.e. missing values exist

True
np.all(pd.notnull(movie))  # False means the data contains missing values
False
pd.isnull(movie).any()  # DataFrame method: reports, for each column, whether it contains any missing value
Rank                  False
Title                 False
Genre                 False
Description           False
Director              False
Actors                False
Year                  False
Runtime (Minutes)     False
Rating                False
Votes                 False
Revenue (Millions)     True
Metascore              True
dtype: bool
pd.notnull(movie).all()
Rank                   True
Title                  True
Genre                  True
Description            True
Director               True
Actors                 True
Year                   True
Runtime (Minutes)      True
Rating                 True
Votes                  True
Revenue (Millions)    False
Metascore             False
dtype: bool
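Besides knowing whether a column has missing values, it is often useful to know how many. A minimal sketch, using a small made-up frame (`toy` and its values are illustrative assumptions, not the movie data):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the movie data (values are made up)
toy = pd.DataFrame({
    "Rating": [8.1, 7.0, 7.3],
    "Revenue": [333.13, np.nan, 138.12],
    "Metascore": [np.nan, np.nan, 62.0],
})

missing_per_column = toy.isnull().sum()  # number of NaN values in each column
has_missing = toy.isnull().any()         # True for columns with at least one NaN
print(missing_per_column)
```

Summing the boolean mask works because True counts as 1 and False as 0.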
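# 2. Handling the missing values
# Method 1: drop the samples (rows) that contain missing values
data1 = movie.dropna()  # returns a new DataFrame; movie itself is unchanged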
pd.notnull(movie).all()
Rank                   True
Title                  True
Genre                  True
Description            True
Director               True
Actors                 True
Year                   True
Runtime (Minutes)      True
Rating                 True
Votes                  True
Revenue (Millions)    False
Metascore             False
dtype: bool
pd.notnull(data1).all()
Rank                  True
Title                 True
Genre                 True
Description           True
Director              True
Actors                True
Year                  True
Runtime (Minutes)     True
Rating                True
Votes                 True
Revenue (Millions)    True
Metascore             True
dtype: bool
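# Method 2: replace
# Columns that contain missing values:
# Revenue (Millions)    False
# Metascore             False
# 1. Compute the mean of each column that has missing values and fill the gaps with it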
movie.fillna({"Revenue (Millions)":movie["Revenue (Millions)"].mean()},inplace=True)
movie.fillna({"Metascore":movie["Metascore"].mean()},inplace=True)
movie.head()
  | Rank | Title | Genre | Description | Director | Actors | Year | Runtime (Minutes) | Rating | Votes | Revenue (Millions) | Metascore
0 | 1 | Guardians of the Galaxy | Action,Adventure,Sci-Fi | A group of intergalactic criminals are forced ... | James Gunn | Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S... | 2014 | 121 | 8.1 | 757074 | 333.13 | 76.0
1 | 2 | Prometheus | Adventure,Mystery,Sci-Fi | Following clues to the origin of mankind, a te... | Ridley Scott | Noomi Rapace, Logan Marshall-Green, Michael Fa... | 2012 | 124 | 7.0 | 485820 | 126.46 | 65.0
2 | 3 | Split | Horror,Thriller | Three girls are kidnapped by a man with a diag... | M. Night Shyamalan | James McAvoy, Anya Taylor-Joy, Haley Lu Richar... | 2016 | 117 | 7.3 | 157606 | 138.12 | 62.0
3 | 4 | Sing | Animation,Comedy,Family | In a city of humanoid animals, a hustling thea... | Christophe Lourdelet | Matthew McConaughey,Reese Witherspoon, Seth Ma... | 2016 | 108 | 7.2 | 60545 | 270.32 | 59.0
4 | 5 | Suicide Squad | Action,Adventure,Fantasy | A secret government agency recruits some of th... | David Ayer | Will Smith, Jared Leto, Margot Robbie, Viola D... | 2016 | 123 | 6.2 | 393727 | 325.02 | 40.0
pd.notnull(movie).all()  # the missing values have been filled; none remain
Rank                  True
Title                 True
Genre                 True
Description           True
Director              True
Actors                True
Year                  True
Runtime (Minutes)     True
Rating                True
Votes                 True
Revenue (Millions)    True
Metascore             True
dtype: bool
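The notes at the start mention the median as an alternative to the mean. A minimal sketch of a median fill, assuming a made-up `Revenue` column with one outlier (the frame and its values are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up column: one missing value and one large outlier
toy = pd.DataFrame({"Revenue": [10.0, np.nan, 30.0, 200.0]})

median_value = toy["Revenue"].median()          # NaN is skipped; median of 10, 30, 200
filled = toy.fillna({"Revenue": median_value})  # inplace defaults to False: a new frame is returned

print(median_value)
print(filled)
```

The median is less sensitive to extreme values than the mean, which may matter for skewed columns such as revenue.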
# Handling missing values that are marked with some other symbol
path = "https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data"
name = ["Sample code number", "Clump Thickness", "Uniformity of Cell Size", "Uniformity of Cell Shape", "Marginal Adhesion", "Single Epithelial Cell Size", "Bare Nuclei", "Bland Chromatin", "Normal Nucleoli", "Mitoses", "Class"]
data=pd.read_csv(path,names=name)
data.head()
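  | Sample code number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class
0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | 2
1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | 2
2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | 2
3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | 2
4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | 2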
# Approach
# 1. Replace the marker with np.nan:   df.replace(to_replace="?", value=np.nan)
# 2. Then handle np.nan with the same steps as before
data_new=data.replace(to_replace="?",value=np.nan)
data_new.dropna(inplace=True)
data_new.isnull().any()  # all False means no missing values remain
Sample code number             False
Clump Thickness                False
Uniformity of Cell Size        False
Uniformity of Cell Shape       False
Marginal Adhesion              False
Single Epithelial Cell Size    False
Bare Nuclei                    False
Bland Chromatin                False
Normal Nucleoli                False
Mitoses                        False
Class                          False
dtype: bool
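An alternative to replacing the marker afterwards is to tell `pd.read_csv` about it while loading, through its `na_values` parameter. A minimal sketch, assuming three made-up rows and shortened hypothetical column names in place of the real file:

```python
import io
import pandas as pd

# Made-up rows in the same comma-separated layout; "?" marks a missing value
raw = io.StringIO("1000025,5,1\n1057013,8,?\n1096800,6,10\n")

# na_values="?" makes read_csv treat "?" as NaN while parsing
df = pd.read_csv(raw, names=["id", "thickness", "bare_nuclei"], na_values="?")
clean = df.dropna()

print(df.dtypes)
print(clean)
```

Parsed this way the affected column comes out numeric, whereas after `replace` followed by `dropna` it may still hold strings and need an explicit conversion.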
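# Data discretization, one-hot encoding, dummy variables
# Useful when you want to represent categorical (discrete) information
    # 1. Binning
        # 1) Automatic binning: pd.qcut(data, q)   data: the data; q: number of quantile-based bins, each holding roughly the same number of samples
        # 2) Custom binning: sr = pd.cut(data, bins)   bins: a list of the interval edges you define
    # 2. Convert the binned result to one-hot encoding
        # pd.get_dummies(sr, prefix=...)   prefix: string prepended to each generated column name
# 1) Prepare the data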
data = pd.Series([165,174,160,180,159,163,192,184], index=['No1:165', 'No2:174','No3:160', 'No4:180', 'No5:159', 'No6:163', 'No7:192', 'No8:184'])
data
No1:165    165
No2:174    174
No3:160    160
No4:180    180
No5:159    159
No6:163    163
No7:192    192
No8:184    184
dtype: int64
# 2) Binning
 # Automatic binning
ar=pd.qcut(data,3)
ar
No1:165      (163.667, 178.0]
No2:174      (163.667, 178.0]
No3:160    (158.999, 163.667]
No4:180        (178.0, 192.0]
No5:159    (158.999, 163.667]
No6:163    (158.999, 163.667]
No7:192        (178.0, 192.0]
No8:184        (178.0, 192.0]
dtype: category
Categories (3, interval[float64, right]): [(158.999, 163.667] < (163.667, 178.0] < (178.0, 192.0]]
ar.value_counts()
(158.999, 163.667]    3
(178.0, 192.0]        3
(163.667, 178.0]      2
Name: count, dtype: int64
# 3) Convert to one-hot encoding
pd.get_dummies(ar,prefix="height")
         height_(158.999, 163.667]  height_(163.667, 178.0]  height_(178.0, 192.0]
No1:165                      False                     True                  False
No2:174                      False                     True                  False
No3:160                       True                    False                  False
No4:180                      False                    False                   True
No5:159                       True                    False                  False
No6:163                       True                    False                  False
No7:192                      False                    False                   True
No8:184                      False                    False                   True
# Custom binning
bins = [150, 165, 180, 195]
sr = pd.cut(data,bins)
sr
No1:165    (150, 165]
No2:174    (165, 180]
No3:160    (150, 165]
No4:180    (165, 180]
No5:159    (150, 165]
No6:163    (150, 165]
No7:192    (180, 195]
No8:184    (180, 195]
dtype: category
Categories (3, interval[int64, right]): [(150, 165] < (165, 180] < (180, 195]]
sr.value_counts()
(150, 165]    4
(165, 180]    2
(180, 195]    2
Name: count, dtype: int64
# get_dummies
pd.get_dummies(sr,prefix="身高")  # the prefix "身高" means "height"
         身高_(150, 165]  身高_(165, 180]  身高_(180, 195]
No1:165           True          False          False
No2:174          False           True          False
No3:160           True          False          False
No4:180          False           True          False
No5:159           True          False          False
No6:163           True          False          False
No7:192          False          False           True
No8:184          False          False           True
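Interval strings make for unwieldy column names. `pd.cut` accepts a `labels` argument that names each bin, and `pd.get_dummies` then uses those names. A minimal sketch with the same heights and edges as above; the label names `short`, `medium`, `tall` are hypothetical:

```python
import pandas as pd

heights = pd.Series([165, 174, 160, 180, 159, 163, 192, 184])
bins = [150, 165, 180, 195]
labels = ["short", "medium", "tall"]  # hypothetical names, one per interval

groups = pd.cut(heights, bins, labels=labels)      # right-closed intervals by default
dummies = pd.get_dummies(groups, prefix="height")  # one column per label

print(groups.value_counts())
print(dummies.head())
```

There must be exactly one label per interval, that is `len(bins) - 1` labels.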
## Case study
stock = pd.read_csv("./stock_day/stock_day.csv")
p_change = stock["p_change"]
p_change.head()
2018-02-27    2.68
2018-02-26    3.02
2018-02-23    2.42
2018-02-22    1.64
2018-02-14    2.05
Name: p_change, dtype: float64
# 2) Binning
sr = pd.qcut(p_change,10)
sr.value_counts()
p_change
(-10.030999999999999, -4.836]    65
(-0.462, 0.26]                   65
(0.26, 0.94]                     65
(5.27, 10.03]                    65
(-4.836, -2.444]                 64
(-2.444, -1.352]                 64
(-1.352, -0.462]                 64
(1.738, 2.938]                   64
(2.938, 5.27]                    64
(0.94, 1.738]                    63
Name: count, dtype: int64
# 3) Discretize
pd.get_dummies(sr, prefix="涨跌幅")  # the prefix "涨跌幅" means "price change (%)"
           | 涨跌幅_(-10.030999999999999, -4.836] | 涨跌幅_(-4.836, -2.444] | 涨跌幅_(-2.444, -1.352] | 涨跌幅_(-1.352, -0.462] | 涨跌幅_(-0.462, 0.26] | 涨跌幅_(0.26, 0.94] | 涨跌幅_(0.94, 1.738] | 涨跌幅_(1.738, 2.938] | 涨跌幅_(2.938, 5.27] | 涨跌幅_(5.27, 10.03]
2018-02-27 | False | False | False | False | False | False | False | True | False | False
2018-02-26 | False | False | False | False | False | False | False | False | True | False
2018-02-23 | False | False | False | False | False | False | False | True | False | False
2018-02-22 | False | False | False | False | False | False | True | False | False | False
2018-02-14 | False | False | False | False | False | False | False | True | False | False
...
2015-03-06 | False | False | False | False | False | False | False | False | False | True
2015-03-05 | False | False | False | False | False | False | False | True | False | False
2015-03-04 | False | False | False | False | False | False | True | False | False | False
2015-03-03 | False | False | False | False | False | False | True | False | False | False
2015-03-02 | False | False | False | False | False | False | False | True | False | False

643 rows × 10 columns

bins = [-100,-7,-5,-3,0,3,5,7,100]
sr = pd.cut(p_change,bins)
sr
2018-02-27      (0, 3]
2018-02-26      (3, 5]
2018-02-23      (0, 3]
2018-02-22      (0, 3]
2018-02-14      (0, 3]
                ...   
2015-03-06    (7, 100]
2015-03-05      (0, 3]
2015-03-04      (0, 3]
2015-03-03      (0, 3]
2015-03-02      (0, 3]
Name: p_change, Length: 643, dtype: category
Categories (8, interval[int64, right]): [(-100, -7] < (-7, -5] < (-5, -3] < (-3, 0] < (0, 3] < (3, 5] < (5, 7] < (7, 100]]
sr.value_counts()
p_change
(0, 3]        215
(-3, 0]       188
(3, 5]         57
(-5, -3]       51
(5, 7]         35
(7, 100]       35
(-100, -7]     34
(-7, -5]       28
Name: count, dtype: int64
stock_change = pd.get_dummies(sr,prefix="rise")
stock_change.head()
           | rise_(-100, -7] | rise_(-7, -5] | rise_(-5, -3] | rise_(-3, 0] | rise_(0, 3] | rise_(3, 5] | rise_(5, 7] | rise_(7, 100]
2018-02-27 | False | False | False | False | True | False | False | False
2018-02-26 | False | False | False | False | False | True | False | False
2018-02-23 | False | False | False | False | True | False | False | False
2018-02-22 | False | False | False | False | True | False | False | False
2018-02-14 | False | False | False | False | True | False | False | False
# Merging
# In NumPy
    # np.concatenate((a, b), axis=...)
    # np.hstack()
    # np.vstack()
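The three NumPy functions listed above are closely related. A minimal sketch on two small made-up arrays (`a` and `b` are illustrative):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)  # stack vertically, shape (4, 2)
cols = np.concatenate((a, b), axis=1)  # stack horizontally, shape (2, 4)

# For 2-D arrays, vstack and hstack give the same results as the two calls above
same_rows = np.array_equal(rows, np.vstack((a, b)))
same_cols = np.array_equal(cols, np.hstack((a, b)))
print(rows.shape, cols.shape, same_rows, same_cols)
```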

# In pandas
    # 1) Concatenate along an axis: pd.concat([data1, data2], axis=0)   axis defaults to 0 (stack rows vertically); axis=1 places the frames side by side
stock.head()
        
           | open | high | close | low | volume | price_change | p_change | ma5 | ma10 | ma20 | v_ma5 | v_ma10 | v_ma20 | turnover
2018-02-27 | 23.53 | 25.88 | 24.16 | 23.53 | 95578.03 | 0.63 | 2.68 | 22.942 | 22.142 | 22.875 | 53782.64 | 46738.65 | 55576.11 | 2.39
2018-02-26 | 22.80 | 23.78 | 23.53 | 22.80 | 60985.11 | 0.69 | 3.02 | 22.406 | 21.955 | 22.942 | 40827.52 | 42736.34 | 56007.50 | 1.53
2018-02-23 | 22.88 | 23.37 | 22.82 | 22.71 | 52914.01 | 0.54 | 2.42 | 21.938 | 21.929 | 23.022 | 35119.58 | 41871.97 | 56372.85 | 1.32
2018-02-22 | 22.25 | 22.76 | 22.28 | 22.02 | 36105.01 | 0.36 | 1.64 | 21.446 | 21.909 | 23.137 | 35397.58 | 39904.78 | 60149.60 | 0.90
2018-02-14 | 21.49 | 21.99 | 21.92 | 21.48 | 23331.04 | 0.44 | 2.05 | 21.366 | 21.923 | 23.253 | 33590.21 | 42935.74 | 61716.11 | 0.58
stock_change.head()
           | rise_(-100, -7] | rise_(-7, -5] | rise_(-5, -3] | rise_(-3, 0] | rise_(0, 3] | rise_(3, 5] | rise_(5, 7] | rise_(7, 100]
2018-02-27 | False | False | False | False | True | False | False | False
2018-02-26 | False | False | False | False | False | True | False | False
2018-02-23 | False | False | False | False | True | False | False | False
2018-02-22 | False | False | False | False | True | False | False | False
2018-02-14 | False | False | False | False | True | False | False | False
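# axis=1 joins the frames side by side, aligning rows on the index
# Note: the NaN values in the last rows below, and the 648-row total (643 + 5) of the axis=0 result further down, suggest stock_change held only its first 5 rows when this output was captured
pd.concat([stock,stock_change],axis=1)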
           | open | high | close | low | volume | price_change | p_change | ma5 | ma10 | ma20 | ... | v_ma20 | turnover | rise_(-100, -7] | rise_(-7, -5] | rise_(-5, -3] | rise_(-3, 0] | rise_(0, 3] | rise_(3, 5] | rise_(5, 7] | rise_(7, 100]
2018-02-27 | 23.53 | 25.88 | 24.16 | 23.53 | 95578.03 | 0.63 | 2.68 | 22.942 | 22.142 | 22.875 | ... | 55576.11 | 2.39 | False | False | False | False | True | False | False | False
2018-02-26 | 22.80 | 23.78 | 23.53 | 22.80 | 60985.11 | 0.69 | 3.02 | 22.406 | 21.955 | 22.942 | ... | 56007.50 | 1.53 | False | False | False | False | False | True | False | False
2018-02-23 | 22.88 | 23.37 | 22.82 | 22.71 | 52914.01 | 0.54 | 2.42 | 21.938 | 21.929 | 23.022 | ... | 56372.85 | 1.32 | False | False | False | False | True | False | False | False
2018-02-22 | 22.25 | 22.76 | 22.28 | 22.02 | 36105.01 | 0.36 | 1.64 | 21.446 | 21.909 | 23.137 | ... | 60149.60 | 0.90 | False | False | False | False | True | False | False | False
2018-02-14 | 21.49 | 21.99 | 21.92 | 21.48 | 23331.04 | 0.44 | 2.05 | 21.366 | 21.923 | 23.253 | ... | 61716.11 | 0.58 | False | False | False | False | True | False | False | False
...
2015-03-06 | 13.17 | 14.48 | 14.28 | 13.13 | 179831.72 | 1.12 | 8.51 | 13.112 | 13.112 | 13.112 | ... | 115090.18 | 6.16 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2015-03-05 | 12.88 | 13.45 | 13.16 | 12.87 | 93180.39 | 0.26 | 2.02 | 12.820 | 12.820 | 12.820 | ... | 98904.79 | 3.19 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2015-03-04 | 12.80 | 12.92 | 12.90 | 12.61 | 67075.44 | 0.20 | 1.57 | 12.707 | 12.707 | 12.707 | ... | 100812.93 | 2.30 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2015-03-03 | 12.52 | 13.06 | 12.70 | 12.52 | 139071.61 | 0.18 | 1.44 | 12.610 | 12.610 | 12.610 | ... | 117681.67 | 4.76 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2015-03-02 | 12.25 | 12.67 | 12.52 | 12.20 | 96291.73 | 0.32 | 2.62 | 12.520 | 12.520 | 12.520 | ... | 96291.73 | 3.30 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

643 rows × 22 columns

pd.concat([stock,stock_change],axis=0)
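           | open | high | close | low | volume | price_change | p_change | ma5 | ma10 | ma20 | ... | v_ma20 | turnover | rise_(-100, -7] | rise_(-7, -5] | rise_(-5, -3] | rise_(-3, 0] | rise_(0, 3] | rise_(3, 5] | rise_(5, 7] | rise_(7, 100]
2018-02-27 | 23.53 | 25.88 | 24.16 | 23.53 | 95578.03 | 0.63 | 2.68 | 22.942 | 22.142 | 22.875 | ... | 55576.11 | 2.39 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2018-02-26 | 22.80 | 23.78 | 23.53 | 22.80 | 60985.11 | 0.69 | 3.02 | 22.406 | 21.955 | 22.942 | ... | 56007.50 | 1.53 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2018-02-23 | 22.88 | 23.37 | 22.82 | 22.71 | 52914.01 | 0.54 | 2.42 | 21.938 | 21.929 | 23.022 | ... | 56372.85 | 1.32 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2018-02-22 | 22.25 | 22.76 | 22.28 | 22.02 | 36105.01 | 0.36 | 1.64 | 21.446 | 21.909 | 23.137 | ... | 60149.60 | 0.90 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2018-02-14 | 21.49 | 21.99 | 21.92 | 21.48 | 23331.04 | 0.44 | 2.05 | 21.366 | 21.923 | 23.253 | ... | 61716.11 | 0.58 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
...
2018-02-27 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | False | False | False | False | True | False | False | False
2018-02-26 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | False | False | False | False | False | True | False | False
2018-02-23 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | False | False | False | False | True | False | False | False
2018-02-22 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | False | False | False | False | True | False | False | False
2018-02-14 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | False | False | False | False | True | False | False | False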

648 rows × 22 columns

# 2) Merge on key columns: pd.merge(left, right, how="inner", on=[key columns])
left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                        'key2': ['K0', 'K1', 'K0', 'K1'],
                        'A': ['A0', 'A1', 'A2', 'A3'],
                        'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                        'key2': ['K0', 'K0', 'K0', 'K0'],
                        'C': ['C0', 'C1', 'C2', 'C3'],
                        'D': ['D0', 'D1', 'D2', 'D3']})

left
  key1 key2   A   B
0   K0   K0  A0  B0
1   K0   K1  A1  B1
2   K1   K0  A2  B2
3   K2   K1  A3  B3
right
  key1 key2   C   D
0   K0   K0  C0  D0
1   K1   K0  C1  D1
2   K1   K0  C2  D2
3   K2   K0  C3  D3
pd.merge(left,right,how="inner",on=["key1","key2"])  # inner join: keep only key pairs present in both frames
  key1 key2   A   B   C   D
0   K0   K0  A0  B0  C0  D0
1   K1   K0  A2  B2  C1  D1
2   K1   K0  A2  B2  C2  D2
pd.merge(left,right,how="left",on=["key1","key2"])  # left join: keep every row of left
  key1 key2   A   B    C    D
0   K0   K0  A0  B0   C0   D0
1   K0   K1  A1  B1  NaN  NaN
2   K1   K0  A2  B2   C1   D1
3   K1   K0  A2  B2   C2   D2
4   K2   K1  A3  B3  NaN  NaN
pd.merge(left,right,how="right",on=["key1","key2"])  # right join: keep every row of right
  key1 key2    A    B   C   D
0   K0   K0   A0   B0  C0  D0
1   K1   K0   A2   B2  C1  D1
2   K1   K0   A2   B2  C2  D2
3   K2   K0  NaN  NaN  C3  D3
pd.merge(left,right,how="outer",on=["key1","key2"])  # outer join: keep the rows of both frames
  key1 key2    A    B    C    D
0   K0   K0   A0   B0   C0   D0
1   K0   K1   A1   B1  NaN  NaN
2   K1   K0   A2   B2   C1   D1
3   K1   K0   A2   B2   C2   D2
4   K2   K0  NaN  NaN   C3   D3
5   K2   K1   A3   B3  NaN  NaN
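When checking a join it can help to see where each row came from. `pd.merge` has an `indicator` parameter that adds a `_merge` column for this. A minimal sketch on two small made-up frames with a single hypothetical `key` column:

```python
import pandas as pd

# Small made-up frames with one key column
left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K0", "K1", "K3"], "C": ["C0", "C1", "C3"]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
merged = pd.merge(left, right, how="outer", on="key", indicator=True)
print(merged)
```

Filtering on `_merge` is one way to list keys that failed to match on either side.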
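# Crosstabs and pivot tables:
# Used to explore the relationship between two variables
# Crosstab: counts how many samples fall into each combination of values from two columns (how one column is distributed across the groups of another)
# pd.crosstab(value1, value2)
# Example: relate the day of the week to the price change with pd.crosstab(weekday column, price-change column)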
stock.index
Index(['2018-02-27', '2018-02-26', '2018-02-23', '2018-02-22', '2018-02-14',
       '2018-02-13', '2018-02-12', '2018-02-09', '2018-02-08', '2018-02-07',
       ...
       '2015-03-13', '2015-03-12', '2015-03-11', '2015-03-10', '2015-03-09',
       '2015-03-06', '2015-03-05', '2015-03-04', '2015-03-03', '2015-03-02'],
      dtype='object', length=643)
# pandas datetime type
data = pd.to_datetime(stock.index)
data.year
Index([2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
       ...
       2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015],
      dtype='int32', length=643)
data.weekday  # day of the week (Monday=0 ... Sunday=6)
Index([1, 0, 4, 3, 2, 1, 0, 4, 3, 2,
       ...
       4, 3, 2, 1, 0, 4, 3, 2, 1, 0],
      dtype='int32', length=643)
# Prepare the weekday column
stock["week"] = data.weekday
stock
           | open | high | close | low | volume | price_change | p_change | ma5 | ma10 | ma20 | v_ma5 | v_ma10 | v_ma20 | turnover | week
2018-02-27 | 23.53 | 25.88 | 24.16 | 23.53 | 95578.03 | 0.63 | 2.68 | 22.942 | 22.142 | 22.875 | 53782.64 | 46738.65 | 55576.11 | 2.39 | 1
2018-02-26 | 22.80 | 23.78 | 23.53 | 22.80 | 60985.11 | 0.69 | 3.02 | 22.406 | 21.955 | 22.942 | 40827.52 | 42736.34 | 56007.50 | 1.53 | 0
2018-02-23 | 22.88 | 23.37 | 22.82 | 22.71 | 52914.01 | 0.54 | 2.42 | 21.938 | 21.929 | 23.022 | 35119.58 | 41871.97 | 56372.85 | 1.32 | 4
2018-02-22 | 22.25 | 22.76 | 22.28 | 22.02 | 36105.01 | 0.36 | 1.64 | 21.446 | 21.909 | 23.137 | 35397.58 | 39904.78 | 60149.60 | 0.90 | 3
2018-02-14 | 21.49 | 21.99 | 21.92 | 21.48 | 23331.04 | 0.44 | 2.05 | 21.366 | 21.923 | 23.253 | 33590.21 | 42935.74 | 61716.11 | 0.58 | 2
...
2015-03-06 | 13.17 | 14.48 | 14.28 | 13.13 | 179831.72 | 1.12 | 8.51 | 13.112 | 13.112 | 13.112 | 115090.18 | 115090.18 | 115090.18 | 6.16 | 4
2015-03-05 | 12.88 | 13.45 | 13.16 | 12.87 | 93180.39 | 0.26 | 2.02 | 12.820 | 12.820 | 12.820 | 98904.79 | 98904.79 | 98904.79 | 3.19 | 3
2015-03-04 | 12.80 | 12.92 | 12.90 | 12.61 | 67075.44 | 0.20 | 1.57 | 12.707 | 12.707 | 12.707 | 100812.93 | 100812.93 | 100812.93 | 2.30 | 2
2015-03-03 | 12.52 | 13.06 | 12.70 | 12.52 | 139071.61 | 0.18 | 1.44 | 12.610 | 12.610 | 12.610 | 117681.67 | 117681.67 | 117681.67 | 4.76 | 1
2015-03-02 | 12.25 | 12.67 | 12.52 | 12.20 | 96291.73 | 0.32 | 2.62 | 12.520 | 12.520 | 12.520 | 96291.73 | 96291.73 | 96291.73 | 3.30 | 0

643 rows × 15 columns

# Prepare the price-change column (1 if p_change is positive, otherwise 0)
stock["pona"] = np.where(stock["p_change"]>0,1,0)
stock
           | open | high | close | low | volume | price_change | p_change | ma5 | ma10 | ma20 | v_ma5 | v_ma10 | v_ma20 | turnover | week | pona
2018-02-27 | 23.53 | 25.88 | 24.16 | 23.53 | 95578.03 | 0.63 | 2.68 | 22.942 | 22.142 | 22.875 | 53782.64 | 46738.65 | 55576.11 | 2.39 | 1 | 1
2018-02-26 | 22.80 | 23.78 | 23.53 | 22.80 | 60985.11 | 0.69 | 3.02 | 22.406 | 21.955 | 22.942 | 40827.52 | 42736.34 | 56007.50 | 1.53 | 0 | 1
2018-02-23 | 22.88 | 23.37 | 22.82 | 22.71 | 52914.01 | 0.54 | 2.42 | 21.938 | 21.929 | 23.022 | 35119.58 | 41871.97 | 56372.85 | 1.32 | 4 | 1
2018-02-22 | 22.25 | 22.76 | 22.28 | 22.02 | 36105.01 | 0.36 | 1.64 | 21.446 | 21.909 | 23.137 | 35397.58 | 39904.78 | 60149.60 | 0.90 | 3 | 1
2018-02-14 | 21.49 | 21.99 | 21.92 | 21.48 | 23331.04 | 0.44 | 2.05 | 21.366 | 21.923 | 23.253 | 33590.21 | 42935.74 | 61716.11 | 0.58 | 2 | 1
...
2015-03-06 | 13.17 | 14.48 | 14.28 | 13.13 | 179831.72 | 1.12 | 8.51 | 13.112 | 13.112 | 13.112 | 115090.18 | 115090.18 | 115090.18 | 6.16 | 4 | 1
2015-03-05 | 12.88 | 13.45 | 13.16 | 12.87 | 93180.39 | 0.26 | 2.02 | 12.820 | 12.820 | 12.820 | 98904.79 | 98904.79 | 98904.79 | 3.19 | 3 | 1
2015-03-04 | 12.80 | 12.92 | 12.90 | 12.61 | 67075.44 | 0.20 | 1.57 | 12.707 | 12.707 | 12.707 | 100812.93 | 100812.93 | 100812.93 | 2.30 | 2 | 1
2015-03-03 | 12.52 | 13.06 | 12.70 | 12.52 | 139071.61 | 0.18 | 1.44 | 12.610 | 12.610 | 12.610 | 117681.67 | 117681.67 | 117681.67 | 4.76 | 1 | 1
2015-03-02 | 12.25 | 12.67 | 12.52 | 12.20 | 96291.73 | 0.32 | 2.62 | 12.520 | 12.520 | 12.520 | 96291.73 | 96291.73 | 96291.73 | 3.30 | 0 | 1

643 rows × 16 columns

# Call crosstab to show the sample counts
data = pd.crosstab(stock["week"],stock["pona"])
data
pona   0   1
week
0     63  62
1     55  76
2     61  71
3     63  65
4     59  68
data.div(data.sum(axis=1),axis=0)
pona         0         1
week
0     0.504000  0.496000
1     0.419847  0.580153
2     0.462121  0.537879
3     0.492188  0.507812
4     0.464567  0.535433
data.div(data.sum(axis=1),axis=0).plot(kind="bar",stacked=True)
<Axes: xlabel='week'>

[Figure: stacked bar chart of the share of down (0) and up (1) days for each weekday]

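# Pivot table: DataFrame.pivot_table shows the proportions directly
stock.pivot_table(["pona"],["week"])  # group pona by week; the default aggregation is the mean, so each value is the share of days with pona == 1
          pona
week
0     0.496000
1     0.580153
2     0.537879
3     0.507812
4     0.535433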
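The row proportions can also be requested from `pd.crosstab` itself through its `normalize` parameter. A minimal sketch on a tiny made-up frame (the `toy` values are illustrative, not the stock data), compared against the manual division and the pivot table:

```python
import pandas as pd

# Tiny made-up data: weekday and an up(1)/down(0) flag
toy = pd.DataFrame({"week": [0, 0, 1, 1, 1, 2],
                    "pona": [1, 0, 1, 1, 0, 1]})

counts = pd.crosstab(toy["week"], toy["pona"])
ratios = pd.crosstab(toy["week"], toy["pona"], normalize="index")  # each row sums to 1
manual = counts.div(counts.sum(axis=1), axis=0)                    # same result by hand

# The mean of a 0/1 column is the share of ones, which is what pivot_table reports
up_ratio = toy.pivot_table(values="pona", index="week", aggfunc="mean")
print(ratios)
print(up_ratio)
```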
# Grouping and aggregation
# DataFrame
col =pd.DataFrame({'color': ['white','red','green','red','green'], 'object': ['pen','pencil','pencil','ashtray','pen'],'price1':[5.56,4.20,1.30,0.56,2.75],'price2':[4.75,4.12,1.60,0.75,3.15]})

col
   color   object  price1  price2
0  white      pen    5.56    4.75
1    red   pencil    4.20    4.12
2  green   pencil    1.30    1.60
3    red  ashtray    0.56    0.75
4  green      pen    2.75    3.15
# Group by color and aggregate price1
col.groupby(by="color")["price1"].max()
color
green    2.75
red      4.20
white    5.56
Name: price1, dtype: float64
# The same result, starting from the Series
col["price1"].groupby(col["color"]).max()
color
green    2.75
red      4.20
white    5.56
Name: price1, dtype: float64
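Several aggregations can be computed in one pass with `agg`. A minimal sketch that rebuilds the same `col` frame; the choice of `max`, `mean` and `count` is only an illustration:

```python
import pandas as pd

col = pd.DataFrame({
    "color": ["white", "red", "green", "red", "green"],
    "object": ["pen", "pencil", "pencil", "ashtray", "pen"],
    "price1": [5.56, 4.20, 1.30, 0.56, 2.75],
    "price2": [4.75, 4.12, 1.60, 0.75, 3.15],
})

# One row per color, one column per aggregation of price1
summary = col.groupby("color")["price1"].agg(["max", "mean", "count"])
print(summary)
```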
# 1. Prepare the data
starbucks = pd.read_csv("directory.csv")
starbucks.head()
  | Brand | Store Number | Store Name | Ownership Type | Street Address | City | State/Province | Country | Postcode | Phone Number | Timezone | Longitude | Latitude
0 | Starbucks | 47370-257954 | Meritxell, 96 | Licensed | Av. Meritxell, 96 | Andorra la Vella | 7 | AD | AD500 | 376818720 | GMT+1:00 Europe/Andorra | 1.53 | 42.51
1 | Starbucks | 22331-212325 | Ajman Drive Thru | Licensed | 1 Street 69, Al Jarf | Ajman | AJ | AE | NaN | NaN | GMT+04:00 Asia/Dubai | 55.47 | 25.42
2 | Starbucks | 47089-256771 | Dana Mall | Licensed | Sheikh Khalifa Bin Zayed St. | Ajman | AJ | AE | NaN | NaN | GMT+04:00 Asia/Dubai | 55.47 | 25.39
3 | Starbucks | 22126-218024 | Twofour 54 | Licensed | Al Salam Street | Abu Dhabi | AZ | AE | NaN | NaN | GMT+04:00 Asia/Dubai | 54.38 | 24.48
4 | Starbucks | 17127-178586 | Al Ain Tower | Licensed | Khaldiya Area, Abu Dhabi Island | Abu Dhabi | AZ | AE | NaN | NaN | GMT+04:00 Asia/Dubai | 54.54 | 24.51
starbucks.groupby("Country").count()["Brand"].sort_values(ascending=False)[:10].plot(kind="bar",)
<Axes: xlabel='Country'>

[Figure: bar chart of the 10 countries with the most Starbucks stores]

# Group by several keys
starbucks.groupby(by= ["Country","State/Province"]).count()
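Country | State/Province | Brand | Store Number | Store Name | Ownership Type | Street Address | City | Postcode | Phone Number | Timezone | Longitude | Latitude
AD | 7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
AE | AJ | 2 | 2 | 2 | 2 | 2 | 2 | 0 | 0 | 2 | 2 | 2
   | AZ | 48 | 48 | 48 | 48 | 48 | 48 | 7 | 20 | 48 | 48 | 48
   | DU | 82 | 82 | 82 | 82 | 82 | 82 | 16 | 50 | 82 | 82 | 82
   | FU | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 0 | 2 | 2 | 2
...
US | WV | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 23 | 25 | 25 | 25
   | WY | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 22 | 23 | 23 | 23
VN | HN | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6
   | SG | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 17 | 19 | 19 | 19
ZA | GT | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 3 | 3 | 3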

545 rows × 11 columns

Comprehensive case study

# 1. Prepare the data
movie = pd.read_csv("IMDB-Movie-Data.csv")
movie
    | Rank | Title | Genre | Description | Director | Actors | Year | Runtime (Minutes) | Rating | Votes | Revenue (Millions) | Metascore
0   | 1 | Guardians of the Galaxy | Action,Adventure,Sci-Fi | A group of intergalactic criminals are forced ... | James Gunn | Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S... | 2014 | 121 | 8.1 | 757074 | 333.13 | 76.0
1   | 2 | Prometheus | Adventure,Mystery,Sci-Fi | Following clues to the origin of mankind, a te... | Ridley Scott | Noomi Rapace, Logan Marshall-Green, Michael Fa... | 2012 | 124 | 7.0 | 485820 | 126.46 | 65.0
2   | 3 | Split | Horror,Thriller | Three girls are kidnapped by a man with a diag... | M. Night Shyamalan | James McAvoy, Anya Taylor-Joy, Haley Lu Richar... | 2016 | 117 | 7.3 | 157606 | 138.12 | 62.0
3   | 4 | Sing | Animation,Comedy,Family | In a city of humanoid animals, a hustling thea... | Christophe Lourdelet | Matthew McConaughey,Reese Witherspoon, Seth Ma... | 2016 | 108 | 7.2 | 60545 | 270.32 | 59.0
4   | 5 | Suicide Squad | Action,Adventure,Fantasy | A secret government agency recruits some of th... | David Ayer | Will Smith, Jared Leto, Margot Robbie, Viola D... | 2016 | 123 | 6.2 | 393727 | 325.02 | 40.0
...
995 | 996 | Secret in Their Eyes | Crime,Drama,Mystery | A tight-knit team of rising investigators, alo... | Billy Ray | Chiwetel Ejiofor, Nicole Kidman, Julia Roberts... | 2015 | 111 | 6.2 | 27585 | NaN | 45.0
996 | 997 | Hostel: Part II | Horror | Three American college students studying abroa... | Eli Roth | Lauren German, Heather Matarazzo, Bijou Philli... | 2007 | 94 | 5.5 | 73152 | 17.54 | 46.0
997 | 998 | Step Up 2: The Streets | Drama,Music,Romance | Romantic sparks occur between two dance studen... | Jon M. Chu | Robert Hoffman, Briana Evigan, Cassie Ventura,... | 2008 | 98 | 6.2 | 70699 | 58.01 | 50.0
998 | 999 | Search Party | Adventure,Comedy | A pair of friends embark on a mission to reuni... | Scot Armstrong | Adam Pally, T.J. Miller, Thomas Middleditch,Sh... | 2014 | 93 | 5.6 | 4881 | NaN | 22.0
999 | 1000 | Nine Lives | Comedy,Family,Fantasy | A stuffy businessman finds himself trapped ins... | Barry Sonnenfeld | Kevin Spacey, Jennifer Garner, Robbie Amell,Ch... | 2016 | 87 | 5.3 | 12435 | 19.64 | 11.0

1000 rows × 12 columns

# Mean rating
movie["Rating"].mean()
6.723199999999999
# Number of distinct directors
np.unique(movie["Director"]).size
644
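pandas also offers `Series.nunique()` for counting distinct values without going through NumPy. A minimal sketch on a short made-up Series (the names and the missing entry are illustrative):

```python
import pandas as pd

# Made-up director column with a repeat and a missing entry
directors = pd.Series(["James Gunn", "Ridley Scott", "James Gunn", None])

n_directors = directors.nunique()                 # missing values are ignored by default
n_with_missing = directors.nunique(dropna=False)  # counts the missing value as its own entry
print(n_directors, n_with_missing)
```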
# To see how Rating and Runtime are distributed in this movie data, how should the data be presented?
movie["Rating"].plot(kind="hist")
<Axes: ylabel='Frequency'>

[Figure: histogram of Rating]

import matplotlib.pyplot as plt
# 1. Create the figure
plt.figure(figsize=(20,8),dpi=80)
# 2. Draw the histogram with 20 bins
plt.hist(movie["Rating"],20)
# Set the x ticks to the 21 bin edges
plt.xticks(np.linspace(movie["Rating"].min(),movie["Rating"].max(),21))
# Add a grid
plt.grid(linestyle="--",alpha=0.5)
# 3. Show the figure
plt.show()

[Figure: histogram of Rating with 20 bins, ticks at the bin edges, and a dashed grid]

# To summarise the movie genres (Genre), how should the data be processed?
# 1. Build a list that stores the genres, one sub-list per movie
movie_genre = [i.split(",") for i in movie["Genre"]]
    
movie_genre
[['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Mystery', 'Sci-Fi'],
 ['Horror', 'Thriller'],
 ['Animation', 'Comedy', 'Family'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy', 'Drama', 'Music'],
 ['Comedy'],
 ['Action', 'Adventure', 'Biography'],
 ['Adventure', 'Drama', 'Romance'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Comedy', 'Drama'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Thriller'],
 ['Biography', 'Drama'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Adventure', 'Drama', 'Thriller'],
 ['Drama'],
 ['Crime', 'Drama', 'Horror'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy'],
 ['Action', 'Adventure', 'Drama'],
 ['Horror', 'Thriller'],
 ['Comedy'],
 ['Action', 'Adventure', 'Drama'],
 ['Comedy'],
 ['Drama', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Comedy'],
 ['Action', 'Horror', 'Sci-Fi'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Drama', 'Sci-Fi'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Western'],
 ['Comedy', 'Drama'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama'],
 ['Horror'],
 ['Biography', 'Drama', 'History'],
 ['Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Thriller'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy', 'Drama'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Crime', 'Drama'],
 ['Adventure', 'Drama', 'History'],
 ['Crime', 'Horror', 'Thriller'],
 ['Drama', 'Romance'],
 ['Comedy', 'Drama', 'Romance'],
 ['Biography', 'Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Crime', 'Drama', 'Mystery'],
 ['Drama', 'Romance', 'Thriller'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Action', 'Adventure', 'Comedy'],
 ['Drama', 'History', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama'],
 ['Action', 'Drama', 'Thriller'],
 ['Drama', 'History'],
 ['Action', 'Drama', 'Romance'],
 ['Drama', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Sci-Fi'],
 ['Adventure', 'Drama', 'War'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Comedy', 'Fantasy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Biography', 'Comedy', 'Crime'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Crime', 'Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Crime', 'Drama'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama'],
 ['Comedy', 'Crime', 'Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Comedy', 'Crime'],
 ['Animation', 'Drama', 'Fantasy'],
 ['Horror', 'Mystery', 'Sci-Fi'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Biography', 'Crime', 'Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Adventure', 'Drama', 'Sci-Fi'],
 ['Crime', 'Mystery', 'Thriller'],
 ['Action', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Thriller'],
 ['Comedy'],
 ['Action', 'Adventure', 'Drama'],
 ['Drama'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Action', 'Horror', 'Thriller'],
 ['Biography', 'Drama', 'History'],
 ['Romance', 'Sci-Fi'],
 ['Action', 'Fantasy', 'War'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Comedy'],
 ['Horror', 'Thriller'],
 ['Action', 'Biography', 'Drama'],
 ['Drama', 'Horror', 'Mystery'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Adventure', 'Drama', 'Family'],
 ['Adventure', 'Mystery', 'Sci-Fi'],
 ['Adventure', 'Comedy', 'Romance'],
 ['Action'],
 ['Action', 'Thriller'],
 ['Adventure', 'Drama', 'Family'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Crime', 'Mystery'],
 ['Comedy', 'Family', 'Musical'],
 ['Adventure', 'Drama', 'Thriller'],
 ['Drama'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Drama', 'Horror', 'Thriller'],
 ['Drama', 'Music'],
 ['Action', 'Crime', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Drama', 'Romance'],
 ['Mystery', 'Thriller'],
 ['Mystery', 'Thriller', 'Western'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy', 'Family'],
 ['Biography', 'Comedy', 'Drama'],
 ['Drama'],
 ['Drama', 'Western'],
 ['Drama', 'Mystery', 'Romance'],
 ['Comedy', 'Drama'],
 ['Action', 'Drama', 'Mystery'],
 ['Comedy'],
 ['Action', 'Adventure', 'Crime'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Adventure', 'Sci-Fi', 'Thriller'],
 ['Drama'],
 ['Action', 'Crime', 'Drama'],
 ['Drama', 'Horror', 'Mystery'],
 ['Action', 'Horror', 'Sci-Fi'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Comedy', 'Fantasy'],
 ['Action', 'Comedy', 'Mystery'],
 ['Thriller', 'War'],
 ['Action', 'Comedy', 'Crime'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Crime'],
 ['Action', 'Adventure', 'Thriller'],
 ['Drama', 'Fantasy', 'Romance'],
 ['Action', 'Adventure', 'Comedy'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Drama', 'History'],
 ['Action', 'Adventure', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Animation', 'Adventure', 'Family'],
 ['Adventure', 'Horror'],
 ['Drama', 'Romance', 'Sci-Fi'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Family'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Comedy'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Action', 'Adventure', 'Comedy'],
 ['Comedy', 'Romance'],
 ['Horror', 'Mystery'],
 ['Drama', 'Family', 'Fantasy'],
 ['Sci-Fi'],
 ['Drama', 'Thriller'],
 ['Drama', 'Romance'],
 ['Drama', 'War'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Crime', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Romance'],
 ['Drama'],
 ['Crime', 'Drama', 'History'],
 ['Horror', 'Sci-Fi', 'Thriller'],
 ['Action', 'Drama', 'Sport'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Crime', 'Drama', 'Thriller'],
 ['Adventure', 'Biography', 'Drama'],
 ['Biography', 'Drama', 'Thriller'],
 ['Action', 'Comedy', 'Crime'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Biography', 'Drama', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Mystery'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama', 'Horror'],
 ['Comedy', 'Drama', 'Romance'],
 ['Comedy', 'Romance'],
 ['Drama', 'Horror', 'Thriller'],
 ['Action', 'Adventure', 'Drama'],
 ['Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Drama', 'Mystery'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Comedy'],
 ['Drama', 'Horror'],
 ['Action', 'Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Horror', 'Mystery'],
 ['Crime', 'Drama', 'Mystery'],
 ['Comedy', 'Crime'],
 ['Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Family'],
 ['Horror', 'Sci-Fi', 'Thriller'],
 ['Drama', 'Fantasy', 'War'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Adventure', 'Thriller'],
 ['Action', 'Adventure', 'Drama'],
 ['Drama', 'Romance'],
 ['Biography', 'Drama', 'History'],
 ['Drama', 'Horror', 'Thriller'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Action', 'Adventure', 'Romance'],
 ['Action', 'Drama', 'War'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Drama', 'Musical', 'Romance'],
 ['Drama', 'Sci-Fi', 'Thriller'],
 ['Comedy', 'Drama'],
 ['Action', 'Comedy', 'Crime'],
 ['Biography', 'Comedy', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Thriller'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Drama', 'Sci-Fi'],
 ['Horror'],
 ['Drama', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Thriller'],
 ['Comedy', 'Drama'],
 ['Drama'],
 ['Action', 'Adventure', 'Comedy'],
 ['Drama', 'Horror', 'Thriller'],
 ['Comedy'],
 ['Drama', 'Sci-Fi'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Horror'],
 ['Action', 'Adventure', 'Thriller'],
 ['Adventure', 'Fantasy'],
 ['Action', 'Comedy', 'Crime'],
 ['Comedy', 'Drama', 'Music'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Mystery'],
 ['Action', 'Comedy', 'Crime'],
 ['Crime', 'Drama', 'History'],
 ['Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Crime', 'Mystery', 'Thriller'],
 ['Action', 'Adventure', 'Crime'],
 ['Thriller'],
 ['Biography', 'Drama', 'Romance'],
 ['Action', 'Adventure'],
 ['Action', 'Fantasy'],
 ['Action', 'Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Comedy', 'Crime'],
 ['Thriller'],
 ['Action', 'Drama', 'Horror'],
 ['Comedy', 'Music', 'Romance'],
 ['Comedy'],
 ['Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Comedy', 'Drama'],
 ['Biography', 'Crime', 'Drama'],
 ['Drama', 'History'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Biography', 'Drama'],
 ['Horror'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Crime', 'Drama'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Crime', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Biography', 'Drama', 'Sport'],
 ['Drama', 'Romance'],
 ['Drama', 'Horror'],
 ['Adventure', 'Fantasy'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Action', 'Drama', 'Sci-Fi'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Horror'],
 ['Comedy', 'Horror', 'Thriller'],
 ['Action', 'Crime', 'Thriller'],
 ['Crime', 'Drama', 'Music'],
 ['Drama'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Biography', 'Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Biography', 'Comedy', 'Drama'],
 ['Crime', 'Horror', 'Thriller'],
 ['Crime', 'Drama', 'Mystery'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Biography', 'Drama'],
 ['Biography', 'Drama'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Biography', 'Drama'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Sport'],
 ['Drama', 'Romance'],
 ['Comedy', 'Romance'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Crime', 'Drama'],
 ['Action', 'Drama', 'Thriller'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Action', 'Adventure'],
 ['Action', 'Adventure', 'Romance'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Crime', 'Drama'],
 ['Comedy', 'Horror'],
 ['Comedy', 'Fantasy', 'Romance'],
 ['Drama'],
 ['Drama'],
 ['Comedy', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Adventure', 'Sci-Fi', 'Thriller'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy', 'Drama'],
 ['Biography', 'Drama', 'Romance'],
 ['Comedy', 'Fantasy'],
 ['Comedy', 'Drama', 'Fantasy'],
 ['Comedy'],
 ['Horror', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Comedy', 'Horror'],
 ['Comedy', 'Mystery'],
 ['Drama'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Drama', 'Sport'],
 ['Action', 'Adventure'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Drama', 'Sci-Fi'],
 ['Action', 'Mystery', 'Sci-Fi'],
 ['Action', 'Crime', 'Drama'],
 ['Action', 'Crime', 'Fantasy'],
 ['Biography', 'Comedy', 'Drama'],
 ['Action', 'Crime', 'Thriller'],
 ['Biography', 'Crime', 'Drama'],
 ['Drama', 'Sport'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Action', 'Adventure', 'Thriller'],
 ['Comedy', 'Fantasy', 'Horror'],
 ['Drama', 'Sport'],
 ['Horror', 'Thriller'],
 ['Drama', 'History', 'Thriller'],
 ['Animation', 'Action', 'Adventure'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Comedy', 'Family'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Comedy'],
 ['Action', 'Crime', 'Drama'],
 ['Biography', 'Drama'],
 ['Comedy', 'Romance'],
 ['Comedy'],
 ['Drama', 'Fantasy', 'Romance'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy'],
 ['Comedy', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Animation', 'Action', 'Adventure'],
 ['Horror'],
 ['Action', 'Biography', 'Crime'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama', 'Romance'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Drama', 'History', 'Thriller'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Comedy'],
 ['Action', 'Thriller'],
 ['Comedy', 'Music'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Adventure', 'Crime'],
 ['Comedy', 'Drama', 'Horror'],
 ['Drama'],
 ['Drama', 'Mystery', 'Romance'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Drama'],
 ['Action', 'Drama', 'Thriller'],
 ['Drama'],
 ['Action', 'Horror', 'Romance'],
 ['Action', 'Drama', 'Fantasy'],
 ['Action', 'Crime', 'Drama'],
 ['Drama', 'Fantasy', 'Romance'],
 ['Action', 'Crime', 'Thriller'],
 ['Action', 'Mystery', 'Thriller'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Action', 'Horror', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Comedy'],
 ['Action', 'Adventure', 'Horror'],
 ['Action', 'Adventure', 'Thriller'],
 ['Action', 'Crime', 'Drama'],
 ['Comedy', 'Crime', 'Drama'],
 ['Drama', 'Romance'],
 ['Drama', 'Thriller'],
 ['Action', 'Comedy', 'Crime'],
 ['Comedy'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Animation', 'Family', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Thriller'],
 ['Adventure', 'Horror', 'Mystery'],
 ['Action', 'Sci-Fi'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Animation', 'Action', 'Adventure'],
 ['Drama', 'Horror'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Action', 'Horror', 'Mystery'],
 ['Action', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Comedy', 'Crime'],
 ['Comedy', 'Romance'],
 ['Drama', 'Romance'],
 ['Crime', 'Drama', 'Thriller'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Biography', 'Drama'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Adventure', 'Comedy', 'Family'],
 ['Action', 'Adventure', 'Crime'],
 ['Action', 'Crime', 'Mystery'],
 ['Mystery', 'Thriller'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Action', 'Comedy', 'Crime'],
 ['Biography', 'Crime', 'Drama'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Biography', 'Drama', 'History'],
 ['Biography', 'Comedy', 'Drama'],
 ['Drama', 'Thriller'],
 ['Horror', 'Thriller'],
 ['Drama'],
 ['Drama', 'War'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Romance', 'Sci-Fi'],
 ['Action', 'Crime', 'Drama'],
 ['Comedy', 'Drama'],
 ['Animation', 'Action', 'Adventure'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Comedy', 'Drama', 'Family'],
 ['Drama', 'Romance', 'Thriller'],
 ['Comedy', 'Crime', 'Drama'],
 ['Animation', 'Comedy', 'Family'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Horror', 'Sci-Fi'],
 ['Action', 'Crime', 'Sport'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Action', 'Adventure', 'Comedy'],
 ['Mystery', 'Sci-Fi', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Drama', 'Romance'],
 ['Crime', 'Drama', 'Thriller'],
 ['Comedy', 'Drama', 'Music'],
 ['Drama', 'Fantasy', 'Romance'],
 ['Crime', 'Drama', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Comedy', 'Romance'],
 ['Drama', 'Sci-Fi', 'Thriller'],
 ['Drama', 'War'],
 ['Action', 'Crime', 'Drama'],
 ['Sci-Fi', 'Thriller'],
 ['Adventure', 'Drama', 'Horror'],
 ['Comedy', 'Drama', 'Music'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Crime', 'Drama'],
 ['Adventure', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Biography', 'History', 'Thriller'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Drama', 'History'],
 ['Biography', 'Comedy', 'Drama'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Biography', 'Drama'],
 ['Action', 'Drama', 'Sci-Fi'],
 ['Adventure', 'Horror'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Mystery'],
 ['Comedy', 'Drama', 'Romance'],
 ['Horror', 'Thriller'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Biography', 'Drama'],
 ['Action', 'Crime', 'Drama'],
 ['Action', 'Crime', 'Mystery'],
 ['Action', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Thriller'],
 ['Crime', 'Drama'],
 ['Mystery', 'Thriller'],
 ['Mystery', 'Sci-Fi', 'Thriller'],
 ['Action', 'Mystery', 'Sci-Fi'],
 ['Drama', 'Romance'],
 ['Drama', 'Thriller'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Biography', 'Drama', 'Sport'],
 ['Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Biography', 'Drama', 'Romance'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama', 'Sci-Fi', 'Thriller'],
 ['Drama', 'Romance', 'Thriller'],
 ['Mystery', 'Thriller'],
 ['Mystery', 'Thriller'],
 ['Action', 'Drama', 'Fantasy'],
 ['Action', 'Adventure', 'Biography'],
 ['Adventure', 'Comedy', 'Sci-Fi'],
 ['Action', 'Adventure', 'Thriller'],
 ['Fantasy', 'Horror'],
 ['Horror', 'Mystery'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Adventure', 'Drama'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Comedy', 'Drama'],
 ['Comedy', 'Drama'],
 ['Crime', 'Drama', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Animation', 'Comedy', 'Family'],
 ['Comedy', 'Drama'],
 ['Comedy', 'Drama'],
 ['Biography', 'Drama', 'Sport'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Drama', 'History'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Mystery'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action'],
 ['Action', 'Adventure', 'Family'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Drama', 'Romance'],
 ['Biography', 'Drama', 'Sport'],
 ['Action', 'Fantasy', 'Thriller'],
 ['Biography', 'Drama', 'Sport'],
 ['Action', 'Drama', 'Fantasy'],
 ['Adventure', 'Sci-Fi', 'Thriller'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Drama', 'Romance'],
 ['Crime', 'Drama', 'Mystery'],
 ['Comedy', 'Romance', 'Sport'],
 ['Comedy', 'Family'],
 ['Drama', 'Horror', 'Mystery'],
 ['Action', 'Drama', 'Sport'],
 ['Action', 'Adventure', 'Comedy'],
 ['Drama', 'Mystery', 'Sci-Fi'],
 ['Animation', 'Action', 'Comedy'],
 ['Action', 'Crime', 'Drama'],
 ['Action', 'Crime', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Animation', 'Action', 'Adventure'],
 ['Crime', 'Drama'],
 ['Drama'],
 ['Drama'],
 ['Comedy', 'Crime'],
 ['Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Fantasy', 'Romance'],
 ['Comedy', 'Drama'],
 ['Drama', 'Fantasy', 'Thriller'],
 ['Biography', 'Crime', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Crime', 'Drama'],
 ['Sci-Fi'],
 ['Action', 'Biography', 'Drama'],
 ['Action', 'Comedy', 'Romance'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Comedy', 'Crime', 'Drama'],
 ['Action', 'Fantasy', 'Horror'],
 ['Drama', 'Horror'],
 ['Horror'],
 ['Action', 'Thriller'],
 ['Action', 'Adventure', 'Mystery'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy', 'Drama', 'Romance'],
 ['Crime', 'Drama', 'Mystery'],
 ['Adventure', 'Comedy', 'Family'],
 ['Comedy', 'Drama', 'Romance'],
 ['Comedy'],
 ['Comedy', 'Drama', 'Horror'],
 ['Drama', 'Horror', 'Thriller'],
 ['Animation', 'Adventure', 'Family'],
 ['Comedy', 'Romance'],
 ['Mystery', 'Romance', 'Sci-Fi'],
 ['Crime', 'Drama'],
 ['Drama', 'Horror', 'Mystery'],
 ['Comedy'],
 ['Biography', 'Drama'],
 ['Comedy', 'Drama', 'Thriller'],
 ['Comedy', 'Western'],
 ['Drama', 'History', 'War'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Drama'],
 ['Comedy', 'Drama'],
 ['Fantasy', 'Horror', 'Thriller'],
 ['Drama', 'Romance'],
 ['Action', 'Comedy', 'Fantasy'],
 ['Drama', 'Horror', 'Musical'],
 ['Crime', 'Drama', 'Mystery'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Comedy', 'Music'],
 ['Drama'],
 ['Biography', 'Crime', 'Drama'],
 ['Drama'],
 ['Action', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Drama'],
 ['Action', 'Comedy', 'Crime'],
 ['Comedy', 'Drama', 'Romance'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Comedy', 'Crime'],
 ['Drama'],
 ['Drama', 'Romance'],
 ['Crime', 'Drama', 'Mystery'],
 ['Adventure', 'Comedy', 'Romance'],
 ['Comedy', 'Crime', 'Drama'],
 ['Adventure', 'Drama', 'Thriller'],
 ['Biography', 'Crime', 'Drama'],
 ['Crime', 'Drama', 'Thriller'],
 ['Drama', 'History', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Comedy'],
 ['Horror'],
 ['Action', 'Crime', 'Mystery'],
 ['Comedy', 'Romance'],
 ['Comedy'],
 ['Action', 'Drama', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Fantasy', 'Horror'],
 ['Drama', 'Romance'],
 ['Biography', 'Drama'],
 ['Biography', 'Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Action', 'Horror', 'Sci-Fi'],
 ['Drama', 'Romance'],
 ['Biography', 'Drama'],
 ['Action', 'Adventure', 'Drama'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Drama', 'Family'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Romance', 'Sci-Fi'],
 ['Action', 'Adventure', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Crime', 'Drama', 'Horror'],
 ['Comedy', 'Fantasy'],
 ['Action', 'Comedy', 'Crime'],
 ['Adventure', 'Drama', 'Romance'],
 ['Action', 'Crime', 'Drama'],
 ['Crime', 'Horror', 'Thriller'],
 ['Romance', 'Sci-Fi', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Crime', 'Drama'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Animation', 'Fantasy'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama', 'Mystery', 'War'],
 ['Comedy', 'Romance'],
 ['Animation', 'Comedy', 'Family'],
 ['Comedy'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Action', 'Adventure', 'Drama'],
 ['Comedy'],
 ['Drama'],
 ['Adventure', 'Biography', 'Drama'],
 ['Comedy'],
 ['Horror', 'Thriller'],
 ['Action', 'Drama', 'Family'],
 ['Comedy', 'Fantasy', 'Horror'],
 ['Comedy', 'Romance'],
 ['Drama', 'Mystery', 'Romance'],
 ['Action', 'Adventure', 'Comedy'],
 ['Thriller'],
 ['Comedy'],
 ['Adventure', 'Comedy', 'Sci-Fi'],
 ['Comedy', 'Drama', 'Fantasy'],
 ['Mystery', 'Thriller'],
 ['Comedy', 'Drama'],
 ['Adventure', 'Drama', 'Family'],
 ['Horror', 'Thriller'],
 ['Action', 'Drama', 'Romance'],
 ['Drama', 'Romance'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy'],
 ['Action', 'Biography', 'Drama'],
 ['Drama', 'Mystery', 'Romance'],
 ['Adventure', 'Drama', 'Western'],
 ['Drama', 'Music', 'Romance'],
 ['Comedy', 'Romance', 'Western'],
 ['Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Horror', 'Thriller'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Horror', 'Mystery'],
 ['Comedy', 'Crime', 'Drama'],
 ['Action', 'Comedy', 'Romance'],
 ['Biography', 'Drama', 'History'],
 ['Adventure', 'Drama'],
 ['Drama', 'Thriller'],
 ['Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Biography', 'Drama'],
 ['Drama', 'Music'],
 ['Comedy', 'Drama'],
 ['Drama', 'Thriller', 'War'],
 ['Action', 'Mystery', 'Thriller'],
 ['Horror', 'Sci-Fi', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Sci-Fi'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Mystery', 'Romance'],
 ['Drama'],
 ['Action', 'Adventure', 'Thriller'],
 ['Action', 'Crime', 'Thriller'],
 ['Animation', 'Action', 'Adventure'],
 ['Drama', 'Fantasy', 'Mystery'],
 ['Drama', 'Sci-Fi'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Horror', 'Thriller'],
 ['Action', 'Thriller'],
 ['Comedy'],
 ['Biography', 'Drama'],
 ['Action', 'Mystery', 'Thriller'],
 ['Action', 'Mystery', 'Sci-Fi'],
 ['Crime', 'Drama', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Drama', 'Romance'],
 ['Biography', 'Drama', 'Thriller'],
 ['Drama'],
 ['Action', 'Adventure', 'Family'],
 ['Animation', 'Comedy', 'Family'],
 ['Action', 'Crime', 'Drama'],
 ['Comedy'],
 ['Comedy', 'Crime', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Animation', 'Comedy', 'Drama'],
 ['Action', 'Crime', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Adventure', 'Biography', 'Drama'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Action', 'Comedy', 'Sci-Fi'],
 ['Comedy', 'Fantasy', 'Horror'],
 ['Comedy', 'Crime'],
 ['Animation', 'Action', 'Adventure'],
 ['Action', 'Drama', 'Thriller'],
 ['Fantasy', 'Horror'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Comedy', 'Drama', 'Romance'],
 ['Biography', 'Drama', 'Romance'],
 ['Action', 'Drama', 'History'],
 ['Action', 'Adventure', 'Comedy'],
 ['Horror', 'Thriller'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Crime', 'Drama', 'Mystery'],
 ['Adventure', 'Biography', 'Drama'],
 ['Horror', 'Mystery', 'Thriller'],
 ['Horror', 'Thriller'],
 ['Drama', 'Romance', 'War'],
 ['Adventure', 'Fantasy', 'Mystery'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Biography', 'Drama'],
 ['Drama', 'Thriller'],
 ['Horror', 'Thriller'],
 ['Drama', 'Horror', 'Thriller'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Horror', 'Thriller'],
 ['Comedy'],
 ['Drama', 'Sport'],
 ['Comedy', 'Family'],
 ['Drama', 'Romance'],
 ['Action', 'Adventure', 'Comedy'],
 ['Comedy'],
 ['Mystery', 'Romance', 'Thriller'],
 ['Crime', 'Drama'],
 ['Action', 'Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Biography', 'Drama', 'Romance'],
 ['Comedy', 'Crime'],
 ['Drama', 'Thriller'],
 ['Drama'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Action', 'Thriller'],
 ['Drama', 'Thriller'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Thriller'],
 ['Biography', 'Drama', 'Sport'],
 ['Crime', 'Drama', 'Thriller'],
 ['Drama', 'Music'],
 ['Crime', 'Drama', 'Thriller'],
 ['Drama', 'Romance'],
 ['Animation', 'Action', 'Adventure'],
 ['Comedy', 'Drama'],
 ['Action', 'Adventure', 'Drama'],
 ['Biography', 'Crime', 'Drama'],
 ['Horror'],
 ['Biography', 'Drama', 'Mystery'],
 ['Drama', 'Romance'],
 ['Animation', 'Drama', 'Romance'],
 ['Comedy', 'Family'],
 ['Drama'],
 ['Mystery', 'Thriller'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Drama', 'Romance'],
 ['Biography', 'Drama', 'History'],
 ['Comedy', 'Family'],
 ['Action', 'Adventure', 'Thriller'],
 ['Comedy', 'Drama'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Thriller'],
 ['Drama', 'Romance'],
 ['Comedy', 'Drama', 'Romance'],
 ['Drama', 'Horror', 'Sci-Fi'],
 ['Comedy', 'Horror', 'Romance'],
 ['Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Action', 'Adventure', 'Drama'],
 ['Biography', 'Comedy', 'Drama'],
 ['Drama', 'Mystery', 'Romance'],
 ['Animation', 'Adventure', 'Comedy'],
 ['Drama', 'Romance', 'Sci-Fi'],
 ['Drama'],
 ['Drama', 'Fantasy'],
 ['Drama', 'Romance'],
 ['Comedy', 'Horror', 'Thriller'],
 ['Comedy', 'Drama', 'Romance'],
 ['Crime', 'Drama'],
 ['Comedy', 'Romance'],
 ['Action', 'Drama', 'Family'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Thriller', 'War'],
 ['Action', 'Comedy', 'Horror'],
 ['Biography', 'Drama', 'Sport'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Romance'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Adventure', 'Crime'],
 ['Comedy', 'Romance'],
 ['Animation', 'Action', 'Adventure'],
 ['Action', 'Crime', 'Sci-Fi'],
 ['Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Crime', 'Thriller'],
 ['Comedy', 'Horror', 'Sci-Fi'],
 ['Drama', 'Thriller'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Thriller'],
 ['Adventure', 'Drama', 'Family'],
 ['Mystery', 'Sci-Fi', 'Thriller'],
 ['Biography', 'Crime', 'Drama'],
 ['Drama', 'Fantasy', 'Horror'],
 ['Action', 'Adventure', 'Thriller'],
 ['Crime', 'Drama', 'Horror'],
 ['Crime', 'Drama', 'Fantasy'],
 ['Adventure', 'Family', 'Fantasy'],
 ['Action', 'Adventure', 'Drama'],
 ['Action', 'Comedy', 'Horror'],
 ['Comedy', 'Drama', 'Family'],
 ['Action', 'Thriller'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Drama'],
 ['Drama'],
 ['Comedy'],
 ['Drama'],
 ['Comedy', 'Drama', 'Music'],
 ['Drama', 'Fantasy', 'Music'],
 ['Drama'],
 ['Thriller'],
 ['Comedy', 'Horror'],
 ['Action', 'Comedy', 'Sport'],
 ['Horror'],
 ['Comedy', 'Drama'],
 ['Action', 'Drama', 'Thriller'],
 ['Drama', 'Romance'],
 ['Horror', 'Mystery'],
 ['Adventure', 'Drama', 'Fantasy'],
 ['Thriller'],
 ['Comedy', 'Romance'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Fantasy', 'Mystery', 'Thriller'],
 ['Biography', 'Drama'],
 ['Crime', 'Drama'],
 ['Action', 'Adventure', 'Sci-Fi'],
 ['Adventure'],
 ['Comedy', 'Drama'],
 ['Comedy', 'Drama'],
 ['Comedy', 'Drama', 'Romance'],
 ['Adventure', 'Comedy', 'Drama'],
 ['Action', 'Sci-Fi', 'Thriller'],
 ['Comedy', 'Romance'],
 ['Action', 'Fantasy', 'Horror'],
 ['Crime', 'Drama', 'Thriller'],
 ['Action', 'Drama', 'Thriller'],
 ['Crime', 'Drama', 'Mystery'],
 ['Crime', 'Drama', 'Mystery'],
 ['Drama', 'Sci-Fi', 'Thriller'],
 ['Biography', 'Drama', 'History'],
 ['Crime', 'Horror', 'Thriller'],
 ['Drama'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Adventure', 'Biography'],
 ['Adventure', 'Biography', 'Crime'],
 ['Action', 'Horror', 'Thriller'],
 ['Action', 'Adventure', 'Western'],
 ['Horror', 'Thriller'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Comedy', 'Drama', 'Musical'],
 ['Horror', 'Mystery'],
 ['Biography', 'Drama', 'Sport'],
 ['Comedy', 'Family', 'Romance'],
 ['Drama', 'Mystery', 'Thriller'],
 ['Comedy'],
 ['Drama'],
 ['Drama', 'Thriller'],
 ['Biography', 'Drama', 'Family'],
 ['Comedy', 'Drama', 'Family'],
 ['Drama', 'Fantasy', 'Musical'],
 ['Comedy'],
 ['Adventure', 'Family'],
 ['Adventure', 'Comedy', 'Fantasy'],
 ['Horror', 'Thriller'],
 ['Drama', 'Romance'],
 ['Horror'],
 ['Biography', 'Drama', 'History'],
 ['Action', 'Adventure', 'Fantasy'],
 ['Drama', 'Family', 'Music'],
 ['Comedy', 'Drama', 'Romance'],
 ['Action', 'Adventure', 'Horror'],
 ['Comedy'],
 ['Crime', 'Drama', 'Mystery'],
 ['Horror'],
 ['Drama', 'Music', 'Romance'],
 ['Adventure', 'Comedy'],
 ['Comedy', 'Family', 'Fantasy']]
# Flatten the nested list and keep the unique genre names
movie_class = np.unique([j for i in movie_genre for j in i])
movie_class.size
20
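To make the flatten-and-deduplicate step easier to follow, here is a minimal sketch on a tiny made-up sample. `sample_genre` and `sample_class` are hypothetical names used only for illustration; they stand in for `movie_genre` and `movie_class`.

```python
import numpy as np

# Hypothetical three-movie sample standing in for movie_genre
sample_genre = [
    ["Action", "Adventure", "Sci-Fi"],
    ["Horror", "Thriller"],
    ["Action", "Comedy"],
]

# Flatten the nested list into one flat list of genre names
flat = [genre for genres in sample_genre for genre in genres]

# np.unique removes duplicates and returns the names in sorted order
sample_class = np.unique(flat)

print(sample_class)
print(sample_class.size)
```

Because `np.unique` sorts its result, the genre columns built from it later appear in alphabetical order.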
# Count how many movies fall into each genre
movie
 | Rank | Title | Genre | Description | Director | Actors | Year | Runtime (Minutes) | Rating | Votes | Revenue (Millions) | Metascore
0 | 1 | Guardians of the Galaxy | Action,Adventure,Sci-Fi | A group of intergalactic criminals are forced ... | James Gunn | Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S... | 2014 | 121 | 8.1 | 757074 | 333.13 | 76.0
1 | 2 | Prometheus | Adventure,Mystery,Sci-Fi | Following clues to the origin of mankind, a te... | Ridley Scott | Noomi Rapace, Logan Marshall-Green, Michael Fa... | 2012 | 124 | 7.0 | 485820 | 126.46 | 65.0
2 | 3 | Split | Horror,Thriller | Three girls are kidnapped by a man with a diag... | M. Night Shyamalan | James McAvoy, Anya Taylor-Joy, Haley Lu Richar... | 2016 | 117 | 7.3 | 157606 | 138.12 | 62.0
3 | 4 | Sing | Animation,Comedy,Family | In a city of humanoid animals, a hustling thea... | Christophe Lourdelet | Matthew McConaughey,Reese Witherspoon, Seth Ma... | 2016 | 108 | 7.2 | 60545 | 270.32 | 59.0
4 | 5 | Suicide Squad | Action,Adventure,Fantasy | A secret government agency recruits some of th... | David Ayer | Will Smith, Jared Leto, Margot Robbie, Viola D... | 2016 | 123 | 6.2 | 393727 | 325.02 | 40.0
...
995 | 996 | Secret in Their Eyes | Crime,Drama,Mystery | A tight-knit team of rising investigators, alo... | Billy Ray | Chiwetel Ejiofor, Nicole Kidman, Julia Roberts... | 2015 | 111 | 6.2 | 27585 | NaN | 45.0
996 | 997 | Hostel: Part II | Horror | Three American college students studying abroa... | Eli Roth | Lauren German, Heather Matarazzo, Bijou Philli... | 2007 | 94 | 5.5 | 73152 | 17.54 | 46.0
997 | 998 | Step Up 2: The Streets | Drama,Music,Romance | Romantic sparks occur between two dance studen... | Jon M. Chu | Robert Hoffman, Briana Evigan, Cassie Ventura,... | 2008 | 98 | 6.2 | 70699 | 58.01 | 50.0
998 | 999 | Search Party | Adventure,Comedy | A pair of friends embark on a mission to reuni... | Scot Armstrong | Adam Pally, T.J. Miller, Thomas Middleditch,Sh... | 2014 | 93 | 5.6 | 4881 | NaN | 22.0
999 | 1000 | Nine Lives | Comedy,Family,Fantasy | A stuffy businessman finds himself trapped ins... | Barry Sonnenfeld | Kevin Spacey, Jennifer Garner, Robbie Amell,Ch... | 2016 | 87 | 5.3 | 12435 | 19.64 | 11.0

1000 rows × 12 columns

# One row per movie and one column per genre, initialised to 0
count = pd.DataFrame(np.zeros(shape=[len(movie_genre), movie_class.size], dtype="int32"), columns=movie_class)
count.head()
     Action  Adventure  Animation  Biography  Comedy  Crime  Drama  Family  Fantasy  History  Horror  Music  Musical  Mystery  Romance  Sci-Fi  Sport  Thriller  War  Western
  0       0          0          0          0       0      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
  1       0          0          0          0       0      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
  2       0          0          0          0       0      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
  3       0          0          0          0       0      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
  4       0          0          0          0       0      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
# Fill the table: mark every genre a movie belongs to with 1
for i in range(len(movie_genre)):
    count.loc[i, movie_genre[i]] = 1
count
     Action  Adventure  Animation  Biography  Comedy  Crime  Drama  Family  Fantasy  History  Horror  Music  Musical  Mystery  Romance  Sci-Fi  Sport  Thriller  War  Western
  0       1          1          0          0       0      0      0       0        0        0       0      0        0        0        0       1      0         0    0        0
  1       0          1          0          0       0      0      0       0        0        0       0      0        0        1        0       1      0         0    0        0
  2       0          0          0          0       0      0      0       0        0        0       1      0        0        0        0       0      0         1    0        0
  3       0          0          1          0       1      0      0       1        0        0       0      0        0        0        0       0      0         0    0        0
  4       1          1          0          0       0      0      0       0        1        0       0      0        0        0        0       0      0         0    0        0
...
995       0          0          0          0       0      1      1       0        0        0       0      0        0        1        0       0      0         0    0        0
996       0          0          0          0       0      0      0       0        0        0       1      0        0        0        0       0      0         0    0        0
997       0          0          0          0       0      0      1       0        0        0       0      1        0        0        1       0      0         0    0        0
998       0          1          0          0       1      0      0       0        0        0       0      0        0        0        0       0      0         0    0        0
999       0          0          0          0       1      0      0       1        1        0       0      0        0        0        0       0      0         0    0        0

1000 rows × 20 columns
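The row-by-row loop is easy to read, but pandas can likely build the same 0/1 table in one call. The sketch below is an alternative and not the method used in this post: it assumes the `Genre` column holds comma-separated strings, as in the `movie` table, and uses a made-up three-row frame named `sample`.

```python
import pandas as pd

# Hypothetical sample using the same comma-separated Genre format as the movie table
sample = pd.DataFrame({
    "Title": ["Guardians of the Galaxy", "Split", "Sing"],
    "Genre": ["Action,Adventure,Sci-Fi", "Horror,Thriller", "Animation,Comedy,Family"],
})

# Split each string on "," and build one indicator column per genre:
# 1 where the movie has that genre, 0 otherwise
dummies = sample["Genre"].str.get_dummies(sep=",")

print(dummies)

# Column sums give the number of movies per genre
print(dummies.sum())
```

On the full dataset, `movie["Genre"].str.get_dummies(sep=",")` should give a table equivalent to `count`, without hard-coding the number of rows or genres.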

count.head()
     Action  Adventure  Animation  Biography  Comedy  Crime  Drama  Family  Fantasy  History  Horror  Music  Musical  Mystery  Romance  Sci-Fi  Sport  Thriller  War  Western
  0       1          1          0          0       0      0      0       0        0        0       0      0        0        0        0       1      0         0    0        0
  1       0          1          0          0       0      0      0       0        0        0       0      0        0        1        0       1      0         0    0        0
  2       0          0          0          0       0      0      0       0        0        0       1      0        0        0        0       0      0         1    0        0
  3       0          0          1          0       1      0      0       1        0        0       0      0        0        0        0       0      0         0    0        0
  4       1          1          0          0       0      0      0       0        1        0       0      0        0        0        0       0      0         0    0        0
count.sum().sort_values(ascending=False).plot(kind = "bar",colormap="cool")
<Axes: >

[Figure: bar chart of the number of movies per genre, sorted in descending order]
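If only the per-genre totals are needed, the indicator table can be skipped altogether. The following is a rough sketch of that shortcut, not the approach taken in this post, shown on a hypothetical three-row Series named `sample`: it splits each genre string, puts one genre per row with `explode`, and counts the occurrences.

```python
import pandas as pd

# Hypothetical sample of comma-separated genre strings
sample = pd.Series(
    ["Action,Adventure,Sci-Fi", "Horror,Thriller", "Action,Comedy"],
    name="Genre",
)

# Split into lists, expand to one genre per row, then count occurrences;
# value_counts sorts by count in descending order by default
genre_counts = sample.str.split(",").explode().value_counts()

print(genre_counts)
```

The result is already sorted by count, so calling `.plot(kind="bar")` on it should produce the same kind of chart as `count.sum().sort_values(ascending=False)`.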

