Explore a pandas Series

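This article explores the pandas Series data structure, covering its attributes, operations, and common usage. Worked examples help the reader understand and get comfortable with Series.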

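import pandas as pd
# Load the movie dataset (assumes imdb_1000.csv is in the working directory)
movies = pd.read_csv('imdb_1000.csv')
# Preview the first five rows
movies.head()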
   star_rating                     title  content_rating   genre  duration  actors_list
0          9.3  The Shawshank Redemption               R   Crime       142  [u'Tim Robbins', u'Morgan Freeman', u'Bob Gunt...
1          9.2             The Godfather               R   Crime       175  [u'Marlon Brando', u'Al Pacino', u'James Caan']
2          9.1    The Godfather: Part II               R   Crime       200  [u'Al Pacino', u'Robert De Niro', u'Robert Duv...
3          9.0           The Dark Knight           PG-13  Action       152  [u'Christian Bale', u'Heath Ledger', u'Aaron E...
4          8.9              Pulp Fiction               R   Crime       154  [u'John Travolta', u'Uma Thurman', u'Samuel L....
movies.dtypes
star_rating       float64
title              object
content_rating     object
genre              object
duration            int64
actors_list        object
dtype: object
movies.genre
0          Crime
1          Crime
2          Crime
3         Action
4          Crime
         ...    
974       Comedy
975    Adventure
976       Action
977       Horror
978        Crime
Name: genre, Length: 979, dtype: object
movies.genre.describe()
count       979
unique       16
top       Drama
freq        278
Name: genre, dtype: object
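For an object column, describe() reports four values: count, unique, top, and freq. The sketch below recomputes each of them by hand on a small made-up Series (the toy values are an assumption, not rows from imdb_1000.csv), to show where the numbers come from.

```python
import pandas as pd

# Hypothetical toy data standing in for movies.genre.
genre = pd.Series(['Drama', 'Crime', 'Drama', 'Action', 'Drama', 'Crime'],
                  name='genre')

summary = genre.describe()

# Recompute each entry of describe() with a dedicated method.
counts = genre.value_counts()          # sorted by frequency, descending
manual = {
    'count': genre.count(),            # number of non-null values
    'unique': genre.nunique(),         # number of distinct values
    'top': counts.index[0],            # most frequent value
    'freq': counts.iloc[0],            # how often the top value occurs
}

print(summary)
print(manual)
```

In the real output above, the same reading applies: 979 non-null genres, 16 distinct ones, and Drama as the most frequent with 278 rows.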
movies.genre.value_counts()
Drama        278
Comedy       156
Action       136
Crime        124
Biography     77
Adventure     75
Animation     62
Horror        29
Mystery       16
Western        9
Thriller       5
Sci-Fi         5
Film-Noir      3
Family         2
History        1
Fantasy        1
Name: genre, dtype: int64
movies.genre.value_counts(normalize=True)
Drama        0.283963
Comedy       0.159346
Action       0.138917
Crime        0.126660
Biography    0.078652
Adventure    0.076609
Animation    0.063330
Horror       0.029622
Mystery      0.016343
Western      0.009193
Thriller     0.005107
Sci-Fi       0.005107
Film-Noir    0.003064
Family       0.002043
History      0.001021
Fantasy      0.001021
Name: genre, dtype: float64
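With normalize=True, value_counts() returns each count divided by the total number of counted values, so Drama's 0.283963 above is 278 / 979. A minimal sketch on made-up data (the toy Series is an assumption) that checks this relationship:

```python
import pandas as pd

# Hypothetical toy data standing in for movies.genre.
genre = pd.Series(['Drama', 'Crime', 'Drama', 'Action', 'Drama', 'Crime'])

counts = genre.value_counts()                 # absolute frequencies
shares = genre.value_counts(normalize=True)   # relative frequencies

# Dividing the counts by their total should give the same shares.
manual_shares = counts / counts.sum()

print(shares)
print(manual_shares)
```

The shares always sum to 1, which makes them convenient for comparing distributions across datasets of different sizes.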
type(movies.genre.value_counts())
pandas.core.series.Series
movies.genre.value_counts().head()
Drama        278
Comedy       156
Action       136
Crime        124
Biography     77
Name: genre, dtype: int64
movies.genre.unique()
array(['Crime', 'Action', 'Drama', 'Western', 'Adventure', 'Biography',
       'Comedy', 'Animation', 'Mystery', 'Horror', 'Film-Noir', 'Sci-Fi',
       'History', 'Thriller', 'Family', 'Fantasy'], dtype=object)
movies.genre.nunique()
16
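unique() and nunique() agree here because the genre column has no missing values. When a column does contain missing values they can differ: unique() lists the missing value, while nunique() skips it by default. A small sketch on made-up ratings (the toy Series is an assumption):

```python
import pandas as pd

# Hypothetical toy data with one missing value.
ratings = pd.Series(['R', 'PG', None, 'R'])

distinct = ratings.unique()                # includes the missing value
n_default = ratings.nunique()              # missing values are not counted
n_with_na = ratings.nunique(dropna=False)  # missing value counted as one more

print(distinct)
print(n_default, n_with_na)
```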
pd.crosstab(movies.genre, movies.content_rating)
content_rating  APPROVED   G  GP  NC-17  NOT RATED  PASSED  PG  PG-13    R  TV-MA  UNRATED  X
genre
Action                 3   1   1      0          4       1  11     44   67      0        3  0
Adventure              3   2   0      0          5       1  21     23   17      0        2  0
Animation              3  20   0      0          3       0  25      5    5      0        1  0
Biography              1   2   1      0          1       0   6     29   36      0        0  0
Comedy                 9   2   1      1         16       3  23     23   73      0        4  1
Crime                  6   0   0      1          7       1   6      4   87      0       11  1
Drama                 12   3   0      4         24       1  25     55  143      1        9  1
Family                 0   1   0      0          0       0   1      0    0      0        0  0
Fantasy                0   0   0      0          0       0   0      0    1      0        0  0
Film-Noir              1   0   0      0          1       0   0      0    0      0        1  0
History                0   0   0      0          0       0   0      0    0      0        1  0
Horror                 2   0   0      1          1       0   1      2   16      0        5  1
Mystery                4   1   0      0          1       0   1      2    6      0        1  0
Sci-Fi                 1   0   0      0          0       0   0      1    3      0        0  0
Thriller               1   0   0      0          0       0   1      0    3      0        0  0
Western                1   0   0      0          2       0   2      1    3      0        0  0
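pd.crosstab counts how often each pair of values occurs together. As a rough illustration, the sketch below builds the same kind of table on a tiny made-up DataFrame (the toy rows are an assumption) and reproduces it with groupby, size, and unstack.

```python
import pandas as pd

# Hypothetical toy data standing in for the movies DataFrame.
toy = pd.DataFrame({
    'genre': ['Crime', 'Crime', 'Action', 'Drama', 'Action'],
    'content_rating': ['R', 'R', 'PG-13', 'R', 'R'],
})

# Rows are genres, columns are ratings, cells are co-occurrence counts.
table = pd.crosstab(toy.genre, toy.content_rating)

# The same counts via groupby: count each pair, then pivot ratings into columns.
same = (toy.groupby(['genre', 'content_rating'])
           .size()
           .unstack(fill_value=0))

print(table)
```

One thing to watch for: in the real table above the Action row sums to 135, while value_counts() reports 136 Action movies. A plausible explanation is a movie with a missing content_rating, since rows with a missing value in either column do not appear in the crosstab counts.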
movies.duration.describe()
count    979.000000
mean     120.979571
std       26.218010
min       64.000000
25%      102.000000
50%      117.000000
75%      134.000000
max      242.000000
Name: duration, dtype: float64
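For a numeric column, describe() bundles several separate statistics, and the result is float64 even though duration itself is int64. A minimal sketch that rebuilds the summary piece by piece, using only the five durations visible in the head() output above as sample data:

```python
import pandas as pd

# The five durations shown in movies.head(), used here as a small sample.
duration = pd.Series([142, 175, 200, 152, 154], name='duration')

summary = duration.describe()

# Rebuild the same summary from individual Series methods.
manual = pd.Series({
    'count': duration.count(),
    'mean': duration.mean(),
    'std': duration.std(),            # sample standard deviation (ddof=1)
    'min': duration.min(),
    '25%': duration.quantile(0.25),
    '50%': duration.median(),
    '75%': duration.quantile(0.75),
    'max': duration.max(),
}, dtype='float64')

print(summary)
print(manual)
```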
movies.duration.mean()
120.97957099080695
movies.duration.value_counts()
112    23
113    22
102    20
101    20
129    19
       ..
180     1
177     1
168     1
166     1
64      1
Name: duration, Length: 133, dtype: int64

Bonus time

%matplotlib inline
movies.duration.plot(kind='hist')
<matplotlib.axes._subplots.AxesSubplot at 0x1cc39c5b488>

[Figure: histogram of movies.duration]

movies.genre.value_counts().plot(kind='bar')
<matplotlib.axes._subplots.AxesSubplot at 0x1cc3a3ea948>

[Figure: bar chart of movies.genre.value_counts()]
