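A Series is the building block of a DataFrame: pulling a single column out of a DataFrame gives you a Series (selecting several columns at once returns a smaller DataFrame, not a Series). It is a one-dimensional labeled array, comparable to a NumPy ndarray with an index attached.
Import the pandas library: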
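To make the single-column versus multi-column distinction concrete, here is a minimal sketch. The tiny DataFrame `df` is an assumption for illustration only; it reuses three film scores that appear later in this post rather than loading the real CSV.

```python
import numpy as np
import pandas as pd

# Toy DataFrame (illustrative sample, not the full CSV).
df = pd.DataFrame({
    "FILM": ["Minions (2015)", "Leviathan (2014)", "Aloha (2015)"],
    "RottenTomatoes": [54, 99, 19],
})

one_column = df["RottenTomatoes"]             # single label -> Series
two_columns = df[["FILM", "RottenTomatoes"]]  # list of labels -> DataFrame

print(type(one_column).__name__)   # Series
print(type(two_columns).__name__)  # DataFrame

# The values of a Series are stored as a NumPy array.
print(isinstance(one_column.to_numpy(), np.ndarray))  # True
```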
import pandas as pd
We use a CSV file, fandango_score_comparison.csv, to demonstrate what a Series can do.
Load the CSV file fandango_score_comparison.csv:
fandango = pd.read_csv("fandango_score_comparison.csv")
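If the CSV file is not at hand, the same call can be tried on in-memory text. This is only a sketch: the two-column `csv_text` below is a made-up stand-in, and the real file may contain other columns that are not shown here.

```python
import io
import pandas as pd

# Stand-in CSV content (assumed layout with just two columns).
csv_text = """FILM,RottenTomatoes
Minions (2015),54
Leviathan (2014),99
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)          # (2, 2)
print(list(sample.columns))  # ['FILM', 'RottenTomatoes']
```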
1. Extracting a Series
① Check the type of a Series
series_film = fandango["FILM"]
print(type(series_film))
OUT:
<class 'pandas.core.series.Series'>
② Take the first 5 rows of a Series
series_rt = fandango["RottenTomatoes"]
print(series_rt[:5])
OUT:
0 74
1 85
2 80
3 18
4 14
Name: RottenTomatoes, dtype: int64
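Slicing with `[:5]` is one of several ways to get the first rows. The sketch below, built on a hand-typed Series holding the first six scores shown in this post, suggests that `head(5)` and `iloc[:5]` return the same rows.

```python
import pandas as pd

# First six RottenTomatoes scores, typed in by hand for illustration.
series_rt = pd.Series([74, 85, 80, 18, 14, 63], name="RottenTomatoes")

first_five = series_rt.head(5)   # convenience method
same_rows = series_rt.iloc[:5]   # explicit positional slice

print(first_five.equals(same_rows))  # True
```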
2. Creating a Series
from pandas import Series
film_name = series_film.values
rt_score = series_rt.values
# Use film_name as the new index for rt_score
series_custom = Series(rt_score, index=film_name)
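As an alternative sketch, the same film-indexed Series can probably be obtained in one step with `set_index`. The DataFrame `fandango_sample` is a three-row stand-in (an assumption), not the full dataset.

```python
import pandas as pd

# Three-row stand-in for the fandango DataFrame.
fandango_sample = pd.DataFrame({
    "FILM": ["Minions (2015)", "Leviathan (2014)", "Top Five (2014)"],
    "RottenTomatoes": [54, 99, 86],
})

# Approach used above: pass values and index explicitly.
manual = pd.Series(
    fandango_sample["RottenTomatoes"].to_numpy(),
    index=fandango_sample["FILM"].to_numpy(),
)

# One-step alternative: make FILM the index, then pick the column.
via_set_index = fandango_sample.set_index("FILM")["RottenTomatoes"]

# Both map the same film names to the same scores.
print(manual.to_dict() == via_set_index.to_dict())  # True
```

The `set_index` version also keeps the names "FILM" and "RottenTomatoes" on the index and the Series, which the manual construction drops.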
① Look up the rt_score for the index labels "Minions (2015)" and "Leviathan (2014)"
series_custom[["Minions (2015)", "Leviathan (2014)"]]
OUT:
Minions (2015) 54
Leviathan (2014) 99
dtype: int64
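Label lookups can also be written with `.loc`, which makes it explicit that the keys are index labels. A minimal sketch on a hand-built three-film Series (the variable names are illustrative):

```python
import pandas as pd

scores = pd.Series(
    [54, 99, 86],
    index=["Minions (2015)", "Leviathan (2014)", "Top Five (2014)"],
)

single = scores.loc["Minions (2015)"]                        # one label -> scalar
subset = scores.loc[["Minions (2015)", "Leviathan (2014)"]]  # list -> Series

print(single)                      # 54
print(subset.tolist())             # [54, 99]
print("Minions (2015)" in scores)  # True: `in` checks the index labels
```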
② Look up the rt_score values at positions 5 through 9 (the slice 5:10 excludes the end position)
series_custom[5:10]
OUT:
The Water Diviner (2015) 63
Irrational Man (2015) 42
Top Five (2014) 86
Shaun the Sheep Movie (2015) 99
Love & Mercy (2015) 89
dtype: int64
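Because the index now holds strings, an integer slice is interpreted by position. The sketch below contrasts the explicit forms `.iloc` (positions, end excluded) and `.loc` (labels, both ends included) on the five films from the output above.

```python
import pandas as pd

scores = pd.Series(
    [63, 42, 86, 99, 89],
    index=[
        "The Water Diviner (2015)",
        "Irrational Man (2015)",
        "Top Five (2014)",
        "Shaun the Sheep Movie (2015)",
        "Love & Mercy (2015)",
    ],
)

# Positional slice: positions 1 and 2, the end position is excluded.
by_position = scores.iloc[1:3]

# Label slice: both endpoint labels are included.
by_label = scores.loc["Irrational Man (2015)":"Top Five (2014)"]

print(by_position.equals(by_label))  # True
```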
3. Sorting by index
① Manual approach: sort the index labels, then reindex
original_index = series_custom.index.tolist()
sort_index = sorted(original_index)
sort_by_index = series_custom.reindex(sort_index)
print(sort_by_index[:10])
OUT:
'71 (2015) 97
5 Flights Up (2015) 52
A Little Chaos (2015) 40
A Most Violent Year (2014) 90
About Elly (2015) 97
Aloha (2015) 19
American Sniper (2015) 72
American Ultra (2015) 46
Amy (2015) 97
Annie (2014) 27
dtype: int64
② Built-in approach: Series.sort_index()
sort_series_index = series_custom.sort_index()
print(sort_series_index[:10])
OUT:
'71 (2015) 97
5 Flights Up (2015) 52
A Little Chaos (2015) 40
A Most Violent Year (2014) 90
About Elly (2015) 97
Aloha (2015) 19
American Sniper (2015) 72
American Ultra (2015) 46
Amy (2015) 97
Annie (2014) 27
dtype: int64
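The two approaches can be checked against each other on a small example. This sketch uses four films from the outputs above; `ascending=False` is shown as one extra option of `sort_index`.

```python
import pandas as pd

scores = pd.Series(
    [54, 99, 19, 97],
    index=["Minions (2015)", "Leviathan (2014)", "Aloha (2015)", "Amy (2015)"],
)

manual = scores.reindex(sorted(scores.index))  # sort labels, then reindex
builtin = scores.sort_index()                  # single method call

print(manual.equals(builtin))  # True

# Reverse alphabetical order of the labels.
descending = scores.sort_index(ascending=False)
print(descending.index[0])     # Minions (2015)
```

One practical difference: `reindex` raises an error when the index contains duplicate labels, while `sort_index` handles duplicates, so the built-in method is usually the safer choice.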