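Welcome to the blog of @一夜看尽长安花. Your likes and bookmarks are what keep me writing.
If you spot any mistakes in this article, please point them out and I will correct them promptly. For anything you would like to discuss, contact me at 3329759426@qq.com. The style of my posts varies by column, and each column is self-contained.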
Overview: an introduction to creating a one-dimensional Series in Pandas
Keywords: Pandas, creating a one-dimensional Series
Pandas
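pandas is a fast, powerful, and easy-to-use open-source tool for data analysis and data manipulation, built on the Python programming language.
A Series is a one-dimensional labeled array backed by NumPy.
It looks like a dict because values can be looked up by label, but it also keeps its elements in a fixed order and supports access by position.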
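To make the description above concrete, here is a minimal sketch of the two faces of a Series: array-like storage and dict-like lookup. The `scores` data and its labels are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.Series([90, 75, 60], index=['a', 'b', 'c'])

# The values of a numeric Series are stored in a NumPy array.
backing = scores.to_numpy()
is_ndarray = isinstance(backing, np.ndarray)

# Dict-like access by label ...
by_label = scores['b']
# ... and array-like access by position, which a plain dict does not offer.
by_position = scores.iloc[1]

print(is_ndarray, by_label, by_position)
```

Both lookups reach the same element, once through its label and once through its position.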
import pandas as pd
import numpy as np
# Population of each country, in millions
g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])
g7_pop.name = 'G7 Population in millions'
print(g7_pop)
print(g7_pop.dtype)
print(g7_pop.values)
print(type(g7_pop.values))
print("----------------------")
print(g7_pop[0])
print(g7_pop[1])
print(g7_pop.index)
# We can assign our own index labels
g7_pop.index = [
'Canada',
'France',
'Germany',
'Italy',
'Japan',
'United Kingdom',
'United States'
]
print(g7_pop)
print("----------------------")
# With labels, a Series looks a lot like a dict. Unlike a dict, it also supports positional access and label-based slicing.
# We can also pass the index labels and the name when creating the Series
print(pd.Series({
'Canada': 35.467,
'France': 63.951,
'Germany': 80.940,
'Italy': 60.665,
'Japan': 127.061,
'United Kingdom': 64.511,
'United States': 318.523
}, name='G7 Population in millions'))
# Build a new Series from selected labels; a label missing from g7_pop ('Spain') gets NaN
print(pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain']))
Pandas: Selecting Series Values by Index
indexing
print(g7_pop)
print(g7_pop['Canada'])
print(g7_pop['Japan'])
print("-------------------------------------")
# Once labels are set, values can still be selected by position
print(g7_pop.iloc[0])
print(g7_pop.iloc[-1])
# Selecting several elements at once
print(g7_pop[['Italy', 'France']])
print(g7_pop.iloc[[0,1]])
print("-------------------------------------")
# Pay special attention to slicing
letters = ['a','b','c']
print(letters[:2]) # the stop position is excluded, so 'c' is not returned
# In pandas, label-based slices include both endpoints
print(g7_pop['Canada':'Italy']) # the last label, 'Italy', is included
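The difference between the two slicing styles is easy to miss, so here is a small self-contained sketch. The Series `s` and its labels are hypothetical. It uses `.loc` and `.iloc` explicitly to make clear which kind of slice is meant.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Label-based slice: both endpoints are included.
label_slice = s.loc['a':'c']

# Position-based slice: the stop position is excluded, as with Python lists.
position_slice = s.iloc[0:2]

print(label_slice)
print(position_slice)
```

Writing `.loc` or `.iloc` explicitly is a design choice that keeps the intent visible to the reader, especially when the index labels are themselves integers.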
Pandas: Filtering Series Values by Condition
operations and methods
print(g7_pop)
print(g7_pop * 1_000_000)
print(g7_pop.mean())
print(np.log(g7_pop))
print(g7_pop['France':'Italy'].mean())
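The operations above are vectorized: they apply to every element at once and keep the index labels attached. A minimal sketch with hypothetical data follows. `np.log2` is used here instead of `np.log` only because its results on these values are easy to check by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([1.0, 2.0, 4.0], index=['x', 'y', 'z'])

# Arithmetic is applied element by element; the index is preserved.
doubled = s * 2

# A NumPy universal function also returns a Series with the same index.
logged = np.log2(s)

# Aggregations reduce the Series to a single scalar.
average = s.mean()

print(doubled)
print(logged)
print(average)
```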
conditional selection (boolean arrays)
print(g7_pop)
print(g7_pop>70) # a boolean Series of True/False values
print(g7_pop[g7_pop>70]) # keep only the elements where the mask is True
print(g7_pop.mean())
print(g7_pop[g7_pop>g7_pop.mean()])
print(g7_pop.std())
# ~ not
# | or
# & and
print(g7_pop[(g7_pop>80) | (g7_pop<40)])
print(g7_pop[(g7_pop>80) & (g7_pop<200)])
# Select the values that lie more than half a standard deviation away from the mean
print(g7_pop[(g7_pop<g7_pop.mean() - g7_pop.std()/2) | (g7_pop>g7_pop.mean() + g7_pop.std()/2)])
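Here is a short sketch of how boolean masks combine, using hypothetical data. Each comparison is wrapped in parentheses because `&`, `|`, and `~` bind more tightly than `>` and `<` in Python.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([5, 15, 25, 35], index=['a', 'b', 'c', 'd'])

# A comparison produces a boolean Series with the same index.
mask = s > 20

above = s[mask]                     # elements where the mask is True
between = s[(s > 10) & (s < 30)]    # and: both conditions must hold
outside = s[(s < 10) | (s > 30)]    # or: either condition may hold
not_above = s[~mask]                # not: inverts the mask

print(above)
print(between)
print(outside)
print(not_above)
```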
modifying series
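# Assign a value by label
g7_pop['Canada'] = 40.5
print(g7_pop)
# Assign a value by position (here, the last element)
g7_pop.iloc[-1] = 500
print(g7_pop)
# Assign one value to every element selected by a boolean mask
g7_pop[g7_pop<70] = 99.99
print(g7_pop)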
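One detail worth knowing when modifying a Series by label: if the label does not exist yet, the assignment adds a new element rather than raising an error, so a misspelled label can go unnoticed. The sketch below uses hypothetical data and works on a copy so that the original Series stays unchanged.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])

# Work on a copy so the original Series is left untouched.
t = s.copy()

t['a'] = 10.0       # existing label: the value is overwritten
t.iloc[-1] = 30.0   # by position: the last element is overwritten
t[t < 5] = 0.0      # by mask: every element below 5 is replaced
t['z'] = 99.0       # new label: the Series grows by one element

print(s)
print(t)
```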