Series
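A Series is a one-dimensional, array-like object made up of a sequence of values (of any NumPy data type) and an associated sequence of data labels, called its index.
Creating a Series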
from pandas import Series,DataFrame
import pandas as pd
obj = Series([4,7,-5,3])
obj
0 4
1 7
2 -5
3 3
dtype: int64
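As a minimal sketch of the default behavior shown above: when no index is passed, pandas labels the values 0 through n-1. The variable name `default_labels` is only illustrative.

```python
import pandas as pd

# No index argument: pandas assigns integer labels 0..n-1.
obj = pd.Series([4, 7, -5, 3])

default_labels = list(obj.index)  # the automatically generated labels
print(default_labels)
print(obj.dtype)                  # integer dtype inferred from the data
```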
Specifying index labels
obj = Series([4,7,-5,3],index=['d','b','c','a'])
obj
d 4
b 7
c -5
a 3
dtype: int64
Replacing the index labels
obj.index = ['Bob','Steve','Jeff','Ryan']
obj
Bob 4
Steve 7
Jeff -5
Ryan 3
dtype: int64
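Assigning to `obj.index` replaces all labels at once, so the new list needs one label per value. The sketch below shows what happens when the lengths differ, and uses `rename` to change a single label. The names `error_message` and `renamed` are illustrative.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3], index=['d', 'b', 'c', 'a'])

error_message = None
try:
    # Only 3 labels for 4 values: pandas rejects the assignment.
    obj.index = ['Bob', 'Steve', 'Jeff']
except ValueError as exc:
    error_message = str(exc)
    print("rejected:", error_message)

# rename() with a mapping changes selected labels and returns a new Series.
renamed = obj.rename({'d': 'Bob'})
print(renamed)
```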
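Creating a Series from a dict
If only a dict is passed, the dict's keys become the index of the Series. The output below lists the keys in sorted order, which is what older pandas versions produce; pandas 0.23 and later on Python 3.6+ keep the dict's insertion order instead.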
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
obj = Series(sdata)
obj
Ohio 35000
Oregon 16000
Texas 71000
Utah 5000
dtype: int64
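Whether the index follows the dict's insertion order or sorted key order depends on the pandas version. A small sketch that makes the order explicit with `sort_index()` (the name `by_label` is illustrative):

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj = pd.Series(sdata)

# Every dict key becomes an index label.
print(set(obj.index) == set(sdata))

# sort_index() returns a new Series ordered by label,
# independent of the order in which the dict was written.
by_label = obj.sort_index()
print(by_label)
```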
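Creating a Series from a dict with a list as the index (labels with no matching dict key get the missing value NaN, and dict keys not in the list are dropped)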
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
states = ['California','Ohio','Oregon','Texas']
obj = Series(sdata,index=states)
obj
California NaN
Ohio 35000.0
Oregon 16000.0
Texas 71000.0
dtype: float64
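The matching rules can be checked directly. This is a sketch using the same data as above: 'California' has no entry in the dict, and 'Utah' is not in the index list.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj = pd.Series(sdata, index=states)

print('Utah' in obj)               # the key missing from `states` is dropped
print(pd.isnull(obj['California']))  # the label missing from the dict is NaN
print(obj.dtype)                   # NaN forces a floating-point dtype
```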
Series attributes
.values returns the data as a NumPy array
obj.values
array([ nan, 35000., 16000., 71000.])
.index returns the index object
obj.index
Index(['California', 'Ohio', 'Oregon', 'Texas'], dtype='object')
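A short sketch of what the two attributes return. It also shows that an Index cannot be modified element by element; the flag `mutated` is an illustrative name.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3], index=['d', 'b', 'c', 'a'])

print(isinstance(obj.values, np.ndarray))  # the data as a NumPy array
print(isinstance(obj.index, pd.Index))     # the labels as an Index object

# Index objects are immutable: changing a single label in place fails.
try:
    obj.index[0] = 'x'
    mutated = True
except TypeError:
    mutated = False
print(mutated)
```

To change labels, assign a whole new list to `obj.index` instead.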
.name names the Series object itself, and .index.name names its index
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
states = ['California','Ohio','Oregon','Texas']
obj = Series(sdata,index=states)
obj.name = 'population'
obj.index.name = 'state'
obj
state
California NaN
Ohio 35000.0
Oregon 16000.0
Texas 71000.0
Name: population, dtype: float64
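One place where these names become visible is when the Series is turned into a table. The sketch below passes `name` to the constructor, which is equivalent to assigning `obj.name` afterwards, and then calls `reset_index()`; the variable `df` is illustrative.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

# name= in the constructor sets the same attribute as obj.name = ...
obj = pd.Series(sdata, index=states, name='population')
obj.index.name = 'state'

# reset_index() moves the index into a regular column;
# both names are reused as column headers.
df = obj.reset_index()
print(list(df.columns))
```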
Indexing and lookup
obj = Series([4,7,-5,3],index=['d','b','a','c'])
obj['a']
-5
obj[['c','a','d']]
c 3
a -5
d 4
dtype: int64
print('b' in obj)
print('e' in obj)
True
False
obj[obj > 2]  # filter with a boolean mask
d 4
b 7
c 3
dtype: int64
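The boolean filter above works in two steps: the comparison builds a boolean Series, which then selects the rows where it is True. A sketch of that, plus two other lookup styles (`.loc` for explicit label access and `get` for a lookup with a fallback). The variable names are illustrative.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

mask = obj > 2              # boolean Series with the same index as obj
print(mask)
filtered = obj[mask]        # same result as obj[obj > 2]

by_label = obj.loc['a']     # explicit label-based lookup
missing = obj.get('e', 0)   # get() returns the default instead of raising KeyError
print(by_label, missing)
```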
pd.isnull() detects missing data: it returns True wherever a value is NaN
pd.notnull() is the inverse: it returns False wherever a value is NaN
import pandas as pd
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
states = ['California','Ohio','Oregon','Texas']
obj = Series(sdata,index=states)
print(pd.isnull(obj))
print('----------------------')
print(pd.notnull(obj))
California True
Ohio False
Oregon False
Texas False
dtype: bool
----------------------
California False
Ohio True
Oregon True
Texas True
dtype: bool
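The same checks are also available as Series methods, and the boolean result can be used to count or handle the missing entries. A minimal sketch; the variable names are illustrative.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj = pd.Series(sdata, index=states)

n_missing = obj.isnull().sum()  # True counts as 1, so this counts the NaNs
without_nan = obj.dropna()      # new Series with the NaN rows removed
filled = obj.fillna(0)          # new Series with NaN replaced by 0

print(n_missing)
print(without_nan)
print(filled)
```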
Series operations
Scalar multiplication
obj = Series([4,7,-5,3],index=['d','b','a','c'])
obj * 2
d 8
b 14
a -10
c 6
dtype: int64
Applying math functions
np.exp(x) computes e^x for each element
import numpy as np
np.exp(obj)
d 54.598150
b 1096.633158
a 0.006738
c 20.085537
dtype: float64
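NumPy functions applied to a Series return a Series and keep the index-to-value link. A small sketch that checks this against `math.exp` for one element:

```python
import math

import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
result = np.exp(obj)

print(type(result).__name__)                    # still a Series
print(list(result.index) == list(obj.index))    # labels are unchanged
print(math.isclose(result['a'], math.exp(-5)))  # element-wise e^x
```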
Arithmetic operations automatically align Series with differing indexes
print('-------------obj1---------------')
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
obj1 = Series(sdata)
print(obj1)
print('-------------obj2---------------')
states = ['California','Ohio','Oregon','Texas']
obj2 = Series(sdata,index=states)
print(obj2)
print('-------------obj1+obj2----------')
print(obj1 + obj2)
-------------obj1---------------
Ohio 35000
Oregon 16000
Texas 71000
Utah 5000
dtype: int64
-------------obj2---------------
California NaN
Ohio 35000.0
Oregon 16000.0
Texas 71000.0
dtype: float64
-------------obj1+obj2----------
California NaN
Ohio 70000.0
Oregon 32000.0
Texas 142000.0
Utah NaN
dtype: float64
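In the result above, 'Utah' is NaN because it exists only in obj1, and 'California' is NaN because obj2 holds NaN for it. If a label missing on one side should count as 0, one option is the `add` method with `fill_value`. A sketch with the same data; `plain` and `filled` are illustrative names.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj1 = pd.Series(sdata)
obj2 = pd.Series(sdata, index=states)

plain = obj1 + obj2                    # any label missing on either side gives NaN
filled = obj1.add(obj2, fill_value=0)  # a value missing on one side is treated as 0

print(plain)
print(filled)
```

'Utah' now keeps its value from obj1, while 'California' stays NaN because it has no value on either side.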