Indexing, Selection, and Filtering in Series and DataFrame

Series

from pandas import Series
import numpy as np
obj = Series(np.arange(4),index=['a','b','c','d'])
obj
a    0
b    1
c    2
d    3
dtype: int64
obj['b']
1
obj[1]
1
obj[2:4]
c    2
d    3
dtype: int64
obj[['b','a','d']]
b    1
a    0
d    3
dtype: int64
obj[[1,3]]
b    1
d    3
dtype: int64
obj[obj < 2]
a    0
b    1
dtype: int64
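
Plain `[]` on a Series mixes label lookup and position lookup, which can get confusing once the index itself holds integers. The following is a minimal sketch, assuming a pandas version that provides the `.loc` (labels) and `.iloc` (positions) indexers, that repeats the selections above with the intent made explicit. It also shows one difference worth remembering: a positional slice excludes its end point, while a label slice includes it.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

by_label = obj.loc['b']          # label lookup, same element as obj['b']
by_position = obj.iloc[1]        # position lookup, same element
pos_slice = obj.iloc[2:4]        # positional slice: position 4 is excluded
label_slice = obj.loc['b':'c']   # label slice: the end label 'c' is included
picked = obj.iloc[[1, 3]]        # list of integer positions
small = obj.loc[obj < 2]         # boolean mask

print(by_label, by_position)
print(label_slice)
```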

DataFrame

Indexing columns

from pandas import DataFrame
data = DataFrame(np.arange(16).reshape((4,4,)),
                index=['Ohio','Colorado','Utah','New York'],columns=['one','two','three','four'])
data
          one  two  three  four
Ohio        0    1      2     3
Colorado    4    5      6     7
Utah        8    9     10    11
New York   12   13     14    15
data['two']
Ohio         1
Colorado     5
Utah         9
New York    13
Name: two, dtype: int64
data[['three','one']]
          three  one
Ohio          2    0
Colorado      6    4
Utah         10    8
New York     14   12

Indexing rows

data[:2]  # a slice selects rows
          one  two  three  four
Ohio        0    1      2     3
Colorado    4    5      6     7
data['three'] > 5
Ohio        False
Colorado     True
Utah         True
New York     True
Name: three, dtype: bool
data[data['three'] > 5]
          one  two  three  four
Colorado    4    5      6     7
Utah        8    9     10    11
New York   12   13     14    15
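
Boolean row filters like the one above can be combined. This sketch rebuilds the same `data` frame and shows `&` (and), `|` (or), and `~` (not); the particular thresholds are only illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Each comparison needs its own parentheses, because & and |
# bind more tightly than > and < in Python.
both = data[(data['three'] > 5) & (data['one'] < 12)]
either = data[(data['three'] <= 5) | (data['four'] == 15)]
negated = data[~(data['three'] > 5)]

print(both)
print(either)
print(negated)
```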
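Indexing options for DataFrame

Type                    Description
obj[val]                Selects a single column or a group of columns from the DataFrame. Convenient in a few special cases: a boolean array (filters rows), a slice (slices rows), or a boolean DataFrame (sets values based on a condition).
obj.ix[val]             Selects a single row or a group of rows from the DataFrame.
obj.ix[:,val]           Selects a single column or a subset of columns.
obj.ix[val1,val2]       Selects rows and columns at the same time.
reindex                 Conforms one or more axes to a new index.
xs                      Selects a single row or column by label and returns a Series.
icol, irow              Select a single column or row by integer position and return a Series.
get_value, set_value    Select a single value by row label and column label.

Note: the examples below use the older .ix indexer, which was removed in pandas 1.0. In current versions, .loc handles label-based selection and .iloc handles position-based selection.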
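
A few of the methods listed above can be tried directly. This is a minimal sketch, assuming a pandas version that provides `reindex`, `xs`, and the scalar accessors `.at` and `.iat`, which are used here in place of `get_value` and `set_value` because those, to my knowledge, are no longer available in pandas 1.0 and later. The label `'Texas'` is a hypothetical one added only to show how `reindex` handles a label that does not exist.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# reindex: conform the row axis to a new index; the unknown label
# 'Texas' (hypothetical) produces a row filled with NaN.
reordered = data.reindex(['Utah', 'Ohio', 'Texas'])

# xs: select one row (default) or one column (axis=1) by label.
row = data.xs('Utah')
col = data.xs('two', axis=1)

# .at reads or writes a single value by row label and column label;
# .iat does the same by integer position.
value = data.at['Colorado', 'three']
data.at['Colorado', 'three'] = 60
same_cell = data.iat[1, 2]

print(reordered)
print(value, same_cell)
```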
data.ix['Colorado',['two','three']]
two      5
three    6
Name: Colorado, dtype: int64
data.ix[['Colorado','Utah'],[3,0,1]]
          four  one  two
Colorado     7    4    5
Utah        11    8    9
data.ix[2]  # returns the row at integer position 2 (Utah) as a Series
one       8
two       9
three    10
four     11
Name: Utah, dtype: int64
data.ix[:'Utah','two']   
Ohio        1
Colorado    5
Utah        9
Name: two, dtype: int64
data.ix[data.three > 5, :3]
          one  two  three
Colorado    4    5      6
Utah        8    9     10
New York   12   13     14
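
The `.ix` calls above will not run on pandas 1.0 or later, where `.ix` no longer exists. The sketch below shows one possible way to express the same five selections with `.loc` and `.iloc`. Where the original mixes row labels with column positions, the positions are first turned into labels through `data.columns`; that is one choice among several, not the only translation.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# data.ix['Colorado', ['two', 'three']]
a = data.loc['Colorado', ['two', 'three']]

# data.ix[['Colorado', 'Utah'], [3, 0, 1]]: row labels + column positions
b = data.loc[['Colorado', 'Utah'], data.columns[[3, 0, 1]]]

# data.ix[2]: the row at integer position 2
c = data.iloc[2]

# data.ix[:'Utah', 'two']: label slice, end label included
d = data.loc[:'Utah', 'two']

# data.ix[data.three > 5, :3]: boolean rows + first three columns
e = data.loc[data['three'] > 5, data.columns[:3]]

print(b)
print(e)
```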

Assignment

data < 5
            one    two  three   four
Ohio       True   True   True   True
Colorado   True  False  False  False
Utah      False  False  False  False
New York  False  False  False  False
data[data < 5] = 0
data
          one  two  three  four
Ohio        0    0      0     0
Colorado    0    5      6     7
Utah        8    9     10    11
New York   12   13     14    15
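
The assignment above modifies `data` in place. When the original frame should stay untouched, `DataFrame.mask` and `DataFrame.where` return a new frame instead. This is a small sketch of that alternative, not something the original example uses.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# mask replaces values where the condition is True.
masked = data.mask(data < 5, 0)

# where keeps values where the condition is True and replaces the rest,
# so the inverted condition gives the same result.
kept = data.where(data >= 5, 0)

print(masked)
print(data.loc['Ohio', 'two'])  # the original frame is unchanged
```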

Axis indexes with duplicate values

from pandas import Series,DataFrame
import numpy as np
obj = Series(range(5),index=['a','a','b','b','c'])
obj
a    0
a    1
b    2
b    3
c    4
dtype: int64

The index.is_unique attribute tells you whether the index values are unique.

obj.index.is_unique
False
obj['a']
a    0
a    1
dtype: int64
df = DataFrame(np.random.randn(4,3),index=['a','a','b','b'])
df
          0         1         2
a -0.988105  0.662467  1.778395
a -1.021417  0.470186  0.754296
b  0.035519  0.598257 -1.034743
b  0.119780  2.094730  0.799680
df.ix['b']
          0         1         2
b  0.035519  0.598257 -1.034743
b  0.119780  2.094730  0.799680
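
With a non-unique index, the return type of a lookup depends on the label: a duplicated label yields a Series, while a unique label yields a scalar. The sketch below illustrates this on the same Series as above, and adds two possible ways to deal with duplicates (`Index.duplicated` and `groupby(level=0)`); both are illustrative additions rather than part of the original walkthrough.

```python
import pandas as pd

obj = pd.Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

dup = obj['a']       # label appears twice: returns a Series
single = obj['c']    # label appears once: returns a scalar

# Boolean array marking every repeat of an earlier label.
is_dup = obj.index.duplicated()

# Keep only the first entry for each label.
first_only = obj[~obj.index.duplicated(keep='first')]

# Or aggregate the entries that share a label.
totals = obj.groupby(level=0).sum()

print(type(dup).__name__, single)
print(first_only)
print(totals)
```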