pandas: the differences between ix, loc, and iloc, plus .at, .iat, and .get_value

1. loc: select rows by row label

1.1 loc[1] selects the row whose label is 1 (the index holds integers)
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = [0,1]
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df.loc[1])
'''
a    4
b    5
c    6
Name: 1, dtype: int64
'''
1.2 loc['d'] selects the row labeled 'd' (the index holds strings)
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.loc['d'])
'''
a    1
b    2
c    3
Name: d, dtype: int64
'''

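Indexing with a row number fails here, because loc looks up labels and this index has no label 1. (The message below comes from an older pandas release; recent releases raise a KeyError instead.)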

import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.loc[1])
'''
TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'>
with these indexers [1] of <class 'int'>

'''
1.3 Trying to select a column this way raises an error, because loc treats a single key as a row label
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df.loc['a'])
'''
KeyError: 'the label [a] is not in the [index]'
'''
1.4 loc can select multiple rows
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.loc['d':])
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
1.5 loc extended: select specific columns of a row
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.loc['d',['b','c']])
'''
b    2
c    3
Name: d, dtype: int64
'''
1.6 loc extended: select a column
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.loc[:,['c']])
'''
   c
d  3
e  6
'''
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
print(df)
print(df.loc[['viper', 'sidewinder'],['max_speed', 'shield']])
'''
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
            max_speed  shield
viper               4       5
sidewinder          7       8

'''

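The most direct way to get a column is of course df[column_label]. The loc form shown here is useful when the column labels are not known in advance, for example when they are held in a list built at run time.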

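Note that a label slice passed to loc includes both endpoints: on a DataFrame with an integer index, df.loc[1:3] returns the rows labeled 1, 2, and 3, unlike ordinary Python slicing, which excludes the stop value.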
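The difference is easy to check on a small frame with an integer index. The sketch below uses a hypothetical four-row frame (not one of the examples above) and passes the same slice to loc and to iloc.

```python
import pandas as pd

# Hypothetical frame with integer labels 0..3
df = pd.DataFrame({'a': [10, 20, 30, 40]}, index=[0, 1, 2, 3])

# loc slices by label and includes both endpoints
by_label = df.loc[1:3]

# iloc slices by position and excludes the stop, like a Python list
by_position = df.iloc[1:3]

print(by_label.index.tolist())     # [1, 2, 3]
print(by_position.index.tolist())  # [1, 2]
```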

2. iloc: select rows by row number (integer position)

2.1 Pass the number of the row you want
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.iloc[1])
'''
a    4
b    5
c    6
Name: e, dtype: int64
'''
2.2 Indexing by row label raises an error
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.iloc['e'])
'''
TypeError: Cannot index by location index with a non-integer key
'''
2.3 Row numbers can likewise select multiple rows
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.iloc[0:])
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
2.4 Selecting columns with iloc
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.iloc[:,[1]])
'''
   b
d  2
e  5
'''
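
When the index holds integers that are not simply 0, 1, 2, ..., loc and iloc can return different rows for the same key. A minimal sketch, assuming a hypothetical frame whose labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match the row positions
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=[1, 0],
                  columns=['a', 'b', 'c'])

row_by_label = df.loc[1]      # the row labeled 1, which is the first row
row_by_position = df.iloc[1]  # the second row, whose label is 0

print(row_by_label.tolist())     # [1, 2, 3]
print(row_by_position.tolist())  # [4, 5, 6]
```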

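3. ix: mixed indexing that combines the two above

Note: ix was deprecated in pandas 0.20.0 and removed in 1.0.0, so the examples in this section run only on older releases. On current releases use loc or iloc.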

3.1 Indexing by row number
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.ix[1])
'''
a    4
b    5
c    6
Name: e, dtype: int64
'''
3.2 Indexing by row label
import pandas as pd
data = [[1,2,3],[4,5,6]]
index = ['d','e']
columns=['a','b','c']
df = pd.DataFrame(data=data, index=index, columns=columns)
print(df)
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
print(df.ix['e'])
'''
a    4
b    5
c    6
Name: e, dtype: int64
'''
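
On pandas releases that no longer provide ix, the same mixed lookup (a row by position, a column by label) can be written with iloc or loc by converting one of the two keys. The following is one possible sketch, not the only way to do it; it reuses the frame from the examples above.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'],
                  columns=['a', 'b', 'c'])

# Row by position, column by label: translate the column label to a position ...
value_1 = df.iloc[1, df.columns.get_loc('b')]

# ... or translate the row position to a label and stay with loc
value_2 = df.loc[df.index[1], 'b']

print(value_1, value_2)  # 5 5
```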

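[ ], that is __getitem__(), is column-first, the opposite of the row-first indexers above, and it cannot do multi-dimensional indexing. It also accepts a slice object (of labels or of integers) or a boolean array, both of which select rows.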

  • Column label: df['A']
  • List of column labels: df[['A', 'B']]
  • Slice object: df[:3], df['a':'c']
  • Boolean array: df[df.A > 0.5]
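
The four forms in the list above can be tried on one small frame. This is a sketch on hypothetical data; only the names A and B and the row labels a to c are taken from the list.

```python
import pandas as pd

# Hypothetical frame; column and row labels mirror the list above
df = pd.DataFrame({'A': [0.1, 0.7, 0.9, 0.3],
                   'B': [1, 2, 3, 4]},
                  index=['a', 'b', 'c', 'd'])

one_column = df['A']           # a Series
two_columns = df[['A', 'B']]   # a DataFrame
first_three = df[:3]           # rows at positions 0, 1, 2
label_range = df['a':'c']      # rows labeled a through c, stop included
filtered = df[df.A > 0.5]      # rows where A exceeds 0.5

print(first_three.index.tolist())  # ['a', 'b', 'c']
print(label_range.index.tolist())  # ['a', 'b', 'c']
print(filtered.index.tolist())     # ['b', 'c']
```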

Slicing with [ ]
Square brackets can slice a DataFrame, somewhat like slicing a Python list. Depending on the key, they select rows, columns, or a block of both.

# Row selection
In [7]: data[1:5]
Out[7]: 
       fecha  rnd_1  rnd_2  rnd_3
1 2012-04-11      1     16      3
2 2012-04-12      7      6      1
3 2012-04-13      2     16      7
4 2012-04-14      4     17      7

# Column selection
In [10]: data[['rnd_1', 'rnd_3']]
Out[10]: 
     rnd_1  rnd_3
0        8     12
1        1      3
2        7      1
3        2      7
4        4      7
5       12      8


# Block selection
In [11]: data[:7][['rnd_1', 'rnd_2']]
Out[11]: 
   rnd_1  rnd_2
0      8     17
1      1     16
2      7      6
3      2     16
4      4     17
5     12     19
6      2      7

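When selecting multiple columns, however, you cannot write a range the way 1:5 is written for rows: a slice inside a list literal is not valid Python.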

In [12]: data[['rnd_1':'rnd_3']]
  File "<ipython-input-13-6291b6a83eb0>", line 1
    data[['rnd_1':'rnd_3']]
                 ^
SyntaxError: invalid syntax
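
A contiguous range of columns can still be selected by passing the slice to loc (by label) or iloc (by position) instead of placing it inside a list. A minimal sketch on a small hypothetical frame that reuses the column names from the session above:

```python
import pandas as pd

# Hypothetical data; only the column names follow the session above
data = pd.DataFrame({'rnd_1': [8, 1, 7],
                     'rnd_2': [17, 16, 6],
                     'rnd_3': [12, 3, 1]})

# Label slice over columns: the stop label is included
by_label = data.loc[:, 'rnd_1':'rnd_3']

# Position slice over columns: the stop position is excluded
by_position = data.iloc[:, 0:2]

print(by_label.columns.tolist())     # ['rnd_1', 'rnd_2', 'rnd_3']
print(by_position.columns.tolist())  # ['rnd_1', 'rnd_2']
```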
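.at and .iat are similar to loc and iloc, but each returns a single value: like loc, .at looks up only by labels; like iloc, .iat looks up by row number and column number.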
df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                   columns=['A', 'B', 'C'])
print(df)
'''
    A   B   C
0   0   2   3
1   0   4   1
2  10  20  30
'''
print(df.at[2, 'B'])
#20

print(df.at[2, 2])
#ValueError: At based indexing on an non-integer index can only have non-integer indexers

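# get_value was deprecated in pandas 0.21.0 and removed in 1.0.0; on current releases use .at instead
print(df.get_value(2,'B'))
#20

print(df.get_value(2,2))
# fails: get_value looks up by label only, and there is no column labeled 2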
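
The example above shows only the label-based accessors. For completeness, here is a short sketch of the positional counterpart .iat on the same frame, next to .at:

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  columns=['A', 'B', 'C'])

by_label = df.at[2, 'B']    # row label 2, column label 'B'
by_position = df.iat[2, 1]  # third row, second column

print(by_label, by_position)  # 20 20

# Both accessors also support assigning a single value
df.iat[0, 0] = 99
print(df.at[0, 'A'])  # 99
```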