NumPy indexing and slicing
Indexing
In [72]: arr = np.array([[[1,1,1],[2,2,2]],[[3,3,3],[4,4,4]]])

In [73]: arr
Out[73]:
array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])

In [74]: arr.ndim
Out[74]: 3

In [75]: arr.shape
Out[75]: (2, 2, 3)

In [76]: arr[0] # returns an array with one fewer dimension
Out[76]:
array([[1, 1, 1],
       [2, 2, 2]])

In [77]: arr[0,0] # returns a 1-D array
Out[77]: array([1, 1, 1])
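The transcript above suggests a simple rule: each integer index removes one axis from the result. A minimal sketch to check that rule, rebuilding the same `arr` (the names `first`, `row` and `scalar` are only illustrative):

```python
import numpy as np

arr = np.array([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])

first = arr[0]          # one integer index: 3-D -> 2-D
row = arr[0, 0]         # two integer indices: 3-D -> 1-D
scalar = arr[0, 0, 0]   # three integer indices: a single element

# arr[0][0] indexes in two steps and selects the same data as arr[0, 0]
same = np.array_equal(arr[0][0], arr[0, 0])

print(arr.shape, first.shape, row.shape, np.ndim(scalar), same)
```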
Slicing
In [78]: arr[:,:,:2]
Out[78]:
array([[[1, 1],
        [2, 2]],

       [[3, 3],
        [4, 4]]])
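Unlike an integer index, a slice keeps its axis in the result, even when it selects a single element along that axis. A small sketch on the same `arr` (the names `sliced` and `first_block` are illustrative):

```python
import numpy as np

arr = np.array([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])

sliced = arr[:, :, :2]   # keep all three axes, shorten the last one to 2
first_block = arr[:1]    # a slice of length 1 still keeps the axis

print(sliced.shape)       # (2, 2, 2)
print(first_block.shape)  # (1, 2, 3)
print(arr[0].shape)       # (2, 3): the integer index dropped the axis
```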
Combining indexing and slicing
For reference, arr is:
array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])
In [79]: arr[0,1,:2]
Out[79]: array([2, 2])
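When integers and slices are mixed, the two rules combine: integer positions drop their axis and slices keep theirs. The sketch below compares the mixed form with an all-slice form that selects the same elements (the names `mixed` and `kept` are illustrative):

```python
import numpy as np

arr = np.array([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])

mixed = arr[0, 1, :2]      # two integers drop two axes -> 1-D result
kept = arr[0:1, 1:2, :2]   # slices only -> still 3-D, same elements

print(mixed, mixed.shape)
print(kept.shape)
```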
pandas indexing and slicing
I. Series indexing
In [8]: obj = pd.Series(np.arange(4),index=['a','b','c','d'])

In [9]: obj
Out[9]:
a    0
b    1
c    2
d    3
dtype: int64
1) Indexing with the index
In [10]: obj['b']
Out[10]: 1

In [11]: obj[1]
Out[11]: 1
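Both lookups above go through `[]`, which has to guess whether the key is a label or a position. As far as I know, recent pandas releases (2.1 and later) deprecate treating an integer key in `[]` as a position when the index holds non-integer labels, so the explicit accessors are the safer spelling. A minimal sketch (the names `by_label` and `by_position` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

by_label = obj.loc['b']     # always interpreted as an index label
by_position = obj.iloc[1]   # always interpreted as an integer position

print(by_label, by_position)
```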
2) Slicing
In [12]: obj['b':'d'] # the end label is included
Out[12]:
b    1
c    2
d    3
dtype: int64

In [13]: obj[1:3]
Out[13]:
b    1
c    2
dtype: int64
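The difference between the two slices is worth spelling out: a label slice includes its end label, while a positional slice excludes its end position, as in plain Python. A sketch using the explicit accessors (the names `label_slice` and `pos_slice` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

label_slice = obj.loc['b':'d']   # labels b, c, d: end label included
pos_slice = obj.iloc[1:3]        # positions 1, 2: end position excluded

print(list(label_slice.index))
print(list(pos_slice.index))
```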
3) Indexing with a list of index values
In [14]: obj[['b','d']]
Out[14]:
b    1
d    3
dtype: int64

In [15]: obj[[1,3]]
Out[15]:
b    1
d    3
dtype: int64
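A list key returns the elements in the order the list names them, so it can also reorder the result. A short sketch with the explicit accessors (the names `picked` and `picked_pos` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

picked = obj.loc[['d', 'b']]    # labels, in the order requested
picked_pos = obj.iloc[[3, 1]]   # the same elements by position

print(picked)
```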
II. DataFrame indexing
In [20]: obj = pd.DataFrame(np.arange(16).reshape((4,4)),index=['a','b','c','d'],columns=['a1','b2','c3','d4'])

In [21]: obj
Out[21]:
   a1  b2  c3  d4
a   0   1   2   3
b   4   5   6   7
c   8   9  10  11
d  12  13  14  15
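1) Selecting columns
obj['a'] no longer works here: on a DataFrame, [] with a single label selects a column, and 'a' is a row label, not a column name.
In [32]: obj['b2']
Out[32]:
a     1
b     5
c     9
d    13
Name: b2, dtype: int64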
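To make the point about row labels concrete, the sketch below rebuilds the same DataFrame and shows that `obj['a']` raises `KeyError`, while column labels work and `loc` reaches the row (the names `row_lookup_failed`, `col`, `two_cols` and `row_a` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.DataFrame(np.arange(16).reshape((4, 4)),
                   index=['a', 'b', 'c', 'd'],
                   columns=['a1', 'b2', 'c3', 'd4'])

try:
    obj['a']                 # 'a' is a row label, not a column
    row_lookup_failed = False
except KeyError:
    row_lookup_failed = True

col = obj['b2']              # one column -> Series
two_cols = obj[['b2', 'a1']] # list of columns -> DataFrame
row_a = obj.loc['a']         # row 'a' via loc

print(row_lookup_failed, two_cols.shape)
```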
2) Slicing rows
In [36]: obj[:3]
Out[36]:
   a1  b2  c3  d4
a   0   1   2   3
b   4   5   6   7
c   8   9  10  11

In [37]: obj[obj['c3']>6] # select rows with a boolean array
Out[37]:
   a1  b2  c3  d4
c   8   9  10  11
d  12  13  14  15
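The boolean selection works in two steps: the comparison builds a boolean Series aligned with the rows, and `[]` keeps the rows where it is True. A sketch that separates the steps and combines two conditions (the second condition and the names `mask`, `filtered`, `combined` are assumptions added for illustration):

```python
import numpy as np
import pandas as pd

obj = pd.DataFrame(np.arange(16).reshape((4, 4)),
                   index=['a', 'b', 'c', 'd'],
                   columns=['a1', 'b2', 'c3', 'd4'])

mask = obj['c3'] > 6    # boolean Series, one value per row
filtered = obj[mask]    # keeps rows c and d

# Conditions combine with & and |; each needs its own parentheses
combined = obj[(obj['c3'] > 6) & (obj['a1'] < 12)]

print(mask.tolist())
print(list(combined.index))
```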
3) Selecting a column and then rows
In [38]: obj['a1']['c']
Out[38]: 8

In [39]: obj['a1'][:2]
Out[39]:
a    0
b    4
Name: a1, dtype: int64
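The chained form first extracts a column as a Series and then indexes that Series. The same values can be read in a single step with `loc`, which takes the row selector first and the column second. A sketch comparing the two (the names `chained`, `single` and `first_two` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.DataFrame(np.arange(16).reshape((4, 4)),
                   index=['a', 'b', 'c', 'd'],
                   columns=['a1', 'b2', 'c3', 'd4'])

chained = obj['a1']['c']            # column first, then row
single = obj.loc['c', 'a1']         # row and column in one call

first_two = obj.loc['a':'b', 'a1']  # same rows as obj['a1'][:2]

print(chained, single)
print(first_two.tolist())
```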
4) Selecting data with loc and iloc
Use axis labels (loc) or integer positions (iloc) to select a subset of rows and columns from a DataFrame.
Integer positions (iloc):
In [55]: obj
Out[55]:
   a1  b2  c3  d4
a   0   1   2   3
b   4   5   6   7
c   8   9  10  11
d  12  13  14  15

In [53]: obj.iloc[2,[2,0,1]] # reorder the columns
Out[53]:
c3    10
a1     8
b2     9
Name: c, dtype: int64

In [54]: obj.iloc[2] # select a row
Out[54]:
a1     8
b2     9
c3    10
d4    11
Name: c, dtype: int64
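`iloc` also accepts slices on both axes, with the usual end-exclusive behaviour. A small sketch on the same DataFrame (the names `reordered` and `block` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.DataFrame(np.arange(16).reshape((4, 4)),
                   index=['a', 'b', 'c', 'd'],
                   columns=['a1', 'b2', 'c3', 'd4'])

reordered = obj.iloc[2, [2, 0, 1]]   # row 2, columns in a new order
block = obj.iloc[:2, 1:3]            # rows 0-1, columns 1-2

print(list(reordered.index))
print(block)
```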
Axis labels (loc):
In [57]: obj.loc['a',['b2','a1']]
Out[57]:
b2    1
a1    0
Name: a, dtype: int64

In [58]: obj.loc['a':'c',['b2','a1']]
Out[58]:
   b2  a1
a   1   0
b   5   4
c   9   8
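Note that the `loc` slice 'a':'c' includes row c, so the equivalent `iloc` slice has to stop at 3 rather than 2. The sketch below checks that the two selections match and also passes a boolean mask to `loc` (the names `by_label`, `by_pos` and `masked` are illustrative):

```python
import numpy as np
import pandas as pd

obj = pd.DataFrame(np.arange(16).reshape((4, 4)),
                   index=['a', 'b', 'c', 'd'],
                   columns=['a1', 'b2', 'c3', 'd4'])

by_label = obj.loc['a':'c', ['b2', 'a1']]   # end label 'c' included
by_pos = obj.iloc[0:3, [1, 0]]              # end position 3 excluded

# loc also accepts a boolean mask as the row selector
masked = obj.loc[obj['c3'] > 6, ['a1', 'd4']]

print(by_label.equals(by_pos))
print(masked)
```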