>>> import numpy as np
In NumPy, create an array with 3 rows and 4 columns:
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
b1, b2, and b3 are boolean arrays that will be used to index the array:
>>> b1 = np.array([False,True,True]) # two True values
>>> b2 = np.array([True,False,True,False]) # two True values
>>> b3 = np.array([True,False,True,True]) # three True values
The result below really surprised me at the time: why isn't it a 2x2 array?
>>> a[b1,b2]
array([ 4, 10])
The next result left me even more speechless: it actually raises an error!
>>> a[b1,b3]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-5-84ec60c10a01> in <module>()
----> 1 a[b1,b3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
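[Hypothesis] Boolean indices are first converted to integer indices (the positions of the True values). The integer index arrays are then broadcast against each other and paired element by element, so two arrays of lengths 2 and 3 cannot be combined. The code above is therefore equivalent to this: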
>>> c1 = np.array([1,2])
>>> c2 = np.array([0,2])
>>> c3 = np.array([0,2,3])
>>> a[c1,c2]
array([ 4, 10])
>>> a[c1,c3]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-10-cb6fcedc9996> in <module>()
----> 1 a[c1,c3]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
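The hypothesis can be checked directly. The following is a minimal sketch, assuming np.flatnonzero is an acceptable way to get the positions of the True values; the names rows, cols, same, and paired are illustrative and not part of the original session.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# Positions of the True entries in each boolean array.
rows = np.flatnonzero(b1)
cols = np.flatnonzero(b2)

# Indexing with the boolean arrays and with the converted
# integer arrays should select exactly the same elements.
same = np.array_equal(a[b1, b2], a[rows, cols])
print(rows, cols, same)

# Integer index arrays are paired element by element,
# i.e. (rows[0], cols[0]), (rows[1], cols[1]), ...
paired = np.array([a[r, c] for r, c in zip(rows, cols)])
print(paired)
```

If the hypothesis holds, paired contains the same values as a[b1,b2], one element per (row, column) pair rather than one per row/column intersection.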
For comparison, the same thing in R:
> a <- matrix(seq(0,11), 3, 4, byrow=T)
> a
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    4    5    6    7
[3,]    8    9   10   11
> b1 <- c(F,T,T)
> b2 <- c(T,F,T,F)
> b3 <- c(T,F,T,T)
In R, logical indexing dutifully returns the values at the intersections of the selected rows and columns:
> a[b1,b2]
     [,1] [,2]
[1,]    4    6
[2,]    8   10
> a[b1,b3]
     [,1] [,2] [,3]
[1,]    4    6    7
[2,]    8   10   11
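Even with numeric indices, R returns the values at the row/column intersections (R indices are 1-based, so these select the same rows and columns as the 0-based c1, c2, c3 in the NumPy code):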
> c1 <- c(2,3)
> c2 <- c(1,3)
> c3 <- c(1,3,4)
> a[c1,c2]
     [,1] [,2]
[1,]    4    6
[2,]    8   10
> a[c1,c3]
     [,1] [,2] [,3]
[1,]    4    6    7
[2,]    8   10   11
So the question is: how do you get R's behavior in NumPy?
For that, use the np.ix_ function, like this:
>>> a[np.ix_(b1,b2)]
array([[ 4,  6],
       [ 8, 10]])
>>> a[np.ix_(b1,b3)]
array([[ 4,  6,  7],
       [ 8, 10, 11]])
For integer indices, np.ix_ works the same way:
>>> a[np.ix_(c1,c2)]
array([[ 4,  6],
       [ 8, 10]])
>>> a[np.ix_(c1,c3)]
array([[ 4,  6,  7],
       [ 8, 10, 11]])
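To see why np.ix_ works, it may help to look at what it returns. The sketch below is only an illustration under the assumption that the arrays are the ones defined earlier; the names row_idx, col_idx, via_ix, via_broadcast, and via_two_steps are made up for this example. It also shows two other spellings that should give the same row/column intersection.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b3 = np.array([True, False, True, True])

# np.ix_ turns each index into an integer array shaped so that the
# arrays broadcast into a grid: a column of rows and a row of columns.
row_idx, col_idx = np.ix_(b1, b3)
print(row_idx.shape, col_idx.shape)

via_ix = a[row_idx, col_idx]

# The same grid built by hand: add a trailing axis to the row positions.
via_broadcast = a[np.flatnonzero(b1)[:, None], np.flatnonzero(b3)]

# Or index in two steps: select rows first, then columns.
via_two_steps = a[b1][:, b3]

print(via_ix)
```

The two-step form reads more like the R code, but it builds an intermediate array, so np.ix_ is probably the more direct choice for larger data.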