Multidimensional indexing (integer and boolean indexing): numpy vs. R

Article link: https://blog.csdn.net/lyghe/article/details/80403473

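This article examines how numpy and R differ in multidimensional indexing, specifically integer indexing and boolean (logical) indexing. In numpy, indexing with several boolean arrays at once does not return the values at the row/column intersections, whereas R returns them as expected. The examples show how np.ix_ reproduces R's behavior in numpy, so that the values at the intersections of the selected rows and columns are returned.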


>>> import numpy as np

In numpy, create an array with 3 rows and 4 columns:

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

b1, b2, and b3 are boolean arrays that will be used as indices into the array:

>>> b1 = np.array([False,True,True]) # two True values
>>> b2 = np.array([True,False,True,False]) # two True values
>>> b3 = np.array([True,False,True,True]) # three True values

The result below came as a shock to me: why is it not a 2x2 array?

>>> a[b1,b2]
array([ 4, 10])

The next result is even more baffling: it raises an error!

>>> a[b1,b3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-84ec60c10a01> in <module>()
----> 1 a[b1,b3]

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
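
The two shapes in the error message, (2,) and (3,), are the numbers of True values in b1 and b3. A minimal sketch that checks this (the names n_rows, n_cols and raised are illustrative and not part of the original session):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b3 = np.array([True, False, True, True])

# Count the True values in each mask.
n_rows = int(b1.sum())
n_cols = int(b3.sum())
print(n_rows, n_cols)  # 2 3

# Indexing with both masks at once fails because 2 != 3.
try:
    a[b1, b3]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```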

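[Conjecture] A boolean index is first converted to an integer index holding the positions of its True values. The resulting integer arrays are then paired element by element rather than combined as rows and columns, which is why a[b1,b2] yields only a[1,0] and a[2,2], and why arrays of different lengths fail. The code above is therefore equivalent to this: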

>>> c1 = np.array([1,2])
>>> c2 = np.array([0,2])
>>> c3 = np.array([0,2,3])

>>> a[c1,c2]
array([ 4, 10])

>>> a[c1,c3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-10-cb6fcedc9996> in <module>()
----> 1 a[c1,c3]

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
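
The conjecture can be checked directly. The sketch below uses np.flatnonzero to turn each mask into the positions of its True values and then pairs those positions by hand; the names rows, cols and paired are introduced here only for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# Positions of the True values in each mask.
rows = np.flatnonzero(b1)
cols = np.flatnonzero(b2)
print(rows, cols)  # [1 2] [0 2]

# Pair the positions element by element: (1, 0) and (2, 2).
paired = np.array([a[r, c] for r, c in zip(rows, cols)])

# All three expressions select the same two elements.
print(a[b1, b2], a[rows, cols], paired)
```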

For comparison, the same steps in R:

> a <- matrix(seq(0,11), 3, 4, byrow=TRUE)
> a
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    4    5    6    7
[3,]    8    9   10   11

> b1 <- c(FALSE,TRUE,TRUE)
> b2 <- c(TRUE,FALSE,TRUE,FALSE)
> b3 <- c(TRUE,FALSE,TRUE,TRUE)

In R, boolean (logical) indexing dutifully returns the values at the row/column intersections:

> a[b1,b2]
     [,1] [,2]
[1,]    4    6
[2,]    8   10

> a[b1,b3]
     [,1] [,2] [,3]
[1,]    4    6    7
[2,]    8   10   11

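Integer indexing also returns the values at the row/column intersections. Note that R indices are 1-based, so c(2,3) here corresponds to [1,2] in numpy: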

> c1 <- c(2,3)
> c2 <- c(1,3)
> c3 <- c(1,3,4)

> a[c1,c2]
     [,1] [,2]
[1,]    4    6
[2,]    8   10

> a[c1,c3]
     [,1] [,2] [,3]
[1,]    4    6    7
[2,]    8   10   11

So the question is: how can R's behavior be reproduced in numpy?

This is where the np.ix_ function comes in:

>>> a[np.ix_(b1,b2)]
array([[ 4,  6],
       [ 8, 10]])

>>> a[np.ix_(b1,b3)]
array([[ 4,  6,  7],
       [ 8, 10, 11]])

np.ix_ has the same effect with integer indices:

>>> a[np.ix_(c1,c2)]
array([[ 4,  6],
       [ 8, 10]])

>>> a[np.ix_(c1,c3)]
array([[ 4,  6,  7],
       [ 8, 10, 11]])
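
For reference, np.ix_ works by returning index arrays shaped so that they broadcast against each other into a grid. The sketch below inspects those shapes and shows two other ways to get the same result, manual broadcasting and chained indexing; the names row_idx, col_idx, manual and chained are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b3 = np.array([True, False, True, True])

# np.ix_ returns one index array per axis: a column vector and a row vector.
row_idx, col_idx = np.ix_(b1, b3)
print(row_idx.shape, col_idx.shape)  # (2, 1) (1, 3)

# The same grid built by hand: a (2, 1) array broadcast against a (3,) array.
rows = np.flatnonzero(b1)
cols = np.flatnonzero(b3)
manual = a[rows[:, None], cols]

# Chained indexing: select the rows first, then the columns.
chained = a[b1][:, b3]

print(manual)
print(chained)
```

Chained indexing is the shortest to write but builds an intermediate array, so it cannot be used to assign into the selected block of a; np.ix_ and the broadcast form index a in a single step.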
