NumPy入门(二)

目录

访问和删除 ndarray 中的元素及向其中插入元素

通过索引访问或修改 ndarray 中的元素

向 ndarray 中添加元素及删除其中的元素

向 ndarray 中插入值

将 ndarray 上下堆叠起来,或者左右堆叠

ndarray 切片

np.copy() 函数

一个 ndarray 对另一个 ndarray 进行切片

NumPy 内置函数

布尔型索引、集合运算和排序

布尔型索引

集合运算

排序


访问和删除 ndarray 中的元素及向其中插入元素

你已经知道如何创建各种 ndarray,现在将学习 NumPy 使我们如何有效地操纵 ndarray 中的数据。NumPy ndarray 是可变的,意味着 ndarray 中的元素在 ndarray 创建之后可以更改。NumPy ndarray 还可以切片,因此可以通过多种方式拆分 ndarray。例如,我们可以从 ndarray 中获取想要的任何子集。通常,在机器学习中,你需要使用切片拆分数据,例如将数据集拆分为训练集、交叉验证集和测试集。

通过索引访问或修改 ndarray 中的元素

1、我们首先将了解如何通过索引访问或修改 ndarray 中的元素。可以在方括号 [ ] 中添加索引来访问元素。在 NumPy 中,你可以使用正索引和负索引访问 ndarray 中的元素。正索引表示从数组的开头访问元素,负索引表示从数组的末尾访问元素。我们来看看如何访问秩为 1 的 ndarray 中的元素:

# We create a rank 1 ndarray that contains integers from 1 to 5
x = np.array([1, 2, 3, 4, 5])

# We print x
print()
print('x = ', x)
print()

# Let's access some elements with positive indices
print('This is First Element in x:', x[0]) 
print('This is Second Element in x:', x[1])
print('This is Fifth (Last) Element in x:', x[4])
print()

# Let's access the same elements with negative indices
print('This is First Element in x:', x[-5])
print('This is Second Element in x:', x[-4])
print('This is Fifth (Last) Element in x:', x[-1])

x = [1 2 3 4 5]

This is First Element in x: 1
This is Second Element in x: 2
This is Fifth (Last) Element in x: 5

This is First Element in x: 1
This is Second Element in x: 2
This is Fifth (Last) Element in x: 5

注意,要访问 ndarray 中的第一个元素,我们需要使用索引 0,而不是 1。此外注意,可以同时使用正索引和负索引访问同一个元素。正如之前提到的,正索引用于从数组的开头访问元素,负索引用于从数组的末尾访问元素

2、现在我们看看如何更改秩为 1 的 ndarray 中的元素。方法是访问要更改的元素,然后使用 = 符号分配新的值:

# We create a rank 1 ndarray that contains integers from 1 to 5
x = np.array([1, 2, 3, 4, 5])

# We print the original x
print()
print('Original:\n x = ', x)
print()

# We change the fourth element in x from 4 to 20
x[3] = 20

# We print x after it was modified 
print('Modified:\n x = ', x)

Original: x = [1 2 3 4 5]

Modified: x = [ 1 2 3 20 5]

3、同样,我们可以访问和修改秩为 2 的 ndarray 中的特定元素。要访问秩为 2 的 ndarray 中的元素,我们需要提供两个索引,格式为 [row, column]。我们来看一些示例:

# We create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9
X = np.array([[1,2,3],[4,5,6],[7,8,9]])

# We print X
print()
print('X = \n', X)
print()

# Let's access some elements in X
print('This is (0,0) Element in X:', X[0,0])
print('This is (0,1) Element in X:', X[0,1])
print('This is (2,2) Element in X:', X[2,2])

X =
[[1 2 3]
 [4 5 6]
 [7 8 9]]

This is (0,0) Element in X: 1
This is (0,1) Element in X: 2
This is (2,2) Element in X: 9

注意,索引 [0, 0] 是指第一行第一列的元素。

4、可以像针对秩为 1 的 ndarray 一样修改秩为 2 的 ndarray 中的元素。我们来看一个示例:

# We create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9
X = np.array([[1,2,3],[4,5,6],[7,8,9]])

# We print the original x
print()
print('Original:\n X = \n', X)
print()

# We change the (0,0) element in X from 1 to 20
X[0,0] = 20

# We print X after it was modified 
print('Modified:\n X = \n', X)

Original:
X =
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Modified:
X =
[[20 2 3]
 [ 4 5 6]
 [ 7 8 9]]

向 ndarray 中添加元素及删除其中的元素

1、我们可以使用 np.delete(ndarray, elements, axis) 函数删除元素。此函数会沿着指定的从给定 ndarray 中删除给定的元素列表。对于秩为 1 的 ndarray,不需要使用关键字 axis。对于秩为 2 的 ndarray,axis = 0 表示选择axis = 1 表示选择。我们来看一些示例:

# We create a rank 1 ndarray 
x = np.array([1, 2, 3, 4, 5])

# We create a rank 2 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# We print x
print()
print('Original x = ', x)

# We delete the first and last element of x
x = np.delete(x, [0,4])

# We print x with the first and last element deleted
print()
print('Modified x = ', x)

# We print Y
print()
print('Original Y = \n', Y)

# We delete the first row of y
w = np.delete(Y, 0, axis=0)

# We delete the first and last column of y
v = np.delete(Y, [0,2], axis=1)

# We print w
print()
print('w = \n', w)

# We print v
print()
print('v = \n', v)

Original x = [1 2 3 4 5]

Modified x = [2 3 4]

Original Y =
[[1 2 3]
 [4 5 6]
 [7 8 9]]

w =
[[4 5 6]
 [7 8 9]]

v =
[[2]
 [5]
 [8]]

2、现在我们来看看如何向 ndarray 中附加值。我们可以使用 np.append(ndarray, elements, axis)函数向 ndarray 中附加值。该函数会将给定的元素列表沿着指定的附加到 ndarray 中。我们来看一些示例:

# We create a rank 1 ndarray 
x = np.array([1, 2, 3, 4, 5])

# We create a rank 2 ndarray 
Y = np.array([[1,2,3],[4,5,6]])

# We print x
print()
print('Original x = ', x)

# We append the integer 6 to x
x = np.append(x, 6)

# We print x
print()
print('x = ', x)

# We append the integer 7 and 8 to x
x = np.append(x, [7,8])

# We print x
print()
print('x = ', x)

# We print Y
print()
print('Original Y = \n', Y)

# We append a new row containing 7,8,9 to y
v = np.append(Y, [[7,8,9]], axis=0)

# We append a new column containing 9 and 10 to y
q = np.append(Y,[[9],[10]], axis=1)

# We print v
print()
print('v = \n', v)

# We print q
print()
print('q = \n', q)

Original x = [1 2 3 4 5]

x = [1 2 3 4 5 6]

x = [1 2 3 4 5 6 7 8]

Original Y =
[[1 2 3]
 [4 5 6]]

v =
[[1 2 3]
 [4 5 6]
 [7 8 9]]

q =
[[ 1 2 3 9]
 [ 4 5 6 10]]

注意,当我们将行或列附加到秩为 2 的 ndarray 中时,行或列的形状必须正确,以与秩为 2 的 ndarray 的形状相符。

向 ndarray 中插入值

我们可以使用 np.insert(ndarray, index, elements, axis) 函数向 ndarray 中插入值。此函数会将给定的元素列表沿着指定的插入到 ndarray 中,并放在给定的索引前面。我们来看一些示例:

# We create a rank 1 ndarray 
x = np.array([1, 2, 5, 6, 7])

# We create a rank 2 ndarray 
Y = np.array([[1,2,3],[7,8,9]])

# We print x
print()
print('Original x = ', x)

# We insert the integer 3 and 4 between 2 and 5 in x. 
x = np.insert(x,2,[3,4])

# We print x with the inserted elements
print()
print('x = ', x)

# We print Y
print()
print('Original Y = \n', Y)

# We insert a row between the first and last row of y
w = np.insert(Y,1,[4,5,6],axis=0)

# We insert a column full of 5s between the first and second column of y
v = np.insert(Y,1,5, axis=1)

# We print w
print()
print('w = \n', w)

# We print v
print()
print('v = \n', v)

Original x = [1 2 5 6 7]

x = [1 2 3 4 5 6 7]

Original Y =
[[1 2 3]
 [7 8 9]]

w =
[[1 2 3]
 [4 5 6]
 [7 8 9]]

v =
[[1 5 2 3]
 [7 5 8 9]]

将 ndarray 上下堆叠起来,或者左右堆叠

可以使用 np.vstack() 函数进行垂直堆叠,或使用 np.hstack() 函数进行水平堆叠。请务必注意,为了堆叠 ndarray,ndarray 的形状必须相符。我们来看一些示例:

# We create a rank 1 ndarray 
x = np.array([1,2])

# We create a rank 2 ndarray 
Y = np.array([[3,4],[5,6]])

# We print x
print()
print('x = ', x)

# We print Y
print()
print('Y = \n', Y)

# We stack x on top of Y
z = np.vstack((x,Y))

# We stack x on the right of Y. We need to reshape x in order to stack it on the right of Y. 
w = np.hstack((Y,x.reshape(2,1)))

# We print z
print()
print('z = \n', z)

# We print w
print()
print('w = \n', w)

x = [1 2]

Y =
[[3 4]
 [5 6]]

z =
[[1 2]
 [3 4]
 [5 6]]

w =
[[3 4 1]
 [5 6 2]]

ndarray 切片

正如之前提到的,我们除了能够一次访问一个元素之外,NumPy 还提供了访问 ndarray 子集的方式,称之为切片。切片方式是在方括号里用冒号 : 分隔起始和结束索引。通常,你将遇到三种类型的切片:

1. ndarray[start:end]
2. ndarray[start:]
3. ndarray[:end]

第一种方法用于选择在 start 和 end 索引之间的元素。第二种方法用于选择从 start 索引开始直到最后一个索引的所有元素。第三种方法用于选择从第一个索引开始直到 end 索引的所有元素。请注意,在第一种方法和第三种方法中,结束索引不包括在内。此外注意,因为 ndarray 可以是多维数组,在进行切片时,通常需要为数组的每个维度指定一个切片。

现在我们将查看一些示例,了解如何使用上述方法从秩为 2 的 ndarray 中选择不同的子集。

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# We print X
print()
print('X = \n', X)
print()

# We select all the elements that are in the 2nd through 4th rows and in the 3rd to 5th columns
Z = X[1:4,2:5]

# We print Z
print('Z = \n', Z)

# We can select the same elements as above using method 2
W = X[1:,2:5]

# We print W
print()
print('W = \n', W)

# We select all the elements that are in the 1st through 3rd rows and in the 3rd to 5th columns
Y = X[:3,2:5]

# We print Y
print()
print('Y = \n', Y)

# We select all the elements in the 3rd row
v = X[2,:]

# We print v
print()
print('v = ', v)

# We select all the elements in the 3rd column
q = X[:,2]

# We print q
print()
print('q = ', q)

# We select all the elements in the 3rd column but return a rank 2 ndarray
R = X[:,2:3]

# We print R
print()
print('R = \n', R)

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z =
[[ 7 8 9]
 [12 13 14]
 [17 18 19]]

W =
[[ 7 8 9]
 [12 13 14]
 [17 18 19]]

Y =
[[ 2 3 4]
 [ 7 8 9]
 [12 13 14]]

v = [10 11 12 13 14]

q = [ 2 7 12 17]

R =
[[ 2]
 [ 7]
 [12]
 [17]]

注意,当我们选择第 3 列中的所有元素,即上述变量 q,切片返回一个秩为 1 的 ndarray,而不是秩为 2 的 ndarray。但是,如果以稍微不同的方式切片X,即上述变量 R,实际上可以获得秩为 2 的 ndarray。

请务必注意,如果对 ndarray 进行切片并将结果保存到新的变量中,就像之前一样,数据不会复制到新的变量中。初学者对于这一点经常比较困惑。因此,我们将深入讲解这方面的知识。

在上述示例中,当我们进行赋值时,例如:

Z = X[1:4,2:5]

原始数组 X 的切片没有复制到变量 Z 中。X 和 Z 现在只是同一个 ndarray 的两个不同名称。我们提到,切片只是创建了原始数组的一个视图也就是说,如果对 Z 做出更改,也会更改 X 中的元素。我们来看一个示例:

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# We print X
print()
print('X = \n', X)
print()

# We select all the elements that are in the 2nd through 4th rows and in the 3rd to 4th columns
Z = X[1:4,2:5]

# We print Z
print()
print('Z = \n', Z)
print()

# We change the last element in Z to 555
Z[2,2] = 555

# We print X
print()
print('X = \n', X)
print()

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z =
[[ 7 8 9]
 [12 13 14]
 [17 18 19]]

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [ 10 11 12 13 14]
 [ 15 16 17 18 555]]

可以从上述示例中清晰地看出,如果对 Z 做出更改,X 也会更改。

np.copy() 函数

但是,如果我们想创建一个新的 ndarray,其中包含切片中的值的副本,需要使用 np.copy() 函数np.copy(ndarray) 函数会创建给定 ndarray 的一个副本。此函数还可以当做方法使用,就像之前使用 reshape 函数一样。我们来看看之前的相同示例,但是现在创建数组副本。我们将 copy 同时当做函数和方法。

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# We print X
print()
print('X = \n', X)
print()

# create a copy of the slice using the np.copy() function
Z = np.copy(X[1:4,2:5])

#  create a copy of the slice using the copy as a method
W = X[1:4,2:5].copy()

# We change the last element in Z to 555
Z[2,2] = 555

# We change the last element in W to 444
W[2,2] = 444

# We print X
print()
print('X = \n', X)

# We print Z
print()
print('Z = \n', Z)

# We print W
print()
print('W = \n', W)

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z =
[[ 7 8 9]
 [ 12 13 14]
 [ 17 18 555]]

W =
[[ 7 8 9]
 [ 12 13 14]
 [ 17 18 444]]

可以清晰地看出,通过使用 copy 命令,我们创建了完全相互独立的新 ndarray。

一个 ndarray 对另一个 ndarray 进行切片

 

通常,我们会使用一个 ndarray 对另一个 ndarray 进行切片、选择或更改另一个 ndarray 的元素。我们来看一些示例:

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# We create a rank 1 ndarray that will serve as indices to select elements from X
indices = np.array([1,3])

# We print X
print()
print('X = \n', X)
print()

# We print indices
print('indices = ', indices)
print()

# We use the indices ndarray to select the 2nd and 4th row of X
Y = X[indices,:]

# We use the indices ndarray to select the 2nd and 4th column of X
Z = X[:, indices]

# We print Y
print()
print('Y = \n', Y)

# We print Z
print()
print('Z = \n', Z)

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

indices = [1 3]

Y =
[[ 5 6 7 8 9]
 [15 16 17 18 19]]

Z =
[[ 1 3]
 [ 6 8]
 [11 13]
 [16 18]]

NumPy 内置函数

NumPy 还提供了从 ndarray 中选择特定元素的内置函数。例如,np.diag(ndarray, k=N) 函数会以 N 定义的对角线提取元素。默认情况下,k=0,表示主对角线。k > 0 的值用于选择在主对角线之上的对角线中的元素,k < 0 的值用于选择在主对角线之下的对角线中的元素。我们来看一个示例:

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(25).reshape(5, 5)

# We print X
print()
print('X = \n', X)
print()

# We print the elements in the main diagonal of X
print('z =', np.diag(X))
print()

# We print the elements above the main diagonal of X
print('y =', np.diag(X, k=1))
print()

# We print the elements below the main diagonal of X
print('w = ', np.diag(X, k=-1))

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

z = [ 0 6 12 18 24]

y = [ 1 7 13 19]

w = [ 5 11 17 23]

通常我们都会从 ndarray 中提取唯一的元素。我们可以使用 np.unique() 函数查找 ndarray 中的唯一元素。np.unique(ndarray) 函数会返回给定 ndarray 中的 唯一元素,如以下示例所示:

# Create 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])

# We print X
print()
print('X = \n', X)
print()

# We print the unique elements of X 
print('The unique elements in X are:',np.unique(X))

X =
[[1 2 3]
 [5 2 8]
 [1 2 3]]

The unique elements in X are: [1 2 3 5 8]

布尔型索引、集合运算和排序

布尔型索引

到目前为止,我们了解了如何使用索引进行切片以及选择 ndarray 元素。当我们知道要选择的元素的确切索引时,这些方法很有用。但是,在很多情况下,我们不知道要选择的元素的索引。例如,假设有一个 10,000 x 10,000 ndarray,其中包含从 1 到 15,000 的随机整数,我们只想选择小于 20 的整数。这时候就要用到布尔型索引,对于布尔型索引,我们将使用逻辑参数(而不是确切的索引)选择元素。我们来看一些示例:

# We create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)

# We print X
print()
print('Original X = \n', X)
print()

# We use Boolean indexing to select elements in X:
print('The elements in X that are greater than 10:', X[X > 10])
print('The elements in X that lees than or equal to 7:', X[X <= 7])
print('The elements in X that are between 10 and 17:', X[(X > 10) & (X < 17)])

# We use Boolean indexing to assign the elements that are between 10 and 17 the value of -1
X[(X > 10) & (X < 17)] = -1

# We print X
print()
print('X = \n', X)
print()

Original X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

The elements in X that are greater than 10: [11 12 13 14 15 16 17 18 19 20 21 22 23 24]
The elements in X that lees than or equal to 7: [0 1 2 3 4 5 6 7]
The elements in X that are between 10 and 17: [11 12 13 14 15 16]

X =
[[ 0 1 2 3 4]
 [ 5 6 7 8 9]
 [10 -1 -1 -1 -1]
 [-1 -1 17 18 19]
 [20 21 22 23 24]]

集合运算

除了布尔型索引之外,NumPy 还允许进行集合运算。可以用来比较 ndarray,例如查找两个 ndarray 中的相同元素。我们来看一些示例:

# We create a rank 1 ndarray
x = np.array([1,2,3,4,5])

# We create a rank 1 ndarray
y = np.array([6,7,2,8,4])

# We print x
print()
print('x = ', x)

# We print y
print()
print('y = ', y)

# We use set operations to compare x and y:
print()
print('The elements that are both in x and y:', np.intersect1d(x,y))
print('The elements that are in x that are not in y:', np.setdiff1d(x,y))
print('All the elements of x and y:',np.union1d(x,y))

x = [1 2 3 4 5]

y = [6 7 2 8 4]

The elements that are both in x and y: [2 4]
The elements that are in x that are not in y: [1 3 5]
All the elements of x and y: [1 2 3 4 5 6 7 8]

排序

我们还可以在 NumPy 中对 ndarray 进行排序。我们将了解如何使用 np.sort() 函数以不同的方式对秩为 1 和 2 的 ndarray 进行排序。和我们之前看到的其他函数一样,sort 函数也可以当做方法使用。但是,对于此函数来说,数据在内存中的存储方式有很大变化。当 np.sort() 当做函数使用时,它不会对ndarray进行就地排序,即不更改被排序的原始 ndarray。但是,如果将 sort 当做方法,ndarray.sort() 会就地排序 ndarray,即原始数组会变成排序后的数组。我们来看一些示例:

# We create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))

# We print x
print()
print('Original x = ', x)

# We sort x and print the sorted array using sort as a function.
print()
print('Sorted x (out of place):', np.sort(x))

# When we sort out of place the original array remains intact. To see this we print x again
print()
print('x after sorting:', x)

Original x = [9 6 4 4 9 4 8 4 4 7]

Sorted x (out of place): [4 4 4 4 4 6 7 8 9 9]

x after sorting: [9 6 4 4 9 4 8 4 4 7]

注意,np.sort() 会对数组进行排序,但是如果被排序的 ndarray 具有重复的值,np.sort() 将在排好序的数组中保留这些值。但是,我们可以根据需要,同时使用 sort 函数和 unique 函数仅对 x 中的唯一元素进行排序。我们来看看如何对上述 x 中的唯一元素进行排序:

# We sort x but only keep the unique elements in x
print(np.sort(np.unique(x)))

[4 6 7 8 9]

最后,我们来看看如何将 sort 当做方法,原地对 ndarray 进行排序:

# We create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))

# We print x
print()
print('Original x = ', x)

# We sort x and print the sorted array using sort as a method.
x.sort()

# When we sort in place the original array is changed to the sorted array. To see this we print x again
print()
print('x after sorting:', x)

Original x = [9 9 8 1 1 4 3 7 2 8]

x after sorting: [1 1 2 3 4 7 8 8 9 9]

在对秩为 2 的 ndarray 进行排序时,我们需要在 np.sort() 函数中指定是按行排序,还是按列排序。为此,我们可以使用关键字 axis。我们来看一些示例:

# We create an unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))

# We print X
print()
print('Original X = \n', X)
print()

# We sort the columns of X and print the sorted array
print()
print('X with sorted columns :\n', np.sort(X, axis = 0))

# We sort the rows of X and print the sorted array
print()
print('X with sorted rows :\n', np.sort(X, axis = 1))

Original X =
[[6 1 7 6 3]
  [3 9 8 3 5]
  [6 5 8 9 3]
  [2 1 5 7 7]
  [9 8 1 9 8]]

X with sorted columns :
[[2 1 1 3 3]
  [3 1 5 6 3]
  [6 5 7 7 5]
  [6 8 8 9 7]
  [9 9 8 9 8]]

X with sorted rows :
[[1 3 6 6 7]
  [3 3 5 8 9]
  [3 5 6 8 9]
  [1 2 5 7 7]
  [1 8 8 9 9]]

 

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值