NumPy 100 Exercises

Latest recommended article published 2021-11-13 14:06:38

咸鱼综合症

Category: Study Notes. Tags: python

Article link: https://blog.csdn.net/qq_43497779/article/details/106826084

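NumPy 100-Exercise Challenge

Introduction
NumPy is a third-party library for Python that supports a wide range of operations on high-dimensional arrays and matrices. It also provides a large collection of mathematical functions for array computation. Machine learning involves many array transformations and operations, which makes NumPy an indispensable tool. The NumPy 100-Exercise Challenge is divided into a basic part and an advanced part, each with 50 exercises. The basic exercises build familiarity with commonly used NumPy methods, while the advanced exercises focus on combining those methods.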

import numpy as np

Create a two-dimensional array from a list:

np.array([(1, 2, 3), (4, 5, 6)])

array([[1, 2, 3],
       [4, 5, 6]])

The array above has rank 2, meaning it has two dimensions. The first dimension has length 2 and the second has length 3.
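To make the terms concrete, here is a minimal sketch that reads the number of dimensions and the length of each dimension directly from the array. The variable names `arr`, `ndim`, and `shape` are chosen for illustration only.

```python
import numpy as np

# Same array as in the exercise above
arr = np.array([(1, 2, 3), (4, 5, 6)])

ndim = arr.ndim    # number of dimensions (the "rank" in the text)
shape = arr.shape  # length of each dimension

print(ndim)   # 2
print(shape)  # (2, 3)
```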

Create a two-dimensional array of zeros:

np.zeros((3, 3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

Create a three-dimensional array of ones:

np.ones((2, 3, 4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

Create a one-dimensional array of evenly spaced integers:

np.arange(5)

array([0, 1, 2, 3, 4])

Create a two-dimensional array of evenly spaced integers:

np.arange(6).reshape(2, 3)

array([[0, 1, 2],
       [3, 4, 5]])

Create an identity matrix (a two-dimensional array):

np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Create a one-dimensional array with a fixed number of evenly spaced values:

np.linspace(1, 10, num=6)

array([ 1. ,  2.8,  4.6,  6.4,  8.2, 10. ])

Create a two-dimensional array of random numbers:

np.random.rand(2, 3)

array([[0.04176465, 0.11793032, 0.0711188 ],
       [0.16089234, 0.38944706, 0.70976536]])

Create a two-dimensional array of random integers (values less than 5):

np.random.randint(5, size=(2, 3))

array([[4, 4, 4],
       [1, 1, 1]])

Create an array from a custom function:

np.fromfunction(lambda i, j: i + j, (3, 3))

array([[0., 1., 2.],
       [1., 2., 3.],
       [2., 3., 4.]])

Matrix multiplication (note that this differs from the element-wise product A * B):

np.dot(A, B)

np.mat(A) * np.mat(B)
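The snippets in this part of the post use A and B without defining them. As a minimal sketch, assuming two small 2x2 example matrices (the values below are hypothetical and not from the original), the difference between the element-wise product and matrix multiplication looks like this. The `@` operator gives the same result as `np.dot` on plain arrays, without going through the `np.matrix` class that NumPy's documentation no longer recommends.

```python
import numpy as np

# Hypothetical example matrices; the original post does not define A and B
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B      # multiplies matching entries
matmul = np.dot(A, B)    # rows of A times columns of B
matmul_operator = A @ B  # same matrix product, written with the @ operator

print(elementwise)  # [[ 5 12] [21 32]]
print(matmul)       # [[19 22] [43 50]]
```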

Scalar multiplication of a matrix:

2 * A

Matrix transpose:

A.T

Matrix inverse:

np.linalg.inv(A)

Mathematical functions

np.sin(a)

np.exp(a)

np.sqrt(a)

np.power(a, 3)

Array slicing and indexing

Slice a two-dimensional array (take rows 2 and 3):

a[1:3, :]

Array shape operations

a.shape

a.reshape(2, 3)  # reshape returns a new array object and leaves the shape of a unchanged

a.resize(2, 3)  # resize changes the shape of a in place
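The difference between the two calls is easy to check. The sketch below uses two small hypothetical arrays, `a` and `c`, so that the in-place `resize` call does not interfere with the array returned by `reshape`.

```python
import numpy as np

a = np.arange(6)
b = a.reshape(2, 3)  # returns a reshaped array; a keeps its own shape

c = np.arange(6)
c.resize(2, 3)       # returns None and changes c itself

print(a.shape)  # (6,)
print(b.shape)  # (2, 3)
print(c.shape)  # (2, 3)
```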

Flatten an array
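The code for this step is missing from the post. A minimal sketch, assuming the intent was the usual `ravel` or `flatten` call on a small example array (the array here is hypothetical):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

flat_view = a.ravel()    # one-dimensional result; a view of a when possible
flat_copy = a.flatten()  # one-dimensional result; always a copy

flat_copy[0] = 99        # modifying the copy does not touch a

print(flat_view)  # [0 1 2 3 4 5]
print(a[0, 0])    # 0
```

`flatten` is the safer choice when the flattened array will be modified, while `ravel` avoids a copy when one is not needed.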

Stack arrays vertically:

Generate example arrays:

a = np.random.randint(10, size=(3, 3))
b = np.random.randint(10, size=(3, 3))
a, b
np.vstack((a, b))

array([[0, 1, 1],
       [2, 5, 6],
       [6, 6, 4],
       [9, 9, 0],
       [7, 2, 4],
       [8, 1, 9]])

Stack arrays horizontally:

np.hstack((a, b))

array([[0, 1, 1, 9, 9, 0],
       [2, 5, 6, 7, 2, 4],
       [6, 6, 4, 8, 1, 9]])

Split an array along the horizontal axis:

np.hsplit(a, 3)

[array([[0],
        [2],
        [6]]), array([[1],
        [5],
        [6]]), array([[1],
        [6],
        [4]])]

Split an array along the vertical axis:

np.vsplit(a, 3)

[array([[0, 1, 1]]), array([[2, 5, 6]]), array([[6, 6, 4]])]

Array maximum, minimum, and their indices

np.max(a, axis=0)  # maximum of each column
np.min(a, axis=1)  # minimum of each row
np.argmin(a, axis=1)  # index of the minimum in each row

array([0, 0, 2], dtype=int64)
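The `axis` argument names the dimension that is collapsed, so `axis=0` reduces down the rows and yields one value per column, while `axis=1` yields one value per row. A short sketch using the 3x3 array `a` shown in the stacking example above:

```python
import numpy as np

# The array a from the stacking example above
a = np.array([[0, 1, 1],
              [2, 5, 6],
              [6, 6, 4]])

col_max = np.max(a, axis=0)        # one value per column
row_min = np.min(a, axis=1)        # one value per row
row_argmin = np.argmin(a, axis=1)  # position of the minimum within each row

print(col_max)     # [6 6 6]
print(row_min)     # [0 2 4]
print(row_argmin)  # [0 0 2]
```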

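Statistics

np.mean(a, axis=1)  # mean of each row
np.average(a, axis=0)  # average of each column (a weighted average only if weights are passed)
np.var(a, axis=1)  # variance of each row
np.std(a, axis=0)  # standard deviation of each column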

array([2.49443826, 2.1602469 , 2.05480467])
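`np.average` only differs from `np.mean` when a `weights` argument is supplied. A small sketch with hypothetical values and weights:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])

plain = np.average(values)                        # no weights: same as the mean
weighted = np.average(values, weights=[3, 1, 0])  # (3*1 + 1*2 + 0*3) / 4

print(plain)     # 2.0
print(weighted)  # 1.25
```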

Create a 5x5 two-dimensional array with 1 on the border and 0 everywhere else:

Z = np.ones((5, 5))
Z[1:-1, 1:-1] = 0
Z

array([[1., 1., 1., 1., 1.],
       [1., 0., 0., 0., 1.],
       [1., 0., 0., 0., 1.],
       [1., 0., 0., 0., 1.],
       [1., 1., 1., 1., 1.]])

Random array with values less than 5:

np.random.randint(5, size=(2, 3))

array([[0, 0, 2],
       [0, 0, 2]])

Custom arrays

Surround a 5x5 two-dimensional array of ones with a border of zeros:

Z = np.ones((5, 5))
Z = np.pad(Z, pad_width=1, mode='constant', constant_values=0)
Z

array([[0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

Z = np.ones((2, 2))
Z = np.pad(Z, ((3,2)), mode='constant', constant_values=(0,2))
Z

array([[0., 0., 0., 0., 0., 2., 2.],
       [0., 0., 0., 0., 0., 2., 2.],
       [0., 0., 0., 0., 0., 2., 2.],
       [0., 0., 0., 1., 1., 2., 2.],
       [0., 0., 0., 1., 1., 2., 2.],
       [0., 0., 0., 2., 2., 2., 2.],
       [0., 0., 0., 2., 2., 2., 2.]])

Z = np.ones((2, 2))
Z = np.pad(Z, ((3,2),(2,3)), mode='constant', constant_values=(0,2))
Z

array([[0., 0., 0., 0., 2., 2., 2.],
       [0., 0., 0., 0., 2., 2., 2.],
       [0., 0., 0., 0., 2., 2., 2.],
       [0., 0., 1., 1., 2., 2., 2.],
       [0., 0., 1., 1., 2., 2., 2.],
       [0., 0., 2., 2., 2., 2., 2.],
       [0., 0., 2., 2., 2., 2., 2.]])

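'constant': pads with a constant value. The value can be set separately for each axis; with constant_values=(x, y) the leading side is padded with x and the trailing side with y. The default fill value is 0.

'edge': pads with the edge values of the array

'linear_ramp': pads with a linear ramp between the edge value and an end value (0 by default)

'maximum': pads with the maximum value

'mean': pads with the mean

'median': pads with the median

'minimum': pads with the minimum value

'reflect': pads with a mirror image of the array that does not repeat the edge value

'symmetric': pads with a mirror image of the array that repeats the edge value

'wrap': wraps around, padding the front with values from the end of the array and the back with values from the start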
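The difference between several of these modes, in particular 'reflect' and 'symmetric', is easiest to see on a short one-dimensional array. The array `v` below is a hypothetical example.

```python
import numpy as np

v = np.array([1, 2, 3])

edge = np.pad(v, 2, mode='edge')            # repeat the edge value
reflect = np.pad(v, 2, mode='reflect')      # mirror without repeating the edge
symmetric = np.pad(v, 2, mode='symmetric')  # mirror including the edge
wrap = np.pad(v, 2, mode='wrap')            # continue from the opposite end

print(edge)       # [1 1 1 2 3 3 3]
print(reflect)    # [3 2 1 2 3 2 1]
print(symmetric)  # [2 1 1 2 3 3 2]
print(wrap)       # [2 3 1 2 3 1 2]
```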

Create a 5x5 two-dimensional array with the values 1, 2, 3, 4 just below the main diagonal:

Z = np.diag(1+np.arange(4), k=-1)
Z

array([[0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 2, 0, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 0, 4, 0]])

np.datetime64('today', 's')

Find the elements that two one-dimensional arrays have in common:

Z1 = np.random.randint(0, 10, 10)
Z2 = np.random.randint(0, 10, 10)
print("Z1:", Z1)
print("Z2:", Z2)
np.intersect1d(Z1, Z2)

Z1: [9 3 1 4 1 6 8 8 2 1]
Z2: [8 1 4 8 5 3 4 5 4 8]





array([1, 3, 4, 8])

Create a one-dimensional array from 0 to 10 and negate every number in the interval (1, 9]:

Z = np.arange(11)
Z[(1 < Z) & (Z <= 9)] *= -1
Z

array([ 0,  1, -2, -3, -4, -5, -6, -7, -8, -9, 10])

Extract the integer part of a random array in five different ways:

Z = np.random.uniform(0, 10, 10)
print("Original: ", Z)

print("Method 1: ", Z - Z % 1)
print("Method 2: ", np.floor(Z))
print("Method 3: ", np.ceil(Z)-1)
print("Method 4: ", Z.astype(int))
print("Method 5: ", np.trunc(Z))

Original:  [9.19339372 5.01524321 3.9044947  9.25545812 1.02967918 8.48580265
 6.75884302 7.95318858 8.24499604 5.22026427]
Method 1:  [9. 5. 3. 9. 1. 8. 6. 7. 8. 5.]
Method 2:  [9. 5. 3. 9. 1. 8. 6. 7. 8. 5.]
Method 3:  [9. 5. 3. 9. 1. 8. 6. 7. 8. 5.]
Method 4:  [9 5 3 9 1 8 6 7 8 5]
Method 5:  [9. 5. 3. 9. 1. 8. 6. 7. 8. 5.]
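The five methods agree here only because the inputs are positive and not whole numbers. A brief sketch with a hypothetical array that includes a negative value and an exact integer shows where they diverge:

```python
import numpy as np

Z = np.array([-1.5, 2.0, 3.7])

floored = np.floor(Z)            # rounds toward negative infinity
truncated = np.trunc(Z)          # rounds toward zero
ceil_minus_one = np.ceil(Z) - 1  # off by one for exact integers such as 2.0
as_int = Z.astype(int)           # also rounds toward zero

print(floored)         # [-2.  2.  3.]
print(truncated)       # [-1.  2.  3.]
print(ceil_minus_one)  # [-2.  1.  3.]
print(as_int)          # [-1  2  3]
```

For general data, `np.trunc` or `astype(int)` match the usual meaning of "integer part", and `np.floor` differs from them only for negative inputs.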

Create a 5x5 matrix in which every row runs from 1 to 5:

Z = np.zeros((5, 5))
Z += np.arange(1, 6)

Z

array([[1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.]])

Create an evenly spaced one-dimensional array of length 5 with values between 0 and 1, excluding both 0 and 1:

Z = np.linspace(0, 1, 6, endpoint=False)[1:]

Z

array([0.16666667, 0.33333333, 0.5       , 0.66666667, 0.83333333])

Create a random one-dimensional array of length 10 and sort it in ascending order:

Z = np.random.random(10)
Z.sort()
Z

array([0.19781585, 0.20723088, 0.38850917, 0.49185439, 0.495944  ,
       0.60049319, 0.63838447, 0.84361456, 0.89759288, 0.96122786])

Create a 3x3 two-dimensional array and sort each column in ascending order:

Z = np.array([[7, 4, 3], [3, 1, 2], [4, 2, 6]])
print("Original array: \n", Z)

Z.sort(axis=0)
Z

Original array: 
 [[7 4 3]
 [3 1 2]
 [4 2 6]]





array([[3, 1, 2],
       [4, 2, 3],
       [7, 4, 6]])

Create a one-dimensional array of length 5 and replace its maximum value with 0:

Z = np.random.random(5)
print("Original array: ", Z)
Z[Z.argmax()] = 0
Z

Original array:  [0.83637811 0.27544384 0.62426126 0.0401094  0.10982987]





array([0.        , 0.27544384, 0.62426126, 0.0401094 , 0.10982987])

Print the minimum and maximum values of several NumPy scalar types:

for dtype in [np.int8, np.int32, np.int64]:
    print("The minimum value of {}: ".format(dtype), np.iinfo(dtype).min)
    print("The maximum value of {}: ".format(dtype), np.iinfo(dtype).max)
for dtype in [np.float32, np.float64]:
    print("The minimum value of {}: ".format(dtype), np.finfo(dtype).min)
    print("The maximum value of {}: ".format(dtype), np.finfo(dtype).max)

The minimum value of <class 'numpy.int8'>:  -128
The maximum value of <class 'numpy.int8'>:  127
The minimum value of <class 'numpy.int32'>:  -2147483648
The maximum value of <class 'numpy.int32'>:  2147483647
The minimum value of <class 'numpy.int64'>:  -9223372036854775808
The maximum value of <class 'numpy.int64'>:  9223372036854775807
The minimum value of <class 'numpy.float32'>:  -3.4028235e+38
The maximum value of <class 'numpy.float32'>:  3.4028235e+38
The minimum value of <class 'numpy.float64'>:  -1.7976931348623157e+308
The maximum value of <class 'numpy.float64'>:  1.7976931348623157e+308

Convert a float32 array to an integer type:

Z = np.arange(10, dtype=np.float32)
print(Z)

Z = Z.astype(np.int32, copy=False)
Z

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]





array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Sort the rows of a random two-dimensional array in ascending order of the values in the 3rd column:

Z = np.random.randint(0, 10, (5, 5))
print("Before sorting:\n", Z)

Z[Z[:, 2].argsort()]

Before sorting:
 [[0 2 8 4 1]
 [6 8 7 9 3]
 [4 3 8 7 4]
 [2 4 7 8 2]
 [3 1 1 0 6]]





array([[3, 1, 1, 0, 6],
       [6, 8, 7, 9, 3],
       [2, 4, 7, 8, 2],
       [0, 2, 8, 4, 1],
       [4, 3, 8, 7, 4]])

Find the value in a random one-dimensional array that is closest to a given number (0.5):

Z = np.random.uniform(0, 1, 20)
print("Random array: \n", Z)
z = 0.5
m = Z.flat[np.abs(Z - z).argmin()]

m

Random array: 
 [0.20844194 0.332088   0.37958877 0.89728413 0.14019974 0.82335814
 0.97087436 0.3106106  0.16288733 0.61210389 0.10656327 0.20278394
 0.43666229 0.31766263 0.20405676 0.18477094 0.2832144  0.50820196
 0.47113967 0.16849455]





0.5082019613480165

Swap the first two rows of a two-dimensional array:

A = np.arange(25).reshape(5, 5)
print(A)
A[[0, 1]] = A[[1, 0]]
print(A)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
[[ 5  6  7  8  9]
 [ 0  1  2  3  4]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

Find the most frequent value in a random one-dimensional array:

Z = np.random.randint(0, 10, 50)
print("Random 1-D array:", Z)
np.bincount(Z).argmax()

Random 1-D array: [6 5 7 0 7 9 8 9 5 9 6 3 1 8 2 9 7 5 1 4 2 7 2 9 7 3 6 3 8 7 8 1 1 8 6 2 5
 8 9 8 5 6 2 7 8 8 2 8 0 8]





8

Find the indices of the non-zero elements of a given one-dimensional array:

Z = np.nonzero([1, 0, 2, 0, 1, 0, 4, 0])
Z

(array([0, 2, 4, 6], dtype=int64),)

Place p ones at random positions inside a given 5x5 two-dimensional array:

p = 3

Z = np.zeros((5, 5))
np.put(Z, np.random.choice(range(5*5), p, replace=False), 1)

Z

array([[0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.]])

Subtract the mean of each row from a random 3x3 two-dimensional array:

X = np.random.rand(3, 3)
print(X)

Y = X - X.mean(axis=1, keepdims=True)
Y

[[0.05775368 0.17130683 0.77915029]
 [0.25564292 0.87603931 0.85538035]
 [0.10478851 0.46588923 0.16897846]]





array([[-0.27831658, -0.16476344,  0.44308003],
       [-0.40671127,  0.21368512,  0.19302615],
       [-0.14176356,  0.21933717, -0.07757361]])

Get the diagonal of the dot product of two two-dimensional arrays:

A = np.random.uniform(0, 1, (3, 3))
B = np.random.uniform(0, 1, (3, 3))

print(np.dot(A, B))
# Slower approach
np.diag(np.dot(A, B))

[[0.15342262 0.12922506 0.19947302]
 [0.93280059 0.5607437  0.77975896]
 [0.64052191 0.87145075 1.17216505]]





array([0.15342262, 0.5607437 , 1.17216505])

np.sum(A * B.T, axis=1)  # Faster approach

array([0.15342262, 0.5607437 , 1.17216505])

np.einsum("ij, ji->i", A, B)  # Even faster approach

array([0.15342262, 0.5607437 , 1.17216505])
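All three expressions compute the same quantity: the i-th diagonal entry of the product is the sum over j of A[i, j] * B[j, i], so the full matrix product never has to be formed. The sketch below checks that they agree on seeded random matrices. The seed and the names `d_diag`, `d_sum`, and `d_einsum` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the check is repeatable
A = rng.uniform(0, 1, (3, 3))
B = rng.uniform(0, 1, (3, 3))

d_diag = np.diag(np.dot(A, B))          # full product, then take the diagonal
d_sum = np.sum(A * B.T, axis=1)         # only the terms the diagonal needs
d_einsum = np.einsum("ij,ji->i", A, B)  # same sum written in index notation

print(np.allclose(d_diag, d_sum), np.allclose(d_diag, d_einsum))
```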

Find the p largest values in a random one-dimensional array:

Z = np.random.randint(1, 100, 100)
print(Z)

p = 5

Z[np.argsort(Z)[-p:]]

[77 48 19 43 13 47  3 60 87 87 27 57 57 76 26 55 26 99 65  3 75 60 58 77
 13 46  8 20 81 32 78 35 84 57 68 26 61 18 46 78 13 12 27 64 27 86 33 14
 66 83 18 56 92 22 35 47 96 61 67 69 89 54 29 72 19 90 23 88 93 54 97 57
 90 69 12 69 76 84 50 94 98 67 66 77 85 38  7 94 42 20 38 58 20 94 46  3
 71 53 44 21]





array([94, 96, 97, 98, 99])
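Sorting the whole array works, but it does more than the task requires. As an alternative sketch (not the approach used in the post), `np.argpartition` only guarantees that the last p positions hold the p largest values, and those can then be sorted on their own. The small array below is hypothetical.

```python
import numpy as np

Z = np.array([77, 48, 19, 99, 3, 87, 60])
p = 3

idx = np.argpartition(Z, -p)[-p:]  # indices of the p largest values, in no particular order
top_p = np.sort(Z[idx])            # sort only those p values

print(top_p)  # [77 87 99]
```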

Raise each element of a random one-dimensional array to the 4th power:

x = np.random.randint(2, 5, 5)
print(x)

np.power(x, 4)

[3 3 3 2 3]





array([81, 81, 81, 16, 81], dtype=int32)

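Display each element of a random two-dimensional array with 2 decimal places (np.set_printoptions changes only how arrays are printed; the stored values keep their full precision):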

Z = np.random.random((5, 5))
print(Z)

np.set_printoptions(precision=2)
Z

[[0.73142487 0.38047992 0.68705076 0.49437273 0.75389337]
 [0.6410366  0.90237558 0.20510673 0.7366409  0.18891817]
 [0.26299746 0.10210177 0.30941782 0.03153999 0.08022621]
 [0.11337767 0.08413264 0.25741677 0.92709367 0.73172809]
 [0.95934109 0.11082325 0.61930737 0.30736667 0.06695922]]





array([[0.73, 0.38, 0.69, 0.49, 0.75],
       [0.64, 0.9 , 0.21, 0.74, 0.19],
       [0.26, 0.1 , 0.31, 0.03, 0.08],
       [0.11, 0.08, 0.26, 0.93, 0.73],
       [0.96, 0.11, 0.62, 0.31, 0.07]])
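Because the print option affects display only, a rounded copy is needed whenever the values themselves should be limited to 2 decimal places. A minimal sketch, using two values taken from the output above:

```python
import numpy as np

Z = np.array([0.73142487, 0.38047992])

rounded = np.round(Z, 2)                # new array with rounded values
text = np.array2string(Z, precision=2)  # formatted string; Z itself is untouched

print(rounded)  # [0.73 0.38]
print(text)
print(Z[0])     # 0.73142487
```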

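Print a NumPy array in scientific notation (dividing by 1e3 makes the values small enough that NumPy's default printing switches to scientific notation):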

Z = np.random.random([5, 5])
print(Z)

Z/1e3

[[0.34 0.15 0.8  0.82 0.45]
 [0.08 0.58 0.58 0.17 0.83]
 [0.16 0.32 0.25 0.14 0.77]
 [0.85 0.29 0.03 0.65 0.96]
 [0.93 0.85 0.22 0.75 0.79]]





array([[3.37e-04, 1.48e-04, 8.03e-04, 8.24e-04, 4.53e-04],
       [8.18e-05, 5.77e-04, 5.83e-04, 1.73e-04, 8.27e-04],
       [1.61e-04, 3.23e-04, 2.54e-04, 1.45e-04, 7.67e-04],
       [8.55e-04, 2.88e-04, 2.86e-05, 6.48e-04, 9.61e-04],
       [9.30e-04, 8.53e-04, 2.20e-04, 7.46e-04, 7.88e-04]])

Find the percentiles (25%, 50%, 75%) with NumPy:

a = np.arange(15)
print(a)

np.percentile(a, q=[25, 50, 75])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]





array([ 3.5,  7. , 10.5])

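Find the total number of missing values in an array and their positions:

# Generate a 2-dimensional array that contains missing values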
Z = np.random.rand(10, 10)
Z[np.random.randint(10, size=5), np.random.randint(10, size=5)] = np.nan
Z

array([[0.27, 0.09, 0.83, 0.13,  nan, 0.04, 0.04, 0.47, 0.33, 0.67],
       [0.93, 0.26, 0.29, 0.94, 0.15, 0.1 , 0.96, 0.28, 0.56, 0.19],
       [1.  , 0.27, 0.79, 0.14, 0.93, 0.43, 0.74, 0.93, 0.55, 0.28],
       [0.42, 0.07, 0.79, 0.89, 0.53, 0.23, 0.2 , 0.11, 0.07, 0.98],
       [0.7 , 0.46,  nan, 0.71, 0.83,  nan, 0.12, 0.93, 0.99, 0.63],
       [0.78, 0.92, 0.55, 0.99, 0.68,  nan, 0.92, 0.89, 0.65, 0.24],
       [0.86, 0.73, 0.84, 0.14, 0.88, 0.61, 0.46, 0.67, 0.52, 0.04],
       [0.03, 0.17, 0.84, 0.05, 0.27, 0.34, 0.73, 0.98, 0.57, 0.25],
       [0.11, 0.81, 0.19, 0.57, 0.19, 0.84, 0.05, 0.51, 0.87, 0.64],
       [0.71, 0.31, 0.43, 0.4 , 0.79, 0.33, 0.83, 0.28, 0.79, 0.78]])

print("Number of missing values: \n", np.isnan(Z).sum())
print("Indices of missing values: \n", np.where(np.isnan(Z)))

Number of missing values: 
 4
Indices of missing values: 
 (array([0, 4, 4, 5], dtype=int64), array([4, 2, 5, 5], dtype=int64))

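Remove the rows that contain missing values from a random array:

# Reuse the 2-dimensional array with missing values from the previous exercise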
Z[np.sum(np.isnan(Z), axis=1) == 0]

array([[0.93, 0.26, 0.29, 0.94, 0.15, 0.1 , 0.96, 0.28, 0.56, 0.19],
       [1.  , 0.27, 0.79, 0.14, 0.93, 0.43, 0.74, 0.93, 0.55, 0.28],
       [0.42, 0.07, 0.79, 0.89, 0.53, 0.23, 0.2 , 0.11, 0.07, 0.98],
       [0.86, 0.73, 0.84, 0.14, 0.88, 0.61, 0.46, 0.67, 0.52, 0.04],
       [0.03, 0.17, 0.84, 0.05, 0.27, 0.34, 0.73, 0.98, 0.57, 0.25],
       [0.11, 0.81, 0.19, 0.57, 0.19, 0.84, 0.05, 0.51, 0.87, 0.64],
       [0.71, 0.31, 0.43, 0.4 , 0.79, 0.33, 0.83, 0.28, 0.79, 0.78]])

Count the occurrences of each element in a random array:

Z = np.random.randint(0, 100, 25).reshape(5, 5)
print(Z)
np.unique(Z, return_counts=True)  # in the result, the 2nd array holds the count of each element of the 1st

[[ 0 11 84 68 56]
 [84 37 61 49 71]
 [97 20  8 68 97]
 [72 17 97 49 84]
 [42 64  3 33  9]]





(array([ 0,  3,  8,  9, 11, 17, 20, 33, 37, 42, 49, 56, 61, 64, 68, 71, 72,
        84, 97]),
 array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 3, 3],
       dtype=int64))

Convert each element of an array to a text label according to a given mapping:

# The categories are as follows
# 1 → car
# 2 → bus
# 3 → train

Z = np.random.randint(1, 4, 10)
print(Z)

label_map = {1: "car", 2: "bus", 3: "train"}

[label_map[x] for x in Z]

[1 1 1 2 2 3 2 3 2 1]

['car', 'car', 'car', 'bus', 'bus', 'train', 'bus', 'train', 'bus', 'car']

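Concatenate several 1-dimensional arrays into a single ndarray:

Z1 = np.arange(3)
Z2 = np.arange(3, 7)
Z3 = np.arange(7, 10)

# The arrays have different lengths, so dtype=object is needed to hold them in one array
Z = np.array([Z1, Z2, Z3], dtype=object)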
print(Z)

np.concatenate(Z)

[array([0, 1, 2]) array([3, 4, 5, 6]) array([7, 8, 9])]





array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Print the indices that would sort an array in ascending order:

a = np.random.randint(100, size=10)
print('Array: ', a)

a.argsort()

Array:  [55  7 16 62 17 52  5 29 45 16]





array([6, 1, 2, 9, 4, 7, 8, 5, 0, 3], dtype=int64)

Get the maximum of each row of a random two-dimensional array:

Z = np.random.randint(1, 100, [5, 5])
print(Z)

np.amax(Z, axis=1)

[[63 81 33 79 84]
 [31 30 39 55 62]
 [39 17 37 60 37]
 [91 45 59 23 93]
 [29  5 16 56  4]]





array([84, 62, 60, 93, 56])

Get the minimum of each row of a random two-dimensional array (using a different method from the one above):

Z = np.random.randint(1, 100, [5, 5])
print(Z)

np.apply_along_axis(np.min, arr=Z, axis=1)

[[26 46 64 95 21]
 [55  1 80 58 83]
 [44 68  5 61 74]
 [32 64 14 48 60]
 [72 19  8 44 51]]





array([21,  1,  5, 14,  8])

Compute the Euclidean distance between two arrays:

a = np.array([1, 2])
b = np.array([7, 8])

# Direct calculation from the formula
print(np.sqrt(np.power((8-2), 2) + np.power((7-1), 2)))
# Calculation with NumPy
np.linalg.norm(b-a)

8.48528137423857





8.48528137423857

Print the real and imaginary parts of complex numbers:

a = np.array([1 + 2j, 3 + 4j, 5 + 6j])

print("Real part:", a.real)
print("Imaginary part:", a.imag)

Real part: [1. 3. 5.]
Imaginary part: [2. 4. 6.]

Compute the inverse of a given matrix and verify it:

matrix = np.array([[1., 2.], [3., 4.]])
inverse_matrix = np.linalg.inv(matrix)

# Verify that the dot product of the matrix and its inverse is the identity matrix
assert np.allclose(np.dot(matrix, inverse_matrix), np.eye(2))
inverse_matrix

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

Standardize data with the Z-Score algorithm:

Z-Score standardization formula:

$\frac{X-\mathrm{mean}(X)}{\mathrm{sd}(X)}$

# Define a function from the formula
def zscore(x, axis=None):
    xmean = x.mean(axis=axis, keepdims=True)
    xstd = np.std(x, axis=axis, keepdims=True)
    return (x - xmean) / xstd


# Generate random data
Z = np.random.randint(10, size=(5, 5))
print(Z)

zscore(Z)

[[7 0 5 9 0]
 [1 7 4 0 1]
 [7 6 8 4 1]
 [9 4 9 5 6]
 [5 0 4 9 9]]





array([[ 0.69, -1.51,  0.06,  1.32, -1.51],
       [-1.19,  0.69, -0.25, -1.51, -1.19],
       [ 0.69,  0.38,  1.  , -0.25, -1.19],
       [ 1.32, -0.25,  1.32,  0.06,  0.38],
       [ 0.06, -1.51, -0.25,  1.32,  1.32]])
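A quick way to sanity-check a Z-Score implementation is to confirm that the result has mean 0 and standard deviation 1. The sketch below repeats the function on the sample data printed above and also tries `axis=0`, which standardizes each column separately. The names `S_all` and `S_cols` are illustrative.

```python
import numpy as np

def zscore(x, axis=None):
    """Standardize x to zero mean and unit standard deviation along axis."""
    xmean = x.mean(axis=axis, keepdims=True)
    xstd = np.std(x, axis=axis, keepdims=True)
    return (x - xmean) / xstd

# Sample data copied from the output above
Z = np.array([[7, 0, 5, 9, 0],
              [1, 7, 4, 0, 1],
              [7, 6, 8, 4, 1],
              [9, 4, 9, 5, 6],
              [5, 0, 4, 9, 9]])

S_all = zscore(Z)           # one mean and one std for the whole array
S_cols = zscore(Z, axis=0)  # a separate mean and std for every column

print(S_all.mean(), S_all.std())
print(S_cols.mean(axis=0), S_cols.std(axis=0))
```

Note that a column (or array) whose values are all identical has a standard deviation of 0, in which case this function divides by zero.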

Scale data with Min-Max normalization:

$\frac{Z-\min(Z)}{\max(Z)-\min(Z)}$

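# Define a function from the formula
def min_max(x, axis=None):
    # xmin and xmax avoid shadowing the built-in min and max
    xmin = x.min(axis=axis, keepdims=True)
    xmax = x.max(axis=axis, keepdims=True)
    result = (x - xmin) / (xmax - xmin)
    return result


# Generate random data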
Z = np.random.randint(10, size=(5, 5))
print(Z)

min_max(Z)

[[9 4 1 1 3]
 [9 1 8 8 4]
 [0 5 0 1 0]
 [0 5 5 5 4]
 [1 4 8 4 3]]





array([[1.  , 0.44, 0.11, 0.11, 0.33],
       [1.  , 0.11, 0.89, 0.89, 0.44],
       [0.  , 0.56, 0.  , 0.11, 0.  ],
       [0.  , 0.56, 0.56, 0.56, 0.44],
       [0.11, 0.44, 0.89, 0.44, 0.33]])

Normalize data with the L2 norm:

$L_2 = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2}$

# Define a function from the formula
def l2_normalize(v, axis=-1, order=2):
    l2 = np.linalg.norm(v, ord=order, axis=axis, keepdims=True)
    l2[l2 == 0] = 1  # avoid dividing by zero for all-zero vectors
    return v/l2

# Generate random data
Z = np.random.randint(10, size=(5, 5))
print(Z)

l2_normalize(Z)

[[6 5 1 0 1]
 [7 0 5 0 4]
 [2 2 9 0 8]
 [7 7 5 9 2]
 [6 2 7 6 3]]





array([[0.76, 0.63, 0.13, 0.  , 0.13],
       [0.74, 0.  , 0.53, 0.  , 0.42],
       [0.16, 0.16, 0.73, 0.  , 0.65],
       [0.49, 0.49, 0.35, 0.62, 0.14],
       [0.52, 0.17, 0.6 , 0.52, 0.26]])
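After L2 normalization along the last axis, every non-zero row should have length 1. The following sketch checks this on a small hypothetical array that includes an all-zero row, which the zero guard leaves unchanged.

```python
import numpy as np

def l2_normalize(v, axis=-1, order=2):
    """Divide v by its L2 norm along axis, leaving all-zero vectors as they are."""
    norms = np.linalg.norm(v, ord=order, axis=axis, keepdims=True)
    norms[norms == 0] = 1  # avoid dividing by zero
    return v / norms

# Hypothetical data: two 3-4-5 style rows and one all-zero row
Z = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [6.0, 8.0]])

N = l2_normalize(Z)
row_norms = np.linalg.norm(N, axis=1)

print(N)          # rows 0 and 2 become [0.6, 0.8]
print(row_norms)  # [1. 0. 1.]
```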

Compute the correlation coefficients between variables with NumPy:

Z = np.array([
    [1, 2, 1, 9, 10, 3, 2, 6, 7],  # feature A
    [2, 1, 8, 3, 7, 5, 10, 7, 2],  # feature B
    [2, 1, 1, 8, 9, 4, 3, 5, 7]])  # feature C

np.corrcoef(Z)

array([[ 1.  , -0.06,  0.97],
       [-0.06,  1.  , -0.01],
       [ 0.97, -0.01,  1.  ]])

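The correlation coefficient takes values in [−1, 1]. Values close to 1 indicate a strong positive correlation, and values close to −1 indicate a strong negative correlation. In the result above, the coefficient between variable A and itself is 1 because they are the same variable. The coefficient between variable A and variable C is 0.97, which indicates a strong positive correlation.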
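To see where the 0.97 comes from, the Pearson coefficient between features A and C can be computed by hand and compared with the matrix entry. This is a sketch of the standard formula, with `r_manual` and `r_numpy` as illustrative names.

```python
import numpy as np

Z = np.array([
    [1, 2, 1, 9, 10, 3, 2, 6, 7],  # feature A
    [2, 1, 8, 3, 7, 5, 10, 7, 2],  # feature B
    [2, 1, 1, 8, 9, 4, 3, 5, 7]])  # feature C

A, C = Z[0], Z[2]
dA = A - A.mean()  # deviations of A from its mean
dC = C - C.mean()  # deviations of C from its mean

# Pearson correlation: covariance divided by the product of the spreads
r_manual = np.sum(dA * dC) / np.sqrt(np.sum(dA ** 2) * np.sum(dC ** 2))
r_numpy = np.corrcoef(Z)[0, 2]

print(round(float(r_manual), 2))  # 0.97
```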

Compute the eigenvalues and eigenvectors of a matrix with NumPy:

M = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
w, v = np.linalg.eig(M)
# w holds the eigenvalues, v holds the eigenvectors (one per column)
w, v

(array([ 1.61e+01, -1.12e+00, -9.76e-16]), matrix([[-0.23, -0.79,  0.41],
         [-0.53, -0.09, -0.82],
         [-0.82,  0.61,  0.41]]))

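We can verify the result by reconstructing the original matrix from $M = P \Lambda P^{-1}$, where the columns of $P$ are the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues.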

v * np.diag(w) * np.linalg.inv(v)

matrix([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
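The same check can be written with plain arrays and the `@` operator instead of `np.matrix`. The sketch below uses a small hypothetical symmetric matrix whose eigenvalues are 1 and 3, and also confirms the defining property that multiplying the matrix by an eigenvector equals scaling that eigenvector by its eigenvalue.

```python
import numpy as np

# Hypothetical 2x2 example with eigenvalues 1 and 3
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

w, P = np.linalg.eig(M)  # w: eigenvalues, P: eigenvectors as columns

# Rebuild M from its eigendecomposition
reconstructed = P @ np.diag(w) @ np.linalg.inv(P)

# Defining property for the first eigenpair
lhs = M @ P[:, 0]
rhs = w[0] * P[:, 0]

print(np.allclose(reconstructed, M), np.allclose(lhs, rhs))
```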

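Compute the differences between adjacent elements of an ndarray with NumPy:

Z = np.random.randint(1, 10, 10)
print(Z)

# Differences between adjacent elements of Z
print(np.diff(Z, n=1))
# Apply the difference twice
print(np.diff(Z, n=2))
# Apply the difference three times
print(np.diff(Z, n=3))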

[8 2 1 1 5 5 8 4 8 9]
[-6 -1  0  4  0  3 -4  4  1]
[ 5  1  4 -4  3 -7  8 -3]
[ -4   3  -8   7 -10  15 -11]

Compute the cumulative sum of the elements of an ndarray with NumPy:

Z = np.random.randint(1, 10, 10)
print(Z)

"""
[first element, first + second element, first + second + third element, ...]
"""
np.cumsum(Z)

[7 3 9 5 9 1 4 1 3 2]





array([ 7, 10, 19, 24, 33, 34, 38, 39, 42, 44], dtype=int32)

Join two arrays column-wise with NumPy:

M1 = np.array([1, 2, 3])
M2 = np.array([4, 5, 6])

np.c_[M1, M2]

array([[1, 4],
       [2, 5],
       [3, 6]])

Join two arrays row-wise with NumPy:

M1 = np.array([1, 2, 3])
M2 = np.array([4, 5, 6])

np.r_[M1, M2]

array([1, 2, 3, 4, 5, 6])

Print the 9x9 multiplication table with NumPy:

np.fromfunction(lambda i, j: (i + 1) * (j + 1), (9, 9))

array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.],
       [ 2.,  4.,  6.,  8., 10., 12., 14., 16., 18.],
       [ 3.,  6.,  9., 12., 15., 18., 21., 24., 27.],
       [ 4.,  8., 12., 16., 20., 24., 28., 32., 36.],
       [ 5., 10., 15., 20., 25., 30., 35., 40., 45.],
       [ 6., 12., 18., 24., 30., 36., 42., 48., 54.],
       [ 7., 14., 21., 28., 35., 42., 49., 56., 63.],
       [ 8., 16., 24., 32., 40., 48., 56., 64., 72.],
       [ 9., 18., 27., 36., 45., 54., 63., 72., 81.]])

Convert the Shiyanlou logo into an ndarray with NumPy:

from io import BytesIO
from PIL import Image
import requests

# Download the image from the URL
URL = 'https://static.shiyanlou.com/img/logo-black.png'
response = requests.get(URL)

# Read the downloaded content as an image
I = Image.open(BytesIO(response.content))
# Convert the image to an ndarray
shiyanlou = np.asarray(I)
shiyanlou

array([[[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]],

       [[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]],

       [[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]],

       ...,

       [[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]],

       [[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]],

       [[255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        ...,
        [255, 255, 255,   0],
        [255, 255, 255,   0],
        [255, 255, 255,   0]]], dtype=uint8)

# Draw the converted ndarray as an image again
from matplotlib import pyplot as plt
%matplotlib inline

plt.imshow(shiyanlou)
