NumPy (personal notes)

云低

Article link: https://blog.csdn.net/gggfff12345/article/details/120477450

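1 Lists and arrays:

At the implementation level, a NumPy array essentially holds a single pointer to one contiguous block of data. A Python list, by contrast, holds a pointer to a block of pointers, each of which points to a full Python object (such as a Python integer). The advantage of the list is flexibility: because each element is a complete structure carrying both data and type information, a list can be filled with data of any type. Fixed-type NumPy-style arrays lack this flexibility, but they store and manipulate data much more efficiently.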
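
The trade-off described above can be seen in a minimal sketch. The variable names (py_list, arr, element_types) are made up for illustration, and the dtype is set explicitly so the byte counts do not depend on the platform.

```python
import numpy as np

# A list can mix element types: every item is a full Python object.
py_list = [1, "two", 3.0]
element_types = [type(item).__name__ for item in py_list]
print(element_types)   # ['int', 'str', 'float']

# A NumPy array stores raw values of one fixed dtype in a single buffer.
arr = np.array([1, 2, 3], dtype=np.int64)
print(arr.dtype)       # int64
print(arr.itemsize)    # 8 bytes per element
print(arr.nbytes)      # 24 bytes for the whole data buffer

# The fixed type costs flexibility: a float assigned into an
# integer array is silently truncated.
arr[0] = 3.99
print(arr[0])          # 3
```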

2 Creating arrays with the built-in array module:

In[6]: import array
L = list(range(10))
A = array.array('i', L)
A
Out[6]: array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

3 Creating arrays from Python lists:

In[9]: import numpy as np
np.array([3.14, 4, 2, 3])
Out[9]: array([ 3.14, 4. , 2. , 3. ])


In[10]: np.array([1, 2, 3, 4], dtype='float32')
Out[10]: array([ 1., 2., 3., 4.], dtype=float32)

4 Creating arrays from scratch:

In[14]: # Create a 3×5 floating-point array filled with 3.14
np.full((3, 5), 3.14)
Out[14]: array([[ 3.14, 3.14, 3.14, 3.14, 3.14],
                [ 3.14, 3.14, 3.14, 3.14, 3.14],
                [ 3.14, 3.14, 3.14, 3.14, 3.14]])


In[15]: # Create a one-dimensional array filled with a linear sequence,
# starting at 0 and stopping before 20, in steps of 2
# (similar to the built-in range() function)
np.arange(0, 20, 2)
Out[15]: array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])


In[18]: # Create a 3×3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))
Out[18]: array([[ 1.51772646, 0.39614948, -0.10634696],
                [ 0.25671348, 0.00732722, 0.37783601],
                [ 0.68446945, 0.15926039, -0.70744073]])


In[19]: # Create a 3×3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))
Out[19]: array([[2, 3, 4],
                [5, 7, 8],
                [0, 5, 0]])


In[20]: # Create a 3×3 identity matrix
np.eye(3)
Out[20]: array([[ 1., 0., 0.],
                [ 0., 1., 0.],
                [ 0., 0., 1.]])


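In[21]: # Create an uninitialized array of three floats (the default dtype);
# the values are whatever happens to already be at that memory location
np.empty(3)
Out[21]: array([ 1., 1., 1.])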

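5 NumPy array attributes:

ndim (the number of dimensions), shape (the size of each dimension), and size (the total number of elements), plus itemsize (the size in bytes of each element) and nbytes (the total size in bytes of the array).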
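
As a quick illustration of these attributes, here is a small sketch on a hypothetical three-dimensional array (x3 is a name chosen for the example; the dtype is fixed to int64 so the byte sizes are the same on every platform).

```python
import numpy as np

# A 3 x 4 x 5 array of 64-bit integers.
x3 = np.arange(60, dtype=np.int64).reshape((3, 4, 5))

print("ndim:    ", x3.ndim)      # 3
print("shape:   ", x3.shape)     # (3, 4, 5)
print("size:    ", x3.size)      # 60
print("itemsize:", x3.itemsize)  # 8
print("nbytes:  ", x3.nbytes)    # 480, which equals size * itemsize
```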

6 Array slicing: accessing subarrays (slices are views, not copies):

x[start:stop:step]
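
The following sketch shows what "view, not copy" means in practice. The array name grid and its values are chosen only for illustration.

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))

# A slice is a view: it shares memory with the original array.
sub_view = grid[:2, :2]
sub_view[0, 0] = 99
print(grid[0, 0])                         # 99, the original changed too
print(np.shares_memory(grid, sub_view))   # True

# An explicit copy is independent of the original.
sub_copy = grid[:2, :2].copy()
sub_copy[0, 0] = -1
print(grid[0, 0])                         # still 99
print(np.shares_memory(grid, sub_copy))   # False

# A negative step walks the array backwards.
every_other_reversed = np.arange(10)[::-2]
print(every_other_reversed)               # [9 7 5 3 1]
```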

7 Creating copies of arrays:

In[35]: x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)
[[99 5]
[ 7 6]]

8 Reshaping arrays:

In[39]: x = np.array([1, 2, 3])
# Row vector via reshape
x.reshape((1, 3))
Out[39]: array([[1, 2, 3]])
In[40]: # Row vector via newaxis
x[np.newaxis, :]
Out[40]: array([[1, 2, 3]])

9 Concatenating arrays:

In[44]: x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = [99, 99, 99]
print(np.concatenate([x, y, z]))
[ 1 2 3 3 2 1 99 99 99]


In[45]: grid = np.array([[1, 2, 3],
                         [4, 5, 6]])
In[46]: # Concatenate along the first axis
np.concatenate([grid, grid])
Out[46]: array([[1, 2, 3],
                [4, 5, 6],
                [1, 2, 3],
                [4, 5, 6]])
In[47]: # Concatenate along the second axis (axes are zero-indexed)
np.concatenate([grid, grid], axis=1)
Out[47]: array([[1, 2, 3, 1, 2, 3],
                [4, 5, 6, 4, 5, 6]])


In[48]: x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])
# Stack the arrays vertically
np.vstack([x, grid])
Out[48]: array([[1, 2, 3],
                [9, 8, 7],
                [6, 5, 4]])
In[49]: # Stack the arrays horizontally
y = np.array([[99],
              [99]])
np.hstack([grid, y])
Out[49]: array([[ 9, 8, 7, 99],
                [ 6, 5, 4, 99]])

10 Splitting arrays:

In[50]: x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)
[1 2 3] [99 99] [3 2 1]


In[51]: grid = np.arange(16).reshape((4, 4))
grid
Out[51]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In[52]: upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)
[[0 1 2 3]
[4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]
In[53]: left, right = np.hsplit(grid, [2])
print(left)
print(right)
[[ 0 1]
[ 4 5]
[ 8 9]
[12 13]]
[[ 2 3]
[ 6 7]
[10 11]
[14 15]]

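11 Universal functions (ufuncs):

All of the arithmetic operators (+, -, *, / and so on) are thin wrappers around built-in NumPy functions; for example, + wraps np.add. Just as NumPy understands Python's built-in arithmetic operators, it also understands Python's built-in absolute value function, abs().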
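
A short sketch of the operator/ufunc correspondence, using a small example array chosen for illustration:

```python
import numpy as np

x = np.arange(-2, 3)   # [-2 -1  0  1  2]

# Each operator gives the same result as the ufunc it wraps.
same_add = np.array_equal(x + 2, np.add(x, 2))
same_mul = np.array_equal(x * 3, np.multiply(x, 3))
same_neg = np.array_equal(-x, np.negative(x))
same_pow = np.array_equal(x ** 2, np.power(x, 2))
print(same_add, same_mul, same_neg, same_pow)   # True True True True

# Python's built-in abs() works element-wise on arrays,
# giving the same result as np.absolute.
print(abs(x))   # [2 1 0 1 2]
same_abs = np.array_equal(abs(x), np.absolute(x))
print(same_abs)  # True
```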

In[15]: theta = np.linspace(0, np.pi, 3)
In[16]: print("theta = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))
theta = [ 0. 1.57079633 3.14159265]
sin(theta) = [ 0.00000000e+00 1.00000000e+00 1.22464680e-16]
cos(theta) = [ 1.00000000e+00 6.12323400e-17 -1.00000000e+00]
tan(theta) = [ 0.00000000e+00 1.63312394e+16 -1.22464680e-16]


In[21]: from scipy import special
In[22]: # Gamma function (generalized factorials) and related functions
x = [1, 5, 10]
print("gamma(x) =", special.gamma(x))
print("ln|gamma(x)| =", special.gammaln(x))
print("beta(x, 2) =", special.beta(x, 2))
gamma(x) = [ 1.00000000e+00 2.40000000e+01 3.62880000e+05]
ln|gamma(x)| = [ 0. 3.17805383 12.80182748]
beta(x, 2) = [ 0.5 0.03333333 0.00909091]
In[23]: # Error function (integral of the Gaussian),
# its complement, and its inverse
x = np.array([0, 0.3, 0.7, 1.0])
print("erf(x) =", special.erf(x))
print("erfc(x) =", special.erfc(x))
print("erfinv(x) =", special.erfinv(x))
erf(x) = [ 0. 0.32862676 0.67780119 0.84270079]
erfc(x) = [ 1. 0.67137324 0.32219881 0.15729921]
erfinv(x) = [ 0. 0.27246271 0.73286908 inf]

12 Advanced ufunc features:

In[25]: x = np.arange(5)
y = np.zeros(10)
np.power(2, x, out=y[::2])
print(y)
[ 1. 0. 2. 0. 4. 0. 8. 0. 16. 0.]


In[26]: x = np.arange(1, 6)
np.add.reduce(x)
Out[26]: 15


In[27]: np.multiply.reduce(x)
Out[27]: 120


In[28]: np.add.accumulate(x)
Out[28]: array([ 1, 3, 6, 10, 15])
In[29]: np.multiply.accumulate(x)
Out[29]: array([ 1, 2, 6, 24, 120])


In[30]: x = np.arange(1, 6)
np.multiply.outer(x, x)
Out[30]: array([[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]])

13 Aggregations:
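
These notes leave this section empty, so the following is only a minimal sketch of common aggregation calls on a made-up 2×3 array M, not a summary of any particular source.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])

total = M.sum()            # aggregate over the whole array
col_min = M.min(axis=0)    # collapse the rows: one minimum per column
row_max = M.max(axis=1)    # collapse the columns: one maximum per row
mean = M.mean()

print(total)     # 21
print(col_min)   # [1 2 3]
print(row_max)   # [3 6]
print(mean)      # 3.5
```

The axis argument names the dimension that is collapsed, so axis=0 returns one value per column.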

14 Broadcasting and its applications:

In[17]: X = np.random.random((10, 3))
In[18]: Xmean = X.mean(0)
Xmean
Out[18]: array([ 0.53514715, 0.66567217, 0.44385899])
In[19]: X_centered = X - Xmean
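
The centering step above relies on broadcasting a (3,) array against a (10, 3) array. The sketch below repeats it with a seeded generator (the seed and the extra names col_means, a, b, table are assumptions for illustration) and checks that the centered columns have near-zero mean.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))

Xmean = X.mean(axis=0)      # shape (3,)
X_centered = X - Xmean      # (10, 3) - (3,) broadcasts to (10, 3)

col_means = X_centered.mean(axis=0)
print(X_centered.shape)             # (10, 3)
print(np.allclose(col_means, 0))    # True, up to floating-point error

# Broadcasting also stretches both operands when needed:
a = np.arange(3)                    # shape (3,)
b = np.arange(3)[:, np.newaxis]     # shape (3, 1)
table = a + b                       # shape (3, 3)
print(table)
```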


In[21]: # x and y each hold 50 evenly spaced points from 0 to 5
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 50)[:, np.newaxis]
z = np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

15 Comparisons, masks, and Boolean logic:

In[12]: rng = np.random.RandomState(0)
x = rng.randint(10, size=(3, 4))
x
Out[12]: array([[5, 0, 3, 3],
[7, 9, 3, 5],
[2, 4, 7, 6]])
In[13]: x < 6
Out[13]: array([[ True, True, True, True],
[False, False, True, True],
[ True, True, False, False]], dtype=bool)


In[14]: print(x)
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
In[15]: # How many values are less than 6?
np.count_nonzero(x < 6)
Out[15]: 8
In[16]: np.sum(x < 6)
Out[16]: 8
In[17]: # How many values are less than 6 in each row?
np.sum(x < 6, axis=1)
Out[17]: array([4, 2, 2])
In[18]: # Are there any values greater than 8?
np.any(x > 8)
Out[18]: True
In[19]: # Are there any values less than 0?
np.any(x < 0)
Out[19]: False
In[20]: # Are all values less than 10?
np.all(x < 10)
Out[20]: True
In[21]: # Are all values equal to 6?
np.all(x == 6)
Out[21]: False
In[22]: # Are all values in each row less than 8?
np.all(x < 8, axis=1)
Out[22]: array([ True, False, True], dtype=bool)


In[23]: np.sum((inches > 0.5) & (inches < 1))
Out[23]: 29
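# Note: the parentheses matter. & binds more tightly than the comparison
# operators, so without them the expression would be evaluated as
# inches > (0.5 & inches) < 1, which raises an error.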
In[24]: np.sum(~( (inches <= 0.5) | (inches >= 1) ))
Out[24]: 29


In[29]:
# Construct a mask of all rainy days
rainy = (inches > 0)
# Construct a mask of all summer days (June 21st is the 172nd day)
summer = (np.arange(365) - 172 < 90) & (np.arange(365) - 172 > 0)
print("Median precip on rainy days in 2014 (inches): ",
np.median(inches[rainy]))
print("Median precip on summer days in 2014 (inches): ",
np.median(inches[summer]))
print("Maximum precip on summer days in 2014 (inches): ",
np.max(inches[summer]))
print("Median precip on non-summer rainy days (inches):",
np.median(inches[rainy & ~summer]))
Median precip on rainy days in 2014 (inches): 0.194881889764
Median precip on summer days in 2014 (inches): 0.0
Maximum precip on summer days in 2014 (inches): 0.850393700787
Median precip on non-summer rainy days (inches): 0.200787401575
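
The inches array used above comes from a rainfall dataset that is not included in these notes. The sketch below applies the same masking operations to a small made-up array (rain and its values are hypothetical) so that the pattern can be run on its own.

```python
import numpy as np

# Hypothetical daily rainfall values, in inches.
rain = np.array([0.0, 0.2, 0.0, 0.7, 1.3, 0.0, 0.9])

rainy = rain > 0
moderate = (rain > 0.5) & (rain < 1)   # parentheses are required around each comparison

n_rainy = np.sum(rainy)                # True counts as 1
n_moderate = np.sum(moderate)
print(n_rainy)           # 4
print(n_moderate)        # 2
print(rain[moderate])    # [0.7 0.9]

# De Morgan: "A and B" is the same mask as "not (not A or not B)".
same_mask = np.array_equal(moderate, ~((rain <= 0.5) | (rain >= 1)))
print(same_mask)         # True

# Masks combine: rainy days that are not in the moderate band.
other_rainy = rain[rainy & ~moderate]
print(other_rainy)       # [0.2 1.3]
```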

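16 Fancy indexing:

In[3]: # Setup assumed from the outputs below (not shown in the original notes)
rand = np.random.RandomState(42)
x = rand.randint(100, size=10)  # [51 92 14 71 60 20 82 86 74 74]
ind = [3, 7, 4]
x[ind]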
Out[3]: array([71, 86, 60])


In[4]: ind = np.array([[3, 7],
[4, 5]])
x[ind]
Out[4]: array([[71, 86],
[60, 20]])


In[5]: X = np.arange(12).reshape((3, 4))
X
Out[5]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In[6]: row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
In[7]: X[row[:, np.newaxis], col]
Out[7]: array([[ 2, 1, 3],
[ 6, 5, 7],
[10, 9, 11]])

17 Combined indexing:

In[10]: X[2, [2, 0, 1]]
Out[10]: array([10, 8, 9])


In[11]: X[1:, [2, 0, 1]]
Out[11]: array([[ 6, 4, 5],
[10, 8, 9]])


In[12]: mask = np.array([1, 0, 1, 0], dtype=bool)
X[row[:, np.newaxis], mask]
Out[12]: array([[ 0, 2],
[ 4, 6],
[ 8, 10]])

18 Modifying values with fancy indexing:

In[18]: x = np.arange(10)
i = np.array([2, 1, 8, 4])
x[i] = 99
print(x)
[ 0 99 99 3 99 5 6 7 99 9]


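In[21]: x = np.zeros(10)
x[[0, 0]] = [4, 6]  # repeated index: the last assignment wins, so x[0] is 6
i = [2, 3, 3, 4, 4, 4]
x[i] += 1  # repeated indices are incremented only once, not accumulated
x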
Out[21]: array([ 6., 0., 1., 1., 1., 0., 0., 0., 0., 0.])


In[22]: x = np.zeros(10)
np.add.at(x, i, 1)
print(x)
[ 0. 0. 1. 2. 3. 0. 0. 0. 0. 0.]
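
To make the difference between the two approaches explicit, this sketch runs both on fresh arrays with the same index list. The names buffered, unbuffered and counts are chosen for illustration; np.bincount is shown as one alternative way to count occurrences.

```python
import numpy as np

i = [2, 3, 3, 4, 4, 4]

# x[i] += 1 evaluates x[i] + 1 once and then assigns, so a
# repeated index is incremented only a single time.
buffered = np.zeros(10)
buffered[i] += 1
print(buffered)     # [0. 0. 1. 1. 1. 0. 0. 0. 0. 0.]

# np.add.at applies the addition once per occurrence of each index.
unbuffered = np.zeros(10)
np.add.at(unbuffered, i, 1)
print(unbuffered)   # [0. 0. 1. 2. 3. 0. 0. 0. 0. 0.]

# For plain counting, np.bincount gives the same totals as integers.
counts = np.bincount(i, minlength=10)
print(counts)       # [0 0 1 2 3 0 0 0 0 0]
```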

19 Sorting arrays:

In[5]: x = np.array([2, 1, 4, 3, 5])
np.sort(x)
Out[5]: array([1, 2, 3, 4, 5])
In[6]: x.sort()
print(x)
[1 2 3 4 5]


In[7]: x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)
print(i)
[1 0 3 2 4]


In[9]: rand = np.random.RandomState(42)
X = rand.randint(0, 10, (4, 6))
print(X)
[[6 3 7 4 6 9]
[2 6 7 4 3 7]
[7 2 5 4 1 7]
[5 1 4 0 9 5]]
In[10]: # Sort each column of X
np.sort(X, axis=0)
Out[10]: array([[2, 1, 4, 0, 1, 5],
[5, 2, 5, 4, 3, 7],
[6, 3, 7, 4, 6, 7],
[7, 6, 7, 4, 9, 9]])

20 Partial sorts: partitioning:

In[12]: x = np.array([7, 2, 3, 1, 6, 5, 4])
np.partition(x, 3)
Out[12]: array([2, 1, 3, 4, 6, 5, 7])
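
The exact order inside each half of a partition is not guaranteed, so the output above may differ between NumPy versions. The sketch below checks only the properties that np.partition does guarantee (the names k, part and idx are chosen for illustration).

```python
import numpy as np

x = np.array([7, 2, 3, 1, 6, 5, 4])
k = 3

part = np.partition(x, k)

# The element at index k is the one that would sit there in a full sort.
print(part[k])                 # 4
# Everything before index k is smaller or equal, in arbitrary order.
print(sorted(part[:k]))        # [1, 2, 3]
# Everything from index k on is greater or equal, in arbitrary order.
print(np.all(part[k:] >= part[k]))   # True

# argpartition returns indices instead of values, like argsort.
idx = np.argpartition(x, k)
print(x[idx][k])               # 4
```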

21 One way to compute the squared distance between every pair of points:

dist_sq = np.sum((X[:,np.newaxis,:] - X[np.newaxis,:,:]) ** 2, axis=-1)
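
The one-liner above can be unpacked step by step. This sketch uses three made-up 2-D points (the values of X and the names diff and loop_result are assumptions for illustration) and compares the broadcast result with an explicit double loop.

```python
import numpy as np

# Three hypothetical points in the plane, one per row.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# (3, 1, 2) - (1, 3, 2) broadcasts to (3, 3, 2):
# diff[i, j] is the coordinate difference between point i and point j.
diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]
dist_sq = np.sum(diff ** 2, axis=-1)   # sum over the coordinate axis
print(dist_sq)
# [[  0.  25. 100.]
#  [ 25.   0.  25.]
#  [100.  25.   0.]]

# The same result with explicit loops, for comparison.
n = X.shape[0]
loop_result = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        loop_result[i, j] = np.sum((X[i] - X[j]) ** 2)
print(np.array_equal(dist_sq, loop_result))   # True
```

Note that the result holds squared distances; np.sqrt(dist_sq) gives the Euclidean distances themselves.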

22 Structured data: NumPy's structured arrays:

data = np.zeros(4, dtype={'names':('name', 'age', 'weight'),
                          'formats':('U10', 'i4', 'f8')})
print(data.dtype)
[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')]
In[5]: name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]
data['name'] = name
data['age'] = age
data['weight'] = weight
print(data)
[('Alice', 25, 55.0) ('Bob', 45, 85.5) ('Cathy', 37, 68.0)
('Doug', 19, 61.5)]
In[9]: # Get the names of people younger than 30
data[data['age'] < 30]['name']
Out[9]: array(['Alice', 'Doug'],
dtype='<U10')


In[14]: tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])
X = np.zeros(1, dtype=tp)
print(X[0])
print(X['mat'][0])
(0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]


In[15]: data['age']
Out[15]: array([25, 45, 37, 19], dtype=int32)
In[16]: data_rec = data.view(np.recarray)
data_rec.age
Out[16]: array([25, 45, 37, 19], dtype=int32)