004 Common NumPy operations

Latest recommended article published on 2024-07-26 17:36:26

溪山客

Category: python常用操作记录 (Common Python Operations Notes); Tags: numpy, python

Article link: https://blog.csdn.net/IWYH123/article/details/138503029

Basic NumPy usage

NumPy array objects

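NumPy's multidimensional array type is called ndarray, and it is the array object you will use most often. An ndarray object consists of two parts:
• the ndarray data itself
• metadata that describes the data

Advantages of NumPy arrays

• The elements of a NumPy array normally share one type. Because the element type is known and uniform, the amount of storage the data needs can be determined quickly.
• NumPy arrays support vectorized operations that process the whole array at once, which is fast; a Python list usually has to be traversed with an explicit loop, which is comparatively slow.
• NumPy uses an optimized C API, so its operations run quickly.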
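
To make the vectorization point concrete, here is a minimal sketch that squares the same numbers once with a Python list and once with an array. The variable names are only illustrative.

```python
import numpy as np

values = list(range(1, 6))

# Plain Python: every element is visited explicitly (here by a comprehension).
squares_list = [v * v for v in values]

# NumPy: one vectorized expression operates on the whole array at once.
arr = np.array(values)
squares_arr = arr * arr

print(squares_list)  # [1, 4, 9, 16, 25]
print(squares_arr)   # [ 1  4  9 16 25]
print(arr.dtype)     # one integer dtype shared by every element
```

Both versions give the same numbers; the array version simply pushes the loop down into NumPy's compiled code.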

Creating ndarray arrays

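Start by importing the numpy library. The usual alias is "np", which is also the convention the NumPy project recommends. You can pick another alias or write numpy in full, but "np" is still the better choice because it keeps your code consistent with what most people write.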

import numpy as np

# 1-D arrays

# From a list
arr1 = np.array([1,2,3,4])
print(arr1)
# Result  [1 2 3 4]

# From a tuple
arr_tuple = np.array((1,2,3,4))
print(arr_tuple)
# Result  [1 2 3 4]

# 2-D array (2*3)
arr2 = np.array([[1,2,4],[3,4,5]])
print(arr2)
# Result
# [[1 2 4]
#  [3 4 5]]

# Using np.arange

# 1-D array
arr1 = np.arange(5)
print(arr1)
# Result
# [0 1 2 3 4]

# 2-D array
arr2 = np.array([np.arange(3),np.arange(3)])
print(arr2)
# Result
# [[0 1 2]
#  [0 1 2]]

# Building multidimensional arrays with arange and reshape

# Create a 3-D array
arr = np.arange(24).reshape(2,3,4)
print(arr)
# Result
# [[[ 0  1  2  3]
#   [ 4  5  6  7]
#   [ 8  9 10 11]]
#
#  [[12 13 14 15]
#   [16 17 18 19]
#   [20 21 22 23]]]

NumPy numeric types

print(np.int8(12.334))  # 12
print(np.float64(12))  # 12.0
print(np.float64(True))  # 1.0 (np.float was removed in NumPy 1.24; use float or np.float64)
print(bool(1))  # True

a = np.arange(5,dtype=float)
print(a)
# Result
# [0. 1. 2. 3. 4.]

ndarray attributes

print(np.arange(4,dtype=float))
# Result  [0. 1. 2. 3.]

# 'D' denotes the complex type
print(np.arange(4,dtype='D'))
# Result  [0.+0.j 1.+0.j 2.+0.j 3.+0.j]

np.array([1.22,3.45,6.779],dtype='int8')
# Result  array([1, 3, 6], dtype=int8)

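# ndim attribute: the number of array dimensions
a = np.array([[1,2,3],[7,8,9]])
print(a.ndim)
# Result 2

# shape attribute: the size of the array along each dimension; for a matrix, n rows by m columns. shape is a tuple
print(a.shape)
# Result  (2, 3)


# size attribute: the total number of elements, i.e. n*m for the shape above
print(a.size)
# Result 6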

# T attribute: the transpose of the array
b = np.arange(24).reshape(4,6)
b
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

b.T
# Result
array([[ 0,  6, 12, 18],
       [ 1,  7, 13, 19],
       [ 2,  8, 14, 20],
       [ 3,  9, 15, 21],
       [ 4, 10, 16, 22],
       [ 5, 11, 17, 23]])
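
Besides ndim, shape, size and T, an ndarray also carries dtype, itemsize and nbytes. The short sketch below fixes the dtype to int64 explicitly so the byte counts are the same on every platform; that choice is an assumption made for the example.

```python
import numpy as np

a = np.array([[1, 2, 3], [7, 8, 9]], dtype=np.int64)

print(a.dtype)     # int64
print(a.itemsize)  # 8  (bytes per element)
print(a.nbytes)    # 48 (= a.size * a.itemsize)
```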

Slicing and indexing ndarray arrays

# Slicing and indexing a 1-D array works much like indexing a Python list.

a = np.arange(7)
print(a)
# Result [0 1 2 3 4 5 6]


print(a[1:4])
# Result  [1 2 3]


# Take every second element (step of 2)
a[:6:2]
# Result  array([0, 2, 4])


a[::2]
# Result  array([0, 2, 4, 6])

# Slicing and indexing a 2-D array

b = np.arange(12).reshape(3,4)
b
# Result
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])


b[0:3,0:2]
# Result
array([[0, 1],
       [4, 5],
       [8, 9]])
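
A few more 2-D indexing forms are worth trying out, along with one behaviour that often surprises people: a basic slice is a view onto the original data, not a copy. The sketch below uses the same 3*4 array; the names sub and safe are only illustrative.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

print(b[1])     # second row: [4 5 6 7]
print(b[:, 1])  # second column: [1 5 9]
print(b[1, 2])  # single element: 6

# A basic slice is a view: writing through it changes the original array.
sub = b[0:2, 0:2]
sub[0, 0] = 100
print(b[0, 0])  # 100

# Use .copy() when an independent array is needed.
safe = b[0:2, 0:2].copy()
safe[0, 0] = -1
print(b[0, 0])  # still 100
```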

Working with array shapes

Changing shape

reshape()

b
# Result
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])


b.reshape(4,3)
# Result
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

resize()

b
# Result
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

b.resize(4,3)
b
# Result
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

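The resize() method does the same job as reshape(), but it modifies the array it is called on in place instead of returning a new array.

• ravel() and flatten() turn a multidimensional array into a 1-D array, as follows: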

b.ravel()
# Result
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

b.flatten()
# Result
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
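
ravel() and flatten() print the same thing, but they differ in memory behaviour: flatten() always returns a copy, while ravel() returns a view whenever it can. A small sketch on a separate array m (a name chosen just for this example) shows the difference.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

flat_view = m.ravel()    # a view here, because m is contiguous in memory
flat_copy = m.flatten()  # always a copy

flat_view[0] = 99  # visible through m
flat_copy[1] = -1  # not visible through m

print(m)
# [[99  1  2]
#  [ 3  4  5]]
print(np.shares_memory(m, flat_view))  # True
print(np.shares_memory(m, flat_copy))  # False
```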

b.shape = (2,6)
b
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

• Transpose: the T attribute described earlier transposes an array; the transpose() function does the same thing.

b.transpose()
# Result
array([[ 0,  6],
       [ 1,  7],
       [ 2,  8],
       [ 3,  9],
       [ 4, 10],
       [ 5, 11]])

Stacking arrays

b
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

c = b*2
c
# Result
array([[ 0,  2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20, 22]])

Horizontal stacking: hstack()

np.hstack((b,c))
# Result
array([[ 0,  1,  2,  3,  4,  5,  0,  2,  4,  6,  8, 10],
       [ 6,  7,  8,  9, 10, 11, 12, 14, 16, 18, 20, 22]])

Vertical stacking: vstack()

np.vstack((b,c))
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [ 0,  2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20, 22]])

concatenate()

The concatenate() function chooses the stacking direction through the axis argument:
axis=1 stacks horizontally
axis=0 stacks vertically

np.concatenate((b,c),axis=1)
# Result
array([[ 0,  1,  2,  3,  4,  5,  0,  2,  4,  6,  8, 10],
       [ 6,  7,  8,  9, 10, 11, 12, 14, 16, 18, 20, 22]])


np.concatenate((b,c),axis=0)
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [ 0,  2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20, 22]])
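
For 2-D arrays like these, hstack() and vstack() give the same results as concatenate() with axis=1 and axis=0. The check below rebuilds b and c so that it runs on its own.

```python
import numpy as np

b = np.arange(12).reshape(2, 6)
c = b * 2

horizontal = np.concatenate((b, c), axis=1)
vertical = np.concatenate((b, c), axis=0)

print(np.array_equal(np.hstack((b, c)), horizontal))  # True
print(np.array_equal(np.vstack((b, c)), vertical))    # True
print(horizontal.shape)  # (2, 12)
print(vertical.shape)    # (4, 6)
```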

Depth stacking: dstack()

arr_dstack = np.dstack((b,c))
# print(arr_dstack.shape)  # (2, 6, 2)
arr_dstack
# Result
array([[[ 0,  0],
        [ 1,  2],
        [ 2,  4],
        [ 3,  6],
        [ 4,  8],
        [ 5, 10]],

       [[ 6, 12],
        [ 7, 14],
        [ 8, 16],
        [ 9, 18],
        [10, 20],
        [11, 22]]])

Splitting arrays

Like stacking, splitting comes in horizontal, vertical, and depth variants. The functions involved are hsplit(), vsplit(), dsplit(), and split().

b
# Result
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

• Split horizontally (axis=1)
np.hsplit(b,2)
# Result
[array([[0, 1, 2],
        [6, 7, 8]]),
 array([[ 3,  4,  5],
        [ 9, 10, 11]])]

• Split vertically (axis=0)
np.vsplit(b,2)
# Result
[array([[0, 1, 2, 3, 4, 5]]), array([[ 6,  7,  8,  9, 10, 11]])]

• Depth split
arr_dstack
# Result
array([[[ 0,  0],
        [ 1,  2],
        [ 2,  4],
        [ 3,  6],
        [ 4,  8],
        [ 5, 10]],

       [[ 6, 12],
        [ 7, 14],
        [ 8, 16],
        [ 9, 18],
        [10, 20],
        [11, 22]]])


np.dsplit(arr_dstack,2)
# Result
[array([[[ 0],
         [ 1],
         [ 2],
         [ 3],
         [ 4],
         [ 5]],
 
        [[ 6],
         [ 7],
         [ 8],
         [ 9],
         [10],
         [11]]]),
 array([[[ 0],
         [ 2],
         [ 4],
         [ 6],
         [ 8],
         [10]],
 
        [[12],
         [14],
         [16],
         [18],
         [20],
         [22]]])]
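
The general split() function mentioned at the start of this section is not demonstrated above, so here is a rough sketch of it. It takes an axis argument and accepts either a number of equal parts or a list of indices at which to cut.

```python
import numpy as np

b = np.arange(12).reshape(2, 6)

# Equal split into 2 parts along axis=1: same result as np.hsplit(b, 2).
left, right = np.split(b, 2, axis=1)
print(left.shape, right.shape)  # (2, 3) (2, 3)

# Cut at column indices 2 and 5: columns [0:2], [2:5] and [5:6].
parts = np.split(b, [2, 5], axis=1)
print([p.shape for p in parts])  # [(2, 2), (2, 3), (2, 1)]
```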

Converting array types

• Convert an array to a list with tolist()
b = np.array([[0,1,20,3,4,5],
             [6,7,8,9,10,11]])

b
# Result
array([[ 0,  1, 20,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

b.tolist()
# Result
[[0, 1, 20, 3, 4, 5], [6, 7, 8, 9, 10, 11]]

• Convert to a given type with astype()
b.astype(float)
# Result
array([[ 0.,  1., 20.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10., 11.]])
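
Going the other way, from float to int, deserves a little care: astype() drops the fractional part rather than rounding, and it returns a new array without touching the original. A small sketch with made-up values:

```python
import numpy as np

f = np.array([1.9, -1.9, 2.5])
i = f.astype(int)

print(i)  # [ 1 -1  2]  (fraction dropped, no rounding)
print(f)  # the original float array is unchanged

# Round first when nearest-integer behaviour is wanted.
# np.rint rounds halves to the nearest even value, so 2.5 becomes 2.
print(np.rint(f).astype(int))  # [ 2 -2  2]
```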

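Common NumPy statistical functions

The commonly used functions are listed below. Pass axis to compute along one axis; if axis is omitted, the statistic is computed over the whole array.
• np.sum(): sum
• np.mean(): mean
• np.max(): maximum
• np.min(): minimum
• np.ptp(): range along the given axis, i.e. max - min
• np.std(): standard deviation
• np.var(): variance
• np.cumsum(): cumulative sum
• np.cumprod(): cumulative product

b
# Result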
array([[ 0,  1, 20,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

np.max(b)
# Result
20


# Maximum along axis=1 (one value per row)
np.max(b,axis=1)
# Result
array([20, 11])


# Maximum along axis=0 (one value per column)
np.max(b,axis=0)
# Result
array([ 6,  7, 20,  9, 10, 11])


np.min(b)
# Result
0


• np.ptp() returns the maximum minus the minimum; with no axis it covers the whole array:
np.ptp(b)
# Result
20


np.ptp(b,axis=0)
# Result
array([ 6,  6, 12,  6,  6,  6])


np.ptp(b,axis=1)
# Result
array([20, 5])

• np.cumsum(): cumulative sum along the given axis

b
# Result
array([[ 0,  1, 20,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])



b.resize(4,3)
b
# Result
array([[ 0,  1, 20],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])


np.cumsum(b,axis=0)
# Result
array([[ 0,  1, 20],
       [ 3,  5, 25],
       [ 9, 12, 33],
       [18, 22, 44]], dtype=int32)


np.cumsum(b,axis=1)
# Result
array([[ 0,  1, 21],
       [ 3,  7, 12],
       [ 6, 13, 21],
       [ 9, 19, 30]], dtype=int32)

• np.cumprod(): cumulative product along the given axis

np.cumprod(b,axis=1)
# Result
array([[  0,   0,   0],
       [  3,  12,  60],
       [  6,  42, 336],
       [  9,  90, 990]], dtype=int32)


np.cumprod(b,axis=0)
# Result
array([[   0,    1,   20],
       [   0,    4,  100],
       [   0,   28,  800],
       [   0,  280, 8800]], dtype=int32)
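
The remaining functions from the list at the top of this section (sum, mean, std, var) follow the same axis convention. Here is a quick sketch on the 2*6 version of the array, rebuilt under the name stats so it does not disturb b:

```python
import numpy as np

stats = np.array([[0, 1, 20, 3, 4, 5],
                  [6, 7, 8, 9, 10, 11]])

print(np.sum(stats))                # 84
print(np.sum(stats, axis=1))        # [33 51]  (one value per row)
print(np.sum(stats, axis=0).shape)  # (6,)     (one value per column)
print(np.mean(stats, axis=1))       # [5.5 8.5]

# The variance is the square of the standard deviation.
print(np.isclose(np.var(stats), np.std(stats) ** 2))  # True
```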

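Array broadcasting

When an array takes part in arithmetic with a scalar, the scalar is stretched to the array's shape before the operation is carried out. This stretching is called broadcasting.

b
# Result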
array([[ 0,  1, 20],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])


d = b+2
d
# Result
array([[ 2,  3, 22],
       [ 5,  6,  7],
       [ 8,  9, 10],
       [11, 12, 13]])
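
Broadcasting is not limited to scalars. It also applies between arrays whose shapes are compatible, for example a (4, 3) array with a (3,) row or a (4, 1) column. The sketch below uses a separate array g and made-up row and col values:

```python
import numpy as np

g = np.arange(12).reshape(4, 3)
row = np.array([10, 20, 30])          # shape (3,)
col = np.array([[1], [2], [3], [4]])  # shape (4, 1)

print((g + row).shape)  # (4, 3): row is added to every row of g
print((g + col).shape)  # (4, 3): col is added to every column of g
print(g + row)
# [[10 21 32]
#  [13 24 35]
#  [16 27 38]
#  [19 30 41]]
```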

A brief introduction to NumPy's random functions

import numpy as np

numpy.random.rand()

numpy.random.rand(d0,d1,…,dn)
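• rand draws uniformly distributed samples from [0, 1), including 0 and excluding 1, in the given shape
• d0, ..., dn are the sizes of the dimensions
• returns an array of the given shape

np.random.rand(4,2)
# Result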
array([[0.09768178, 0.53919884],
       [0.66230757, 0.98809278],
       [0.26719656, 0.69092275],
       [0.59863283, 0.99759447]])


np.random.rand(4,3,2)  # shape: 4*3*2
# Result
array([[[0.20348874, 0.6913323 ],
        [0.8406293 , 0.08108662],
        [0.06062108, 0.47553319]],

       [[0.66196658, 0.18232654],
        [0.0323628 , 0.89937112],
        [0.09199035, 0.46579061]],

       [[0.10725848, 0.19048667],
        [0.04728811, 0.07922681],
        [0.31872204, 0.64774   ]],

       [[0.16946176, 0.66986772],
        [0.15047562, 0.83602738],
        [0.04785992, 0.56250968]]])

numpy.random.randn()

numpy.random.randn(d0,d1,…,dn)
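• randn returns one sample or an array of samples drawn from the standard normal distribution.
• d0, ..., dn are the sizes of the dimensions
• returns an array of the given shape

np.random.randn()  # with no arguments, returns a single value
# Result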
-0.20021668751538962


b = np.random.randn(2,4)
b
# Result
array([[-9.45657675e-01, -1.43117477e+00,  1.24388607e+00,
         2.50059701e-01],
       [-1.05238226e+00, -3.96157660e-02, -1.21050545e-01,
         7.27678675e-04]])
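
randn only covers the standard normal distribution (mean 0, standard deviation 1). A common way to get a normal distribution with another mean and spread is to scale and shift the samples. The values of mu and sigma below are arbitrary and only for illustration.

```python
import numpy as np

np.random.seed(0)     # fixed seed so the run is repeatable
mu, sigma = 3.0, 2.0  # illustrative mean and standard deviation

samples = mu + sigma * np.random.randn(100_000)

print(samples.mean())  # close to 3
print(samples.std())   # close to 2
```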

numpy.random.randint()

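numpy.random.randint(low, high=None, size=None, dtype='l')
• Returns random integers from [low, high), including low and excluding high
• Parameters: low is the lower bound, high is the upper bound, size is the output shape, and dtype is the data type, which defaults to NumPy's default integer type (the old alias np.int was removed in NumPy 1.24)
• When high is omitted, the range is [0, low)

np.random.randint(1,size=5)  # integers in [0, 1), so only 0 is possible
# Result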
array([0, 0, 0, 0, 0])


np.random.randint(1,5)  # one random integer in [1, 5)
# Result
3


np.random.randint(-5,5,size=(2,2))
# Result
array([[-5, -1],
       [-5, -4]])

numpy.random.random_integers

numpy.random.random_integers(low, high=None, size=None)
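• Returns random integers from [low, high], including both low and high
• Parameters: low is the lower bound, high is the upper bound, size is the output shape
• When high is omitted, the range is [1, low]
• This function has been superseded in recent NumPy versions; use randint instead

np.random.random_integers(1,size=5)
# Result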
array([1, 1, 1, 1, 1])
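
Since randint excludes its upper bound, the closed interval [low, high] that random_integers produced can be had by passing high + 1. A minimal sketch, using a six-sided die as a made-up example:

```python
import numpy as np

np.random.seed(0)
low, high = 1, 6  # illustrative bounds: a six-sided die

# randint excludes the upper bound, so add 1 to include high.
rolls = np.random.randint(low, high + 1, size=1000)

print(rolls.min(), rolls.max())  # both stay within [1, 6]
```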

Generating floats in [0, 1)

• numpy.random.random_sample(size=None)
• numpy.random.random(size=None)
• numpy.random.ranf(size=None)
• numpy.random.sample(size=None)

print("------------random_sample-------------")
print(np.random.random_sample(size=(2,2)))
print("--------random--------")
print(np.random.random(size=(2,2)))


# Result
------------random_sample-------------
[[0.24505128 0.51575767]
 [0.59361235 0.3049443 ]]
--------random--------
[[0.21149953 0.0691251 ]
 [0.05216152 0.08806123]]

numpy.random.choice()

numpy.random.choice(a, size=None, replace=True, p=None)
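• Draws random samples from a given 1-D array
• Parameters: a is a 1-D array-like or an integer; size is the output shape; replace controls whether a value may be drawn more than once; p gives the probability of each element of a
• When a is an integer, the samples are drawn from np.arange(a)

np.random.choice(5,3)
# Result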
array([2, 1, 2])


# With replace=False, the samples contain no repeated values
np.random.choice(5,3,replace=False)
# Result
array([2, 1, 4])
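
The p parameter is described above but not shown in use. Below is a small sketch with a made-up array of colour names; p needs one probability per element of a, and the probabilities must sum to 1.

```python
import numpy as np

np.random.seed(0)
colors = np.array(["red", "green", "blue"])  # illustrative data

# All the probability mass on "blue": every draw is "blue".
draws = np.random.choice(colors, size=10, p=[0.0, 0.0, 1.0])
print(draws)

# Weighted draws: about 70% of the samples should be "red".
weighted = np.random.choice(colors, size=10_000, p=[0.7, 0.2, 0.1])
print(np.mean(weighted == "red"))  # roughly 0.7
```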

numpy.random.seed()

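• np.random.seed() makes the random data reproducible
• With the same seed, the same random numbers are generated on every run. Without a seed, each run produces different numbers

np.random.seed(0)
np.random.rand(5)
# Result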
array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ])



np.random.seed(1676)
np.random.rand(5)
# Result
array([0.39983389, 0.29426895, 0.89541728, 0.71807369, 0.3531823 ])
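
To see the reproducibility for yourself, reset the seed and draw again. The sketch below also shows the newer Generator interface, np.random.default_rng(), which NumPy's documentation recommends for new code; the seed 42 is arbitrary.

```python
import numpy as np

# Legacy interface: resetting the seed replays the same sequence.
np.random.seed(0)
first = np.random.rand(5)
np.random.seed(0)
second = np.random.rand(5)
print(np.array_equal(first, second))  # True

# Generator interface: each generator carries its own seeded state.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(np.array_equal(rng_a.random(5), rng_b.random(5)))  # True

ints = rng_a.integers(1, 5, size=3)  # integers in [1, 5)
print(ints)
```

With a Generator the random state lives in an object instead of a module-level global, so separate parts of a program can each have their own reproducible stream.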