NumPy (3): Basic Operations

Latest recommended article published on 2022-03-18 10:41:12

leotongxue1234

Categories: NumPy, Data Analysis. Tags: NumPy

Article link: https://blog.csdn.net/zeroooorez/article/details/100736570


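a. Arithmetic sequence arrays

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
Returns evenly spaced numbers over the specified interval as an array, i.e. an arithmetic sequence.

start - the first value of the sequence.

stop - the last value of the sequence (excluded when endpoint=False).

num - the number of samples to generate; defaults to 50.

endpoint - whether stop is included; defaults to True, which makes stop the last sample. If False, stop is not included.

retstep - controls the form of the return value; defaults to False, which returns only the array of samples. If True, returns the tuple (samples, step), where step is the spacing between samples.

dtype - the data type of the result; defaults to None, in which case it is inferred from start and stop (the inferred type is always floating point, never an integer type).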

import numpy as np

a = np.linspace(1, 10, 5, endpoint=True)
print(a)  # [ 1.    3.25  5.5   7.75 10.  ]
b = np.linspace(1, 10, 5, endpoint=False)
print(b)  # [1.  2.8 4.6 6.4 8.2]
c = np.linspace(1, 10, 5, retstep=False)
print(c)  # [ 1.    3.25  5.5   7.75 10.  ]
d = np.linspace(1, 10, 5, retstep=True)
print(d)  # (array([ 1.  ,  3.25,  5.5 ,  7.75, 10.  ]), 2.25)
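The spacing follows directly from the parameters: with endpoint=True the step is (stop - start) / (num - 1), and with endpoint=False it is (stop - start) / num. A minimal sketch checking this against the values above (the variable names are illustrative only):

```python
import numpy as np

start, stop, num = 1, 10, 5

# endpoint=True: num samples span the closed interval [start, stop]
samples, step = np.linspace(start, stop, num, retstep=True)
print(step)                                  # 2.25
print(step == (stop - start) / (num - 1))    # True

# endpoint=False: stop is excluded, so the interval is split into num parts
samples_open, step_open = np.linspace(start, stop, num, endpoint=False, retstep=True)
print(step_open)                             # 1.8
```

Asking for retstep=True is a cheap way to confirm the spacing instead of recomputing it by hand.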

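numpy.arange([start, ]stop[, step], dtype=None)
Returns evenly spaced values in the half-open interval [start, stop); start defaults to 0 and step defaults to 1.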

np.arange(3)
array([0, 1, 2])

np.arange(1,12,2)
array([ 1,  3,  5,  7,  9, 11])

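For integer steps, numpy.arange() is the usual choice.
For non-integer steps, linspace() is the safer option, because floating-point rounding can change the number of elements that arange() produces.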
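The difference matters because arange derives its element count from (stop - start) / step, which is sensitive to floating-point rounding, while linspace takes the count as an argument. A small sketch of the two approaches, plus one common workaround (scaling an integer arange); the specific numbers are only an illustration:

```python
import numpy as np

# Integer step: arange is exact and excludes stop
ints = np.arange(1, 12, 2)
print(ints)                 # [ 1  3  5  7  9 11]

# Fractional step: the element count depends on floating-point rounding,
# so the result may contain one element more or fewer than expected
risky = np.arange(1, 1.3, 0.1)
print(len(risky))

# linspace fixes the number of elements up front
safe = np.linspace(1, 1.3, 4)
print(len(safe))            # 4

# Alternative: scale an integer arange to get a fractional step
scaled = 1 + 0.1 * np.arange(4)
print(len(scaled))          # 4
```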

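b. Geometric sequence arrays

numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None)
start and stop are exponents of the base (10 by default): 0 stands for 10 to the power 0, and 9 stands for 10 to the power 9.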

np.logspace(0,0,10)
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

np.logspace(0,9,10)
array([  1.00000000e+00,   1.00000000e+01,   1.00000000e+02,
         1.00000000e+03,   1.00000000e+04,   1.00000000e+05,
         1.00000000e+06,   1.00000000e+07,   1.00000000e+08,
         1.00000000e+09])
         
# Change the base by passing base
np.logspace(0,9,10,base=2)
array([   1.,    2.,    4.,    8.,   16.,   32.,   64.,  128.,  256.,  512.])
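One way to read logspace is as linspace applied to the exponents, followed by raising the base to those exponents. A minimal sketch of that equivalence (variable names are illustrative):

```python
import numpy as np

exponents = np.linspace(0, 9, 10)          # 0, 1, ..., 9
powers_of_two = np.logspace(0, 9, 10, base=2)

# logspace(start, stop, num, base=b) matches b ** linspace(start, stop, num)
print(np.allclose(powers_of_two, 2.0 ** exponents))   # True

# endpoint=False leaves out base ** stop, just as in linspace
decades = np.logspace(0, 3, 3, endpoint=False)
print(decades)                                        # [  1.  10. 100.]
```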

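c. Generating arrays of random samples

The np.random module

Method	Description
np.random.rand(d0, d1, …, dn)	Returns uniformly distributed samples in [0.0, 1.0) with shape (d0, d1, …, dn)
np.random.randn(d0, d1, …, dn)	Returns samples from the standard normal distribution with shape (d0, d1, …, dn)
np.random.randint(low[, high, size])	Returns random integers from the half-open interval [low, high)
np.random.uniform(low=0.0, high=1.0, size=None)	Draws samples from a uniform distribution over [low, high); note that the interval is closed on the left and open on the right
np.random.normal(loc=0.0, scale=1.0, size=None)	Draws samples from a normal distribution with mean loc (float) and standard deviation scale; the larger scale is, the flatter and wider the curve, and the smaller it is, the taller and narrower
np.random.standard_normal(size=None)	Returns an array of the given shape drawn from the standard normal distribution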

np.random.rand(2,3)
array([[0.90929803, 0.99886158, 0.33523116],
       [0.07327932, 0.69353838, 0.21490553]])

np.random.randn(2,3)
array([[ 0.74864991,  1.05329894,  0.02006168],
       [ 0.6431396 , -3.16055372, -0.735806  ]])

np.random.randint(2,10,(2,3))
array([[8, 5, 4],
       [9, 3, 4]])

np.random.uniform(1,10,(2,3))
array([[4.05985597, 7.03859559, 4.11069599],
       [9.00901488, 1.41927401, 5.75741741]])
       
np.random.normal(1,1,(2,3))
array([[-0.857967  , -0.31640625,  0.08087909],
       [ 1.33571146, -0.90449108,  3.05647425]])

np.random.standard_normal(2) 
array([0.54376329, 1.78806168])
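Because these functions draw from a global random state, the outputs above change from run to run. A minimal sketch of making them repeatable with np.random.seed, and of checking the shapes and ranges listed in the table (the seed values are arbitrary):

```python
import numpy as np

np.random.seed(42)                      # fix the global random state
first = np.random.rand(2, 3)

np.random.seed(42)                      # same seed, same sequence
second = np.random.rand(2, 3)
print(np.array_equal(first, second))    # True

ints = np.random.randint(2, 10, (2, 3))
print(ints.shape)                       # (2, 3)
print(ints.min() >= 2, ints.max() < 10) # True True

# normal(loc, scale) is loc + scale * standard normal
np.random.seed(0)
a = np.random.normal(1, 2, 5)
np.random.seed(0)
b = 1 + 2 * np.random.standard_normal(5)
print(np.allclose(a, b))                # True
```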

d. Array indexing and slicing

Basic indexing and slicing

Example 1:

data3 = np.random.normal(0, 1, (3,4))
data3

array([[-0.5529426 ,  0.73653997, -0.05788531, -0.13154318],
       [-0.03079838,  1.33797554, -1.38323937,  2.05711507],
       [ 0.57172354,  1.23990202, -1.04270338,  1.09079945]])

data3[0,0:3]  # half-open range: includes index 0, excludes index 3

array([-0.5529426 , 0.73653997, -0.05788531])

Example 2:
In a multidimensional array, if you omit the trailing indices, the returned object is an ndarray with fewer dimensions.

data4 = np.random.randint(2,20,(3,3,4))
print(data4)
print(data4[1,1,1])
print(data4[0])

[[[13  4 11 10]
  [ 5 13  8 14]
  [11  4 11  6]]

 [[ 8 16 13 19]
  [ 9  6 18  9]
  [ 7  5  4 19]]

 [[ 3 18  4 15]
  [17  5 13 19]
  [10  7  3 17]]]

6

[[13  4 11 10]
 [ 5 13  8 14]
 [11  4 11  6]]

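A slice only gives a view of the array. To get a copy of an ndarray slice, you must copy it explicitly, e.g. data[:].copy().

data_old = data4[0].copy()   # explicit copy, unaffected by later writes to data4
data4[0] = 66                # broadcasts 66 into every element of data4[0]
print(data4)
print("===============================")
data4[0] = data_old          # restore the original values from the copy
print(data4)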

[[[66 66 66 66]
  [66 66 66 66]
  [66 66 66 66]]

 [[ 8 16 13 19]
  [ 9  6 18  9]
  [ 7  5  4 19]]

 [[ 3 18  4 15]
  [17  5 13 19]
  [10  7  3 17]]]
===============================
[[[13  4 11 10]
  [ 5 13  8 14]
  [11  4 11  6]]

 [[ 8 16 13 19]
  [ 9  6 18  9]
  [ 7  5  4 19]]

 [[ 3 18  4 15]
  [17  5 13 19]
  [10  7  3 17]]]
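The view-versus-copy distinction can be checked directly with np.shares_memory. A small sketch on a fresh array (the names data, view, and copy are illustrative and not part of the example above):

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

view = data[0]            # basic indexing/slicing returns a view
copy = data[0].copy()     # explicit copy owns its own memory

print(np.shares_memory(data, view))   # True
print(np.shares_memory(data, copy))   # False

view[:] = 66              # writes through to data
print(data[0])            # [66 66 66 66]
print(copy)               # [0 1 2 3]
```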

Boolean indexing

data1 = np.array(['a','b','a','c','b','d'])
data2 = np.arange(24).reshape(6,4)
print(data2)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]
print(data1 == 'a')
[ True False  True False False False]
print(data2[data1 == 'a'])  # the length of data1 must equal the number of rows of data2 (its first axis)
[[ 0  1  2  3]
 [ 8  9 10 11]]
print(data2[data1 == 'a',2:])
[[ 2  3]
 [10 11]]
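Several conditions can be combined into one mask. A minimal sketch, assuming the same kind of label array and 6x4 value array as above (renamed names and values here to keep the block self-contained):

```python
import numpy as np

names = np.array(['a', 'b', 'a', 'c', 'b', 'd'])
values = np.arange(24).reshape(6, 4)

# Combine masks with & (and), | (or), ~ (not); Python's and/or do not work on arrays
mask = (names == 'a') | (names == 'c')
print(values[mask])
# [[ 0  1  2  3]
#  [ 8  9 10 11]
#  [12 13 14 15]]

print(values[~mask].shape)    # (3, 4)
```

The parentheses around each comparison are needed because | and & bind more tightly than ==.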

Values can be assigned through a boolean array; the example below uses float data, but this works for any dtype.

data2 = np.random.randn(6,4)
data2[data2 < 0] = 10
data2 
array([[10.        , 10.        , 10.        , 10.        ],
       [10.        , 10.        ,  0.09817786,  0.14597538],
       [ 1.16733605,  0.37248457,  0.36330872, 10.        ],
       [10.        ,  0.48706849,  0.03056452, 10.        ],
       [ 0.88857398, 10.        , 10.        , 10.        ],
       [ 1.1426662 , 10.        , 10.        , 10.        ]])
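Mask assignment modifies the array in place. When the original should be kept, np.where is one alternative that builds a new array. A short sketch on a small integer array (the array contents are made up for illustration):

```python
import numpy as np

ints = np.array([[-1, 2], [3, -4]])

# In-place assignment through a boolean mask works for an integer array too
clipped = ints.copy()
clipped[clipped < 0] = 0
print(clipped)
# [[0 2]
#  [3 0]]

# np.where builds a new array and leaves the input untouched
replaced = np.where(ints < 0, 0, ints)
print(np.array_equal(clipped, replaced))   # True
```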

Fancy indexing

Fancy indexing means indexing with arrays of integers.

arr = np.empty((8,4))
for i in range(8):
    arr[i] = i
arr
array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

To select a subset of rows in a particular order, pass a list or ndarray of integers.

arr[[4,3,0,6]]
array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

Negative indices select rows counting from the end.

arr[[-3,-5,-7]]
array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])

Passing multiple index arrays at once

print(arr)
print(arr[[1,5,7,2],[0,3,1,2]])  # selects the elements at (1,0), (5,3), (7,1), (2,2)

[[0. 0. 0. 0.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [4. 4. 4. 4.]
 [5. 5. 5. 5.]
 [6. 6. 6. 6.]
 [7. 7. 7. 7.]]
 
[1. 5. 7. 2.]
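Because every row of arr holds a single repeated value, the pairing of row and column indices is hard to see in the output above. A sketch on an array with distinct values (the name grid is illustrative), which also shows np.ix_ for selecting a full rectangular block instead of individual pairs:

```python
import numpy as np

grid = np.arange(32).reshape(8, 4)

# Two index arrays are paired element by element: (1,0), (5,3), (7,1), (2,2)
pairs = grid[[1, 5, 7, 2], [0, 3, 1, 2]]
print(pairs)              # [ 4 23 29 10]

# np.ix_ selects the full block of rows 1,5,7,2 and columns 0,3,1,2
block = grid[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])]
print(block.shape)        # (4, 4)

# Fancy indexing always returns a copy, not a view
print(np.shares_memory(grid, pairs))   # False
```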

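e. Changing the shape of an array

ndarray.reshape(shape[, order]): returns an array with the new shape; the original array is unchanged
ndarray.T: the transpose of the array
ndarray.resize(new_shape[, refcheck]): changes the shape of the array in place and returns None
ndarray.transpose((ax1, ax2, ax3)): permutes the axes according to a tuple of axis numbers
ndarray.swapaxes(ax1, ax2): swaps two of the array's n axes; the original array is unchanged
ndarray.flatten(): collapses the array to one dimension and returns the flattened 1-D array; the original array is unchanged

Example 1: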

a = np.arange(20)
print(a.reshape([4,5]))
print(a)
print(a.resize([4,5]))  # modifies the array in place and returns None
print(a)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

None
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
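Two details of reshape are worth a quick check: one dimension can be given as -1 and inferred from the total size, and for a contiguous array the result is a view that shares data with the original. A minimal sketch (the failing shape (3, 7) is chosen only to trigger the error):

```python
import numpy as np

a = np.arange(20)

# -1 lets NumPy infer one dimension from the total size
b = a.reshape(4, -1)
print(b.shape)                    # (4, 5)

# For a contiguous array, reshape returns a view, so writes are shared
b[0, 0] = 99
print(a[0])                       # 99

# A shape whose size does not match raises ValueError
try:
    a.reshape(3, 7)
except ValueError as err:
    print("reshape failed:", err)
```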

Example 2:

a.swapaxes(1,0)

array([[ 0,  5, 10, 15],
       [ 1,  6, 11, 16],
       [ 2,  7, 12, 17],
       [ 3,  8, 13, 18],
       [ 4,  9, 14, 19]])

Example 3:

a.flatten()

array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
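
flatten is often compared with ravel, which is not covered above: flatten always copies, while ravel returns a view when the memory layout allows it. A short sketch of the difference on a small array:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

flat = m.flatten()        # always a copy
rav = m.ravel()           # a view when the memory layout allows it

flat[0] = 100
print(m[0, 0])            # 0, the original is unchanged

rav[0] = 100
print(m[0, 0])            # 100, ravel shared memory with m here
```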

Example 4: axis permutation in three dimensions

data2 = np.arange(24).reshape((2,3,4))
data2

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

data2.transpose((1,0,2))
array([[[ 0,  1,  2,  3],
        [12, 13, 14, 15]],

       [[ 4,  5,  6,  7],
        [16, 17, 18, 19]],

       [[ 8,  9, 10, 11],
        [20, 21, 22, 23]]])

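The axis order changes from (0,1,2) to (1,0,2), i.e. axes 0 and 1 are swapped, so the element at coordinate (i,j,k) moves to (j,i,k). Coordinates of the elements before the swap:

[[[(0,0,0), (0,0,1), (0,0,2), (0,0,3)],
  [(0,1,0), (0,1,1), (0,1,2), (0,1,3)],
  [(0,2,0), (0,2,1), (0,2,2), (0,2,3)]],

 ...]

After the swap (each entry shows the element's original coordinate):

[[[(0,0,0), (0,0,1), (0,0,2), (0,0,3)],
  ...],

 [[(0,1,0), (0,1,1), (0,1,2), (0,1,3)],
  ...],

 [[(0,2,0), (0,2,1), (0,2,2), (0,2,3)],
  ...]]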
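
The coordinate rule can be verified element by element. A minimal sketch on the same 2x3x4 layout (the names x and y are illustrative):

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
y = x.transpose((1, 0, 2))        # swap axes 0 and 1
print(y.shape)                    # (3, 2, 4)

# Every element obeys y[j, i, k] == x[i, j, k]
ok = all(
    y[j, i, k] == x[i, j, k]
    for i in range(2) for j in range(3) for k in range(4)
)
print(ok)                         # True

# Swapping two axes is exactly what swapaxes does
print(np.array_equal(y, x.swapaxes(0, 1)))   # True
```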

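f. Removing duplicates from an array

numpy.unique(ar): returns the sorted unique elements of the array. It is a function in the numpy namespace; ndarray has no unique() method.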

a = np.array([[1, 2, 3, 4],[3, 4, 5, 6]])
np.unique(a)

array([1, 2, 3, 4, 5, 6])
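
np.unique can also report how often each value occurs, and with axis=0 it removes duplicate rows rather than duplicate elements. A short sketch (the second array is made up for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])

# The input is flattened and the result is sorted
values, counts = np.unique(a, return_counts=True)
print(values)     # [1 2 3 4 5 6]
print(counts)     # [1 1 2 2 1 1]

# axis=0 removes duplicate rows instead of duplicate elements
rows = np.unique(np.array([[1, 2], [1, 2], [3, 4]]), axis=0)
print(rows)
# [[1 2]
#  [3 4]]
```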