2021-07-07 Learning NumPy

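This article covers the basics of NumPy: creating arrays, data types, array arithmetic, broadcasting, reshaping, transposing, and indexing and slicing. It works through examples of arithmetic between arrays and numbers and between arrays, and looks at how NumPy handles the special values nan and inf. It also touches on NumPy's statistical functions, array concatenation, and random number generation.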

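Basics

Overview

What is NumPy

NumPy is a foundational library for scientific computing in Python, focused on numerical computation. Most of Python's scientific computing libraries are built on it, and it is mainly used for numerical operations on large, multidimensional arrays.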

Creating NumPy arrays

import numpy as np

t1 = np.array([1,2,3]) # from a list
t2 = np.array(range(10)) # from range()
t3 = np.arange(4,10,2) # with np.arange

The array class is numpy.ndarray.


Data types

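With large amounts of data, memory usage has to be considered, so a compact type such as uint8 can be worth choosing.
If no dtype is given, the default integer type depends on the platform.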
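
To make the memory point concrete, here is a small sketch that stores the same values with two dtypes and compares their size. The array size and the names `values`, `as_int64`, `as_uint8` are only for illustration. It also shows the price of the smaller type: uint8 holds 0..255 only, so array arithmetic wraps around.

```python
import numpy as np

# One million small values (0..255) stored with two different dtypes.
values = np.arange(1_000_000) % 256

as_int64 = values.astype(np.int64)
as_uint8 = values.astype(np.uint8)

print(as_int64.nbytes)  # 8 bytes per element
print(as_uint8.nbytes)  # 1 byte per element

# Caveat: uint8 only holds 0..255, so array arithmetic wraps modulo 256.
wrapped = np.array([250], dtype=np.uint8) + np.array([10], dtype=np.uint8)
print(wrapped)
```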

import numpy as np

t1 = np.array([1,2,3])
print(t1)
print(type(t1))
[1 2 3]
<class 'numpy.ndarray'>
t2 = np.array(range(10)) # from range()
print(t2)
[0 1 2 3 4 5 6 7 8 9]
t3 = np.arange(4,10,2) # np.arange takes the same arguments as range()

Data types


t4 = np.array(range(10),dtype='uint8')
print(t4)
print(t4.dtype)
[0 1 2 3 4 5 6 7 8 9]
uint8
t5 = np.array([1,0,1,0,1,0,1,1,1],dtype=bool)
print(t5)
print(t5.dtype)
[ True False  True False  True False  True  True  True]
bool
Changing the data type
t6 = t5.astype('uint8')
print(t6)
print(t6.dtype)
[1 0 1 0 1 0 1 1 1]
uint8

Rounding decimals in NumPy


import random
t7 = np.array([random.random() for i in range(10)])
print(t7)
print(t7.dtype)
[0.51172081 0.80610675 0.10256956 0.11558658 0.02495668 0.25243878
 0.80026223 0.88803383 0.41383825 0.33532935]
float64
t8 = np.round(t7,2)
print(t8)
print(t8.dtype)
[0.51 0.81 0.1  0.12 0.02 0.25 0.8  0.89 0.41 0.34]
float64

The shape of an array: the shape attribute

import numpy as np
t1 = np.arange(12)
print(t1)
print(t1.shape)
[ 0  1  2  3  4  5  6  7  8  9 10 11]
(12,)
t2 = np.array([[1,2,3,4],[5,6,7,8]]) # 2-D array
print(t2)
print(t2.shape)
[[1 2 3 4]
 [5 6 7 8]]
(2, 4)
t3 = np.array([[[1,2],[2,3]],[[1,2],[3,4]]]) # 3-D array
print(t3)
print(t3.shape)
[[[1 2]
  [2 3]]

 [[1 2]
  [3 4]]]
(2, 2, 2)

Changing the shape of an array: reshape

The new shape must hold exactly as many elements as the array has.

t4 = np.arange(10)
print(t4.reshape(2,5))
print(t4) # t4 itself is unchanged
[[0 1 2 3 4]
 [5 6 7 8 9]]
[0 1 2 3 4 5 6 7 8 9]
t5 = np.arange(24)
t5.reshape(2,3,4)
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
t5 = np.arange(12)
t5 = t5.reshape(2,3,2)
print(t5)
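# the next two are equivalent; both give a 1-D array
print(t5.reshape(12))
print(t5.reshape(-1)) # -1 lets NumPy work out the length when you do not know it
# the next two are also equivalent; both give a 2-D array
print(t5.reshape(12,1))
print(t5.reshape(-1,1))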
[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]
[ 0  1  2  3  4  5  6  7  8  9 10 11]
[ 0  1  2  3  4  5  6  7  8  9 10 11]
[[ 0]
 [ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]]
[[ 0]
 [ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]]
# another method: flatten
t5 = np.arange(12)
t5 = t5.reshape(2,3,2)
print(t5)
t5.flatten() # flatten into a 1-D array
[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
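
One thing the examples above do not show is whether the result shares memory with the original. The sketch below, with illustrative names `a`, `view` and `flat`, checks this: reshape of a contiguous array gives a view, while flatten always gives a copy. It also shows the error raised when the element count does not match.

```python
import numpy as np

a = np.arange(12)

# reshape of a contiguous array returns a view: both names share one buffer.
view = a.reshape(3, 4)
view[0, 0] = 100
print(a[0])  # the original sees the change

# flatten always returns a copy, so edits do not propagate back.
flat = view.flatten()
flat[1] = -1
print(a[1])  # unchanged

# A shape whose element count does not match raises ValueError.
try:
    a.reshape(5, 3)
except ValueError as exc:
    print("reshape failed:", exc)
```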

Computation with NumPy

t6 = np.arange(24)
t6 = t6.reshape(4,6)
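Array and number arithmetic: broadcasting

Under NumPy's broadcasting mechanism, an operation with a single number is applied to every element of the array.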

t6+2
array([[ 2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25]])
t6*2
array([[ 0,  2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20, 22],
       [24, 26, 28, 30, 32, 34],
       [36, 38, 40, 42, 44, 46]])
t6/2
array([[ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5],
       [ 3. ,  3.5,  4. ,  4.5,  5. ,  5.5],
       [ 6. ,  6.5,  7. ,  7.5,  8. ,  8.5],
       [ 9. ,  9.5, 10. , 10.5, 11. , 11.5]])
t6/0 # note: dividing by 0 only raises a warning; the result holds nan (not a number) and inf
D:\softerware\Anaconda3\envs\pytorch\lib\site-packages\ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
  """Entry point for launching an IPython kernel.
D:\softerware\Anaconda3\envs\pytorch\lib\site-packages\ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in true_divide
  """Entry point for launching an IPython kernel.

array([[nan, inf, inf, inf, inf, inf],
       [inf, inf, inf, inf, inf, inf],
       [inf, inf, inf, inf, inf, inf],
       [inf, inf, inf, inf, inf, inf]])
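
If the warnings are expected, they can be silenced locally and the special values located afterwards. This is a minimal sketch; replacing non-finite entries with 0 is just one possible policy, chosen here for illustration.

```python
import numpy as np

t = np.arange(6).reshape(2, 3)

# Silence the divide-by-zero and invalid-value warnings for this block only.
with np.errstate(divide="ignore", invalid="ignore"):
    result = t / 0

print(np.isnan(result))  # True only where 0/0 produced nan
print(np.isinf(result))  # True where a nonzero value was divided by 0

# Replace every non-finite entry with 0 (one possible policy among several).
cleaned = np.where(np.isfinite(result), result, 0.0)
print(cleaned)
```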
Array and array arithmetic, same shape: computed element by element
t7 = np.arange(24)
t7 = t7.reshape(4,6)
t7
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t8 = np.arange(100,124)
t8 = t8.reshape(4,6)
t8
array([[100, 101, 102, 103, 104, 105],
       [106, 107, 108, 109, 110, 111],
       [112, 113, 114, 115, 116, 117],
       [118, 119, 120, 121, 122, 123]])
print('t7+t8:\n',t7+t8)
print('t7*t8:\n',t7*t8)
print('t7/t8:\n',t7/t8)
t7+t8:
 [[100 102 104 106 108 110]
 [112 114 116 118 120 122]
 [124 126 128 130 132 134]
 [136 138 140 142 144 146]]
t7*t8:
 [[   0  101  204  309  416  525]
 [ 636  749  864  981 1100 1221]
 [1344 1469 1596 1725 1856 1989]
 [2124 2261 2400 2541 2684 2829]]
t7/t8:
 [[0.         0.00990099 0.01960784 0.02912621 0.03846154 0.04761905]
 [0.05660377 0.06542056 0.07407407 0.08256881 0.09090909 0.0990991 ]
 [0.10714286 0.11504425 0.12280702 0.13043478 0.13793103 0.14529915]
 [0.15254237 0.15966387 0.16666667 0.17355372 0.18032787 0.18699187]]
Array and array arithmetic, different shapes (matching along one dimension)


t9 = np.arange(6)
print(t9)
print(t8)

[0 1 2 3 4 5]
[[100 101 102 103 104 105]
 [106 107 108 109 110 111]
 [112 113 114 115 116 117]
 [118 119 120 121 122 123]]
t8+t9
array([[100, 102, 104, 106, 108, 110],
       [106, 108, 110, 112, 114, 116],
       [112, 114, 116, 118, 120, 122],
       [118, 120, 122, 124, 126, 128]])
t8*t9
array([[  0, 101, 204, 309, 416, 525],
       [  0, 107, 216, 327, 440, 555],
       [  0, 113, 228, 345, 464, 585],
       [  0, 119, 240, 363, 488, 615]])
t9 = np.arange(4).reshape(-1,1)
print(t9)
print(t8)

[[0]
 [1]
 [2]
 [3]]
[[100 101 102 103 104 105]
 [106 107 108 109 110 111]
 [112 113 114 115 116 117]
 [118 119 120 121 122 123]]
t8+t9
array([[100, 101, 102, 103, 104, 105],
       [107, 108, 109, 110, 111, 112],
       [114, 115, 116, 117, 118, 119],
       [121, 122, 123, 124, 125, 126]])
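
The rule behind these examples: shapes are compared from the trailing dimension backwards, and each pair of dimensions must be equal or contain a 1. The helper `can_broadcast` below is a hypothetical function written for this note, not part of NumPy; it is a rough sketch of that check, followed by a case that fails and a way to fix it.

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under the broadcasting rule."""
    # Compare dimensions from the trailing end; each pair must match or contain a 1.
    for da, db in zip(reversed(shape_a), reversed(shape_b)):
        if da != db and da != 1 and db != 1:
            return False
    return True

print(can_broadcast((4, 6), (6,)))    # row vector against each row
print(can_broadcast((4, 6), (4, 1)))  # column vector against each column
print(can_broadcast((4, 6), (4,)))    # trailing dims 6 and 4 do not match

a = np.arange(24).reshape(4, 6)
try:
    a + np.arange(4)
except ValueError as exc:
    print("broadcast failed:", exc)

# Adding an axis turns the length-4 vector into a (4, 1) column, which broadcasts.
print((a + np.arange(4)[:, np.newaxis]).shape)
```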

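Axes in NumPy

An axis in NumPy can be thought of as a direction, numbered 0, 1, 2, and so on. A 1-D array has only axis 0; a 2-D array (shape (2, 2)) has axes 0 and 1; a 3-D array (shape (2, 2, 3)) has axes 0, 1 and 2.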
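
A quick way to get a feel for axes is to reduce along each one and look at the resulting shape. A small sketch using the (2, 2, 3) shape mentioned above:

```python
import numpy as np

t = np.arange(12).reshape(2, 2, 3)  # axes 0, 1, 2 have lengths 2, 2, 3

# Reducing along an axis removes that axis from the shape.
print(t.sum(axis=0).shape)  # (2, 3)
print(t.sum(axis=1).shape)  # (2, 3)
print(t.sum(axis=2).shape)  # (2, 2)
```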

File operations with np

Reading files with np

np.loadtxt(fname,dtype=float,delimiter=None,skiprows=0,usecols=None,unpack=False)
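
Here is a small sketch of how those parameters behave. To keep it self-contained it reads from an in-memory `io.StringIO` instead of a file on disk; the column names and values are made up for the example.

```python
import io
import numpy as np

# Stand-in for a CSV file on disk; the column names and values are made up.
csv_text = "views,likes,comments\n100,10,1\n200,20,2\n300,30,3\n"

# skiprows=1 skips the header line; delimiter="," splits the fields.
data = np.loadtxt(io.StringIO(csv_text), dtype=int, delimiter=",", skiprows=1)
print(data.shape)  # one row per line, one column per field

# usecols picks columns; unpack=True transposes so each column is its own array.
views, likes = np.loadtxt(io.StringIO(csv_text), dtype=int, delimiter=",",
                          skiprows=1, usecols=(0, 1), unpack=True)
print(views, likes)
```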

Transposing in NumPy

t10 = np.arange(24)
t10 = t10.reshape(4,6)
t10
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t10.T
array([[ 0,  6, 12, 18],
       [ 1,  7, 13, 19],
       [ 2,  8, 14, 20],
       [ 3,  9, 15, 21],
       [ 4, 10, 16, 22],
       [ 5, 11, 17, 23]])
t10.transpose()
array([[ 0,  6, 12, 18],
       [ 1,  7, 13, 19],
       [ 2,  8, 14, 20],
       [ 3,  9, 15, 21],
       [ 4, 10, 16, 22],
       [ 5, 11, 17, 23]])
t10.swapaxes(1,0)
array([[ 0,  6, 12, 18],
       [ 1,  7, 13, 19],
       [ 2,  8, 14, 20],
       [ 3,  9, 15, 21],
       [ 4, 10, 16, 22],
       [ 5, 11, 17, 23]])

Indexing and slicing in NumPy

import numpy as np
t10 = np.arange(24)
t10 = t10.reshape(4,6)
t10
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t10[2,2]
14
t10[2:4,:]
array([[12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t10[:,2:4]
array([[ 2,  3],
       [ 8,  9],
       [14, 15],
       [20, 21]])
t10[:,[2,4]] # two non-adjacent columns
array([[ 2,  4],
       [ 8, 10],
       [14, 16],
       [20, 22]])
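
Two details are easy to trip over here: passing two index lists picks individual points rather than a block, and a slice is a view while list indexing makes a copy. A short sketch, with illustrative names `points`, `block` and `cols`:

```python
import numpy as np

t = np.arange(24).reshape(4, 6)

# Two index lists are paired element by element: this picks (0, 1) and (2, 3).
points = t[[0, 2], [1, 3]]
print(points)

# A basic slice is a view, so writing through it changes t.
block = t[:2, :2]
block[0, 0] = -1
print(t[0, 0])

# Indexing with a list makes a copy, so t is unaffected.
cols = t[:, [2, 4]]
cols[0, 0] = 999
print(t[0, 2])
```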

Boolean indexing in NumPy

t10
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t10<10
array([[ True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False, False, False]])
t10[t10<10] = 3
t10
array([[ 3,  3,  3,  3,  3,  3],
       [ 3,  3,  3,  3, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
t10[t10>20] = 20
t10
array([[ 3,  3,  3,  3,  3,  3],
       [ 3,  3,  3,  3, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 20, 20, 20]])

NumPy's ternary operator: np.where()

np.where(t10<10,0,10) # returns a new array; t10 itself is unchanged, as the output below shows
t10
array([[ 3,  3,  3,  3,  3,  3],
       [ 3,  3,  3,  3, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 20, 20, 20]])
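
Since the cell above only displays t10, here is a separate sketch that keeps the result of np.where and prints it. It uses a fresh array `t` rather than the modified t10.

```python
import numpy as np

t = np.arange(24).reshape(4, 6)

# np.where(condition, x, y) picks x where condition is True, else y.
result = np.where(t < 10, 0, 10)
print(result)

# The input is untouched; assign the result if you want to keep it.
print(t[0, 1])

# x and y may also be arrays: keep values >= 10, zero out the rest.
kept = np.where(t < 10, 0, t)
print(kept)
```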

clip in NumPy

t10.clip(8,10)
array([[ 8,  8,  8,  8,  8,  8],
       [ 8,  8,  8,  8, 10, 10],
       [10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10]])

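nan and inf in NumPy

nan (NAN, Nan): not a number.
When does nan appear in NumPy?

  • When a local file is read as float and a value is missing, the missing value becomes nan.
  • When an ill-defined calculation is performed (for example, infinity (inf) minus infinity).

inf (-inf, inf): infinity; inf is positive infinity and -inf is negative infinity.
When does inf (either -inf or +inf) appear?

  • For example, when a nonzero number is divided by 0 (plain Python raises an error; NumPy returns inf or -inf).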
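
The cases listed above can be reproduced in a few lines. This sketch assumes `np.genfromtxt` for the file case, since it tolerates empty fields; the input text is made up and read from memory rather than from disk.

```python
import io
import numpy as np

# A missing field read as float becomes nan (np.genfromtxt tolerates empty fields).
text = "1,2,3\n4,,6\n"
loaded = np.genfromtxt(io.StringIO(text), delimiter=",")
print(loaded)

# inf - inf has no meaningful value, so the result is nan.
with np.errstate(invalid="ignore"):
    diff = np.array([np.inf]) - np.array([np.inf])
print(diff)

# Plain Python raises on division by zero; NumPy returns inf or -inf instead.
try:
    1 / 0
except ZeroDivisionError as exc:
    print("Python:", exc)

with np.errstate(divide="ignore"):
    print(np.array([1.0, -1.0]) / 0)
```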

Special properties of nan


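If every nan is replaced with 0 and the mean before replacement was greater than 0, the mean after replacement will be smaller. A more common approach is therefore to replace missing values with the mean (or median), or to drop any row that contains a missing value.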
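
The original illustration for this section did not survive, so the following is only a sketch of the nan properties that usually matter in practice; it may not match what the image showed.

```python
import numpy as np

# nan is a float, and it is not equal to anything, including itself.
print(type(np.nan))
print(np.nan == np.nan)

t = np.array([1.0, 2.0, np.nan, 4.0, np.nan])

# Because nan != nan, comparing an array with itself counts the nan entries.
print(np.count_nonzero(t != t))
print(np.count_nonzero(np.isnan(t)))  # the more readable equivalent

# Any arithmetic involving nan gives nan; the nan-aware variants skip it.
print(t.sum())
print(np.nansum(t), np.nanmean(t))
```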

Handling nan values with NumPy

import numpy as np
t = np.arange(24).reshape(6,4)
t
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
# select a row
t[2],t[2:3,:]
(array([ 8,  9, 10, 11]), array([[ 8,  9, 10, 11]]))
# select a column
t[:,2],t[:,2:3]
(array([ 2,  6, 10, 14, 18, 22]), array([[ 2],
        [ 6],
        [10],
        [14],
        [18],
        [22]]))
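
The notes stop at selecting rows and columns. One way to carry on from there, replacing each nan with the mean of its column, might look like the sketch below. The function name `fill_nan_with_column_mean` and the positions of the nan values are made up for this example.

```python
import numpy as np

def fill_nan_with_column_mean(arr):
    """Return a float copy of arr with each nan replaced by its column's mean."""
    filled = arr.astype(float)  # astype returns a copy by default
    for col in range(filled.shape[1]):
        column = filled[:, col]  # a view into filled
        missing = np.isnan(column)
        if missing.any() and not missing.all():
            # Mean of the values that are present in this column.
            column[missing] = column[~missing].mean()
    return filled

t = np.arange(24, dtype=float).reshape(6, 4)
t[1, 2:] = np.nan  # knock out two values for the demonstration

print(fill_nan_with_column_mean(t))
```

A column that is entirely nan is left as it is, since it has no mean to fill in.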

Common statistical functions in NumPy


t10 = np.arange(24)
t10 = t10.reshape(4,6)
print(t10)
print(t10.sum(axis=0)) # without an axis, the whole array is reduced to a single value
print(t10.mean(axis=0))
print(t10.sum(axis=1))
print(t10.mean(axis=1))

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]
[36 40 44 48 52 56]
[ 9. 10. 11. 12. 13. 14.]
[ 15  51  87 123]
[ 2.5  8.5 14.5 20.5]
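
The table of functions is missing from this page, so here is a sketch of a few other reductions that take the same axis argument. The selection is mine and may differ from the original list.

```python
import numpy as np

t = np.arange(24).reshape(4, 6)

print(t.sum())               # no axis: one value for the whole array
print(t.max(axis=0))         # largest value in each column
print(t.min(axis=1))         # smallest value in each row
print(np.median(t, axis=1))  # median of each row
print(np.ptp(t, axis=0))     # max minus min in each column
print(t.std(axis=0))         # standard deviation of each column
print(t.argmax(axis=1))      # position of the largest value in each row
```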

Concatenating arrays

t1 = np.arange(12).reshape(2,6)
t2 = np.arange(12,24).reshape(2,6)
t1,t2
(array([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]]), array([[12, 13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22, 23]]))
Vertical stacking
np.vstack((t1,t2))
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
Horizontal stacking
np.hstack((t1,t2))
array([[ 0,  1,  2,  3,  4,  5, 12, 13, 14, 15, 16, 17],
       [ 6,  7,  8,  9, 10, 11, 18, 19, 20, 21, 22, 23]])
Vertical splitting
np.vsplit(t1,(1,1)) # a tuple lists the row indices to cut at, giving rows [:1], [1:1] (empty) and [1:]
[array([[0, 1, 2, 3, 4, 5]]),
 array([], shape=(0, 6), dtype=int32),
 array([[ 6,  7,  8,  9, 10, 11]])]
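
The empty array in the middle comes from cutting at the same index twice. A short sketch of the two ways to call vsplit, with an integer or with a sequence of indices, plus hsplit for columns:

```python
import numpy as np

t = np.arange(24).reshape(4, 6)

# An integer asks for that many equal parts.
top, bottom = np.vsplit(t, 2)
print(top.shape, bottom.shape)

# A sequence lists the row indices at which to cut: rows [:1], [1:3], [3:].
parts = np.vsplit(t, (1, 3))
print([p.shape for p in parts])

# hsplit does the same along columns.
left, right = np.hsplit(t, 2)
print(left.shape, right.shape)

# vstack undoes vsplit.
print(np.array_equal(np.vstack(parts), t))
```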

Random numbers in NumPy

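
The function table for this section is missing, so the following is only a sketch of typical random number generation. It uses the `np.random.default_rng` generator interface (available since NumPy 1.17), which may differ from the functions the original table listed; the seed and the ranges are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=10)  # the seed makes the run repeatable

uniform = rng.random((2, 3))                          # floats in [0, 1)
ints = rng.integers(10, 20, size=(2, 3))              # integers in [10, 20)
normal = rng.normal(loc=0.0, scale=1.0, size=(2, 3))  # mean 0, std 1

print(uniform)
print(ints)
print(normal)

# Re-creating the generator with the same seed reproduces the same numbers.
again = np.random.default_rng(seed=10).random((2, 3))
print(np.array_equal(uniform, again))
```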

import numpy as np
# ? the data seems a bit off
root_path = r'D:\File\Learning\Data struct\numpy\DataAnalysis-master\datasourse\视频数据'
us_data = ''

# load the data


