Article 3: numpy

代表太阳晒死你oi

Last modified 2024-08-16 10:38:35

Category: Data Analysis. Tags: numpy, python

First published 2024-08-16 10:32:09

Article link: https://blog.csdn.net/woailfp1314/article/details/140933905

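Contents

1. Introduction to numpy
2. numpy data types
3. Creating arrays
4. Array shape
5. Array computation
6. Reading text files
7. Indexing and slicing
8. Modifying values
9. nan and inf
10. Other common methods
11. Generating random numbers with numpy.random
12. A note on copy and view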

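1. Introduction to numpy

Why learn numpy: it is fast, convenient, and the foundation of scientific computing in Python. numpy is a base library for scientific computing in Python that focuses on numerical computation, and most other Python scientific computing libraries are built on it.
It is mostly used to run numerical operations on large, multi-dimensional arrays.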

2. numpy data types

[Image omitted in the source]
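
Since the image for this section is missing, here is a minimal sketch that lists a few commonly used dtypes and their size per element. The selection of type names is my own and is not taken from the original image.

```python
import numpy as np

# A few commonly used numpy dtypes and their size in bytes per element.
# The list is an illustrative selection, not the full set numpy supports.
names = ["bool", "int8", "int16", "int32", "int64",
         "uint8", "float16", "float32", "float64", "complex128"]
for name in names:
    dt = np.dtype(name)
    print(f"{name:>10}: kind={dt.kind}, itemsize={dt.itemsize} bytes")

# Short type codes name the same types, e.g. "i1" is a 1-byte integer (int8).
same = np.dtype("i1") == np.dtype("int8")
print(same)
```

The short codes are what makes a call such as astype("i1") in the next section produce an int8 array.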

3. Creating arrays

import numpy as np
import random

t1 = np.array([1,2,3])
print(f"t1 contents: {t1}")
print(f"t1 type: {type(t1)}")

t2 = np.array(range(10))
print(f"t2 contents: {t2}")
print(f"t2 type: {type(t2)}")

# np.arange works like range and quickly generates a sequence of numbers
t3 = np.arange(4,10,2)
print(f"t3 contents: {t3}")
print(f"t3 type: {type(t3)}")

# Data types in numpy
t5 = np.array([1,1,0,1,0,0],dtype=bool)
print(t5)
print(f"t5 dtype: {t5.dtype}")

t6 = np.array([1,1,0,1,0,0],dtype=float)
print(t6)
print(f"t6 dtype: {t6.dtype}")

t7 = np.array([1,1,0,1,0,0],dtype=int)
print(t7)
print(f"t7 dtype: {t7.dtype}")

# Change the data type with astype
t8 = t7.astype("i1")
print(t8)
print(f"t8 dtype: {t8.dtype}")

# Floating-point numbers in numpy
t7 = np.array([random.random() for _ in range(10)])
print(f"t7 contents: {t7}")
print(f"t7 dtype: {t7.dtype}")

# np.round sets the number of decimal places
t8 = np.round(t7,2)
print(f"t8 contents: {t8}")
print(f"t8 dtype: {t8.dtype}")

Output

t1 contents: [1 2 3]
t1 type: <class 'numpy.ndarray'>
t2 contents: [0 1 2 3 4 5 6 7 8 9]
t2 type: <class 'numpy.ndarray'>
t3 contents: [4 6 8]
t3 type: <class 'numpy.ndarray'>
[ True  True False  True False False]
t5 dtype: bool
[1. 1. 0. 1. 0. 0.]
t6 dtype: float64
[1 1 0 1 0 0]
t7 dtype: int64
[1 1 0 1 0 0]
t8 dtype: int8
t7 contents: [0.90644728 0.38149802 0.29357415 0.63730295 0.35073932 0.74907343
 0.27314706 0.8496759  0.22485442 0.41408235]
t7 dtype: float64
t8 contents: [0.91 0.38 0.29 0.64 0.35 0.75 0.27 0.85 0.22 0.41]
t8 dtype: float64

4. Array shape

import numpy as np

## Array shape: print it with xx.shape
t1 = np.arange(12)
print(t1)
print(f"t1 shape: {t1.shape}")
print()

t2 = np.array([[1,2,3],[4,5,6]])
print(t2)
print(f"t2 shape: {t2.shape}")
print()

t3 = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(t3)
print(f"t3 shape: {t3.shape}")
print()

## Change the shape with reshape(); it returns a new array and leaves the original unchanged
## (rows, columns) or (blocks, rows, columns)
t4 = np.arange(12)
print(t4)
print(t4.shape)

t5 = t4.reshape((3,4))
print(t5)

# flatten() collapses the array to one dimension and also leaves the original unchanged
t6 = t5.flatten()               # same result as t6=t5.reshape((t5.shape[0]*t5.shape[1],))
print(t6)
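
To make the "original is unchanged" point concrete, here is a small sketch. The variable names are illustrative, and the use of -1 as a placeholder dimension is an addition that the article itself does not cover.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))     # new array object with shape (3, 4); a keeps shape (12,)
c = a.reshape((2, -1))    # -1 asks numpy to infer the missing dimension (6 here)
flat = b.flatten()        # flatten() always returns a copy

flat[0] = 100             # writing to the copy does not touch b or a

print(a.shape, b.shape, c.shape, flat.shape)
print(a[0], b[0, 0], flat[0])
```

Because flatten() copies the data, the write to flat leaves both a and b as they were.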

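5. Array computation

1. Arrays with the same shape are combined element by element: addition, subtraction, multiplication, and division act on matching positions
2. Arrays with different shapes can sometimes still be combined; this is the broadcasting rule
[Image omitted in the source]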

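import numpy as np

# t is not defined in the original; a 4x6 arange is assumed because it matches the output below
t = np.arange(24).reshape(4,6)
t3 = t/0    # numpy emits a RuntimeWarning here instead of raising an error
print(t3)
# nan means "not a number", inf means infinity
# Output:
# [[nan inf inf inf inf inf]
#  [inf inf inf inf inf inf]
#  [inf inf inf inf inf inf]
#  [inf inf inf inf inf inf]]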

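# The smaller array is applied across whole rows or whole columns
# Requirement: comparing shapes from the last dimension backwards, each pair of sizes must be equal or one of them must be 1
t1 = np.arange(5)
#[0 1 2 3 4]
t2 = np.arange(5).reshape(5,1)
# [[0]
#  [1]
#  [2]
#  [3]
#  [4]]
t3 = np.arange(10).reshape(2,5)
# [[0 1 2 3 4]
#  [5 6 7 8 9]]
t4 = np.arange(30).reshape(6,5)
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]
#  [20 21 22 23 24]
#  [25 26 27 28 29]]
t5 = np.arange(20).reshape(5,4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]
#  [12 13 14 15]
#  [16 17 18 19]]
print(t4-t1)    # (6,5) - (5,): t1 is subtracted from every row
# print(t4-t3)  cannot be computed: (6,5) and (2,5) are incompatible
print(t5-t2)    # (5,4) - (5,1): t2 is subtracted from every column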

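Note that the shapes are compared starting from the last dimension.

t6= np.arange(18).reshape((3,3,2))
t7 = np.arange(9).reshape((3,3))
print(t6+t7)			# raises ValueError: the last dimensions, 2 and 3, do not match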
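
The compatibility check can be tried without doing any arithmetic. The sketch below wraps np.broadcast_shapes (available in NumPy 1.20 and later) in a helper called can_broadcast; the helper name is hypothetical and is not part of numpy or of the original article.

```python
import numpy as np

def can_broadcast(*shapes):
    """Return the broadcast result shape, or None if the shapes are incompatible.

    Hypothetical helper for illustration; it only wraps np.broadcast_shapes.
    """
    try:
        return np.broadcast_shapes(*shapes)
    except ValueError:
        return None

print(can_broadcast((6, 5), (5,)))       # (6, 5)
print(can_broadcast((5, 4), (5, 1)))     # (5, 4)
print(can_broadcast((6, 5), (2, 5)))     # None
print(can_broadcast((3, 3, 2), (3, 3)))  # None

# One way to make the last case work: give t7 a trailing axis of length 1.
t6 = np.arange(18).reshape((3, 3, 2))
t7 = np.arange(9).reshape((3, 3))
fixed = t6 + t7[:, :, np.newaxis]        # (3, 3, 2) + (3, 3, 1)
print(fixed.shape)                       # (3, 3, 2)
```

The four calls correspond to t4-t1, t5-t2, t4-t3, and t6+t7 from the examples above.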

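6. Reading text files

np.loadtxt reads the contents of a text file
np.loadtxt(fname,dtype=float,delimiter=None,skiprows=0,usecols=None,unpack=False)

Parameter	Purpose
fname	File, filename string, or generator to read from
dtype	Data type of the result; defaults to float (the np.float alias was removed in NumPy 1.24)
delimiter	String that separates the values; defaults to any whitespace, so set it to "," for CSV files
skiprows	Skip the first x lines, typically the header row
usecols	Indices of the columns to read, given as an int or a tuple
unpack	If True, the result is transposed so that each column can be assigned to its own variable; defaults to False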
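
The following sketch exercises skiprows, usecols, and unpack without needing a file on disk. The CSV text and the column names in it are made up for illustration; io.StringIO stands in for the file.

```python
import io
import numpy as np

# Made-up CSV content with a header row; not the article's data files.
csv_text = "views,likes,comments\n100,10,1\n200,20,2\n300,30,3\n"

# Read everything except the header.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int, skiprows=1)
print(data.shape)   # (3, 3)

# Read only the first two columns and unpack them into separate variables.
views, likes = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int,
                          skiprows=1, usecols=(0, 1), unpack=True)
print(views)        # [100 200 300]
print(likes)        # [10 20 30]
```

With unpack=True each column comes back as its own row, which is why tuple assignment works.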

import numpy as np

us_file_path = "./US_video_data_numbers.csv"
uk_file_path = "./GB_video_data_numbers.csv"

# unpack=True swaps rows and columns (the result is transposed)

t1 = np.loadtxt(us_file_path,delimiter=",",dtype="int",unpack=True)
t2 = np.loadtxt(uk_file_path,delimiter=",",dtype="int")

print(t1)
print(t2)

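Other ways to transpose

Method	Requirement
T	Takes no arguments
transpose	Arguments are optional; with none it performs the default transpose
swapaxes	Requires the two axes to swap as arguments

For a 3-D array, the default transpose reverses the axes, that is, the order (2,1,0)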

import numpy as np

t1 = np.arange(18).reshape((2,3,3))
# All five calls below give the same result, with shape (3,3,2)
t2 = t1.swapaxes(0,2)
t3 = np.swapaxes(t1,0,2)
t4 = t1.transpose(2,1,0)
t5 = np.transpose(t1,(2,1,0))
t6 = t1.T
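
A quick check that these spellings agree, plus one non-default axis order for contrast. The names same_transpose, same_swap, and partial are illustrative.

```python
import numpy as np

t1 = np.arange(18).reshape((2, 3, 3))

# T, transpose(2, 1, 0) and swapaxes(0, 2) all reverse the axes of a 3-D array.
same_transpose = np.array_equal(t1.T, t1.transpose(2, 1, 0))
same_swap = np.array_equal(t1.T, t1.swapaxes(0, 2))
print(t1.T.shape, same_transpose, same_swap)   # (3, 3, 2) True True

# A different order only moves the axes that are named differently.
partial = t1.transpose(1, 0, 2)                # swap the first two axes only
print(partial.shape)                           # (3, 2, 3)
```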

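7. Indexing and slicing

# Indexing and slicing; indices start at 0
# t2 is the array loaded from the CSV file in the previous section
# Select one row
print(t2[2])
# Select consecutive rows
print(t2[2:])
# Select non-consecutive rows
print(t2[[2,8,10]])
# The same row selections with the column index written out
print(t2[1,:]) # row 2, every column
print(t2[2:,:]) # row 3 onward, every column
print(t2[[2,10,3],:]) # rows 3, 11 and 4, every column
# Select one column
print(t2[:,0]) # column 1 of every row
# Select consecutive columns
print(t2[:,2:]) # column 3 onward
# Select non-consecutive columns
print(t2[:,[0,2,]]) # columns 1 and 3 of every row
# Select a single value: row 3, column 4; its type is numpy.int64
print(t2[2,3])
# Select rows 3 to 4 and columns 2 to 3 (the end index is excluded)
b = t2[2:4,1:3]
# Select several non-adjacent points: here (0,0) and (2,1)
c = t2[[0,2],[0,1]]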


# Swapping rows and columns of an array
t = np.arange(12,24).reshape(3,4)
print(t)
# Swap rows
t[[1,2],:] = t[[2,1],:]
print(t)
# Swap columns
t[:,[1,2]] = t[:,[2,1]]
print(t)
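
Because t2 depends on a CSV file, the same indexing patterns are repeated below on a small array that anyone can run. The array and variable names are made up for illustration.

```python
import numpy as np

t = np.arange(20).reshape(4, 5)
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

row = t[1]                   # second row
col = t[:, 0]                # first column
block = t[1:3, 1:3]          # rows 2 to 3, columns 2 to 3
points = t[[0, 2], [0, 1]]   # the elements at (0,0) and (2,1)

# Swap rows 2 and 3 on a copy so the selections above stay as they were.
swapped = t.copy()
swapped[[1, 2], :] = swapped[[2, 1], :]

print(row, col, points)
print(block)
print(swapped)
```

The swap works because the right-hand side is evaluated into a temporary array before anything is assigned.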

8. Modifying values

# Modifying values
t = np.arange(24).reshape(4,6)
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]
#  [12 13 14 15 16 17]
#  [18 19 20 21 22 23]]
t[:,2:4] = 0
print(t)
t3 = t<10
print(t3)

# Select the values greater than 10
t4 = t[t>10]
print(t4)

# np.where acts as numpy's ternary operator
print(np.where(t<10,0,10))

# clip(a,b): values below a become a, values above b become b
# nan values are not affected by clip
t = np.arange(24).reshape((4,6))
print(t.clip(10,18))
# To set a value to nan the array must be float: convert first, then assign t[x,y] = np.nan
t = t.astype(float)
t[3,3] = np.nan
print(t)
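
The three ways of changing values can be compared side by side on a small array. This is a sketch with made-up values; the boolean-mask assignment in the first step is a common idiom that the article implies but does not show.

```python
import numpy as np

t = np.arange(12).reshape(3, 4)

masked = t.copy()
masked[masked < 5] = 0            # in-place assignment through a boolean mask
chosen = np.where(t < 5, 0, 10)   # new array: 0 where t < 5, otherwise 10
clipped = t.clip(3, 8)            # new array: values limited to the range [3, 8]

# clip leaves nan untouched while still limiting the other values.
with_nan = np.array([np.nan, 1.0, 20.0]).clip(3, 8)

print(masked)
print(chosen)
print(clipped)
print(with_nan)
```

Only the mask assignment modifies an array in place; np.where and clip return new arrays.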

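9. nan and inf

nan (NAN, Nan): "not a number"
When does nan appear in numpy?
When a local file is read as float and some values are missing, the gaps become nan (for example with np.genfromtxt)
When a calculation has no defined result (for example inf minus inf)
inf (-inf, inf): infinity; inf is positive infinity and -inf is negative infinity
When does inf (either -inf or +inf) appear?
For example when a nonzero number is divided by 0 (plain Python raises an error, while numpy returns inf or -inf)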

import numpy as np

## Things to note about nan
# Two nans are never equal: np.nan != np.nan
print(np.nan==np.nan)    # False
print(np.nan!=np.nan)    # True
# This property can be used to count the nans in an array
t = np.arange(24).reshape(4,6).astype(float)
t[:,0] = 0
print(np.count_nonzero(t))
# np.count_nonzero counts the elements that are nonzero (or True)
t[3,3] = np.nan
print(np.count_nonzero(t!=t))         # number of nans
print(np.count_nonzero(np.isnan(t)))     # number of nans
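
The cases described above can be reproduced directly. In this sketch np.errstate only silences the warnings numpy would otherwise print, and np.nansum is an addition that the article does not mention.

```python
import numpy as np

# Silence the divide-by-zero and invalid-value warnings for this demonstration.
with np.errstate(divide="ignore", invalid="ignore"):
    r = np.array([1.0, -1.0, 0.0]) / 0   # inf, -inf, nan

diff = np.inf - np.inf                   # an undefined result, so nan

print(r)
print(np.isinf(r), np.isnan(r))
print(np.isnan(diff))

# nan propagates through ordinary reductions; the nan-aware variants skip it.
values = np.array([1.0, 2.0, np.nan])
print(np.sum(values))      # nan
print(np.nansum(values))   # 3.0
```

This propagation is why nans are usually counted or replaced before computing statistics.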

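10. Other common methods

Let fun be one of these functions. Where it exists both as a numpy function and as an array method (for example sum, min, max, mean), np.fun(a) and a.fun() are equivalent
Most of them accept an axis argument. Note that axis=1 computes the result for each row, and axis=0 computes it for each column

Function	Purpose
np.sum(a)	Sum of the numbers in array a
a.min()	Smallest value in a
a.max()	Largest value in a
a.argmin()	Index of the smallest value in a
a.argmax()	Index of the largest value in a
a.mean()	Mean of the elements of a
np.average(a)	Average of the elements of a (optionally weighted)
np.median(a)	Median of a
np.cumsum(a)	Cumulative (prefix) sum, the sum of the first n items
np.diff(a)	Each element minus the one before it: b1 = a2 - a1, b2 = a3 - a2, …
np.sort(a)	Returns a sorted copy of a
np.ptp(a)	Range, the difference between the largest and smallest values
a.std()	Standard deviation
np.zeros((3, 4))	Creates an array filled with 0
np.ones((3, 4))	Creates an array filled with 1
np.eye(3)	Creates a square array with 1 on the diagonal
np.vstack((t1,t2))	Stacks vertically
np.hstack((t1,t2))	Stacks horizontally
np.vsplit(a,2)	Splits vertically; the number is how many equal parts to produce
np.hsplit(a,2)	Splits horizontally; the number is how many equal parts to produce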
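
The axis convention is easy to mix up, so here is a short sketch on a 2x3 array, together with a few of the other functions from the table. All values and variable names are made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

col_sums = a.sum(axis=0)              # one sum per column: [3 5 7]
row_sums = a.sum(axis=1)              # one sum per row:    [ 3 12]

running = np.cumsum([1, 2, 3, 4])     # [ 1  3  6 10]
steps = np.diff([1, 4, 9, 16])        # [3 5 7]

tall = np.vstack((a, a))              # shape (4, 3)
wide = np.hstack((a, a))              # shape (2, 6)
top, bottom = np.vsplit(a, 2)         # two arrays of shape (1, 3)

print(col_sums, row_sums)
print(running, steps)
print(tall.shape, wide.shape, top.shape, bottom.shape)
```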

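11. Generating random numbers with numpy.random

1. Uniformly distributed random numbers

numpy.random.rand(d0, d1,..., dn):
- Generates an array of the given shape whose elements are uniformly distributed over [0,1).
- Example: np.random.rand(2, 3) generates a 2-row, 3-column array of random numbers between 0 and 1.
numpy.random.uniform(low=0.0, high=1.0, size=None):
- Lets you set the range: the random numbers are uniformly distributed from low (inclusive) to high (exclusive).
- The size argument sets the shape of the result.
- Example: np.random.uniform(2, 5, size=(3, 4)) generates a 3-row, 4-column array of random numbers between 2 and 5.

2. Normally distributed random numbers

numpy.random.randn(d0, d1,..., dn):
- Generates an array of the given shape whose elements follow the standard normal distribution (mean 0, standard deviation 1).
- Example: np.random.randn(2, 3) generates a 2-row, 3-column array of standard normal values.
numpy.random.normal(loc=0.0, scale=1.0, size=None):
- Lets you set the mean loc and the standard deviation scale of the normal distribution.
- The size argument sets the shape of the result.
- Example: np.random.normal(5, 2, size=(4, 5)) generates a 4-row, 5-column array drawn from a normal distribution with mean 5 and standard deviation 2.

3. Random integers

numpy.random.randint(low, high=None, size=None, dtype=int):
- Generates random integers from low (inclusive) to high (exclusive).
- If only low is given, the integers are drawn from [0,low).
- The size argument sets the shape of the result.
- The dtype argument sets the integer type of the result.
- Example: np.random.randint(2, 10, size=(3, 3)) generates a 3-row, 3-column array of random integers from 2 to 9.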
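
The calls above can be run together to confirm the shapes and value ranges. This is a sketch with arbitrary variable names; the actual numbers differ on every run.

```python
import numpy as np

u = np.random.rand(2, 3)                    # uniform on [0, 1)
v = np.random.uniform(2, 5, size=(3, 4))    # uniform on [2, 5)
z = np.random.randn(2, 3)                   # standard normal
n = np.random.normal(5, 2, size=(4, 5))     # normal with mean 5, std 2
k = np.random.randint(2, 10, size=(3, 3))   # integers from 2 to 9

print(u.shape, v.shape, z.shape, n.shape, k.shape)
print(u.min() >= 0, u.max() < 1)
print(v.min() >= 2, v.max() < 5)
print(k.min() >= 2, k.max() <= 9)
```

The normal samples have no hard bounds, so only their shapes are checked here.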

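4. Random seed

numpy.random.seed() sets the random seed so that the same sequence of random numbers is generated on every run. This is very useful when experimental results need to be reproduced.

Example:

np.random.seed(0)
a = np.random.rand(3, 3)
np.random.seed(0)
b = np.random.rand(3, 3)
print(a)
print(b)
print(np.all(a == b))  # True: both calls produced the same random numbers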

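12. A note on copy and view

1. a=b: nothing is copied; a and b are the same object, so a change through one shows up in the other
2. a=b[:]: a view; the slice creates a new array object a, but its data is still held by b, so changes to the data are shared
3. a=b.copy(): a copy; a and b do not affect each other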
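
The three cases can be told apart with the `is` operator and np.shares_memory. The sketch below uses illustrative names (a1, a2, a3) for the three kinds of assignment.

```python
import numpy as np

b = np.arange(5)
a1 = b            # 1. plain assignment: the same object
a2 = b[:]         # 2. slice: a new array object that views b's data
a3 = b.copy()     # 3. copy: independent data

b[0] = 100        # modify the original

print(a1 is b, a2 is b, a3 is b)    # True False False
print(np.shares_memory(a2, b))      # True
print(np.shares_memory(a3, b))      # False
print(a1[0], a2[0], a3[0])          # 100 100 0
```

Note that this applies to numpy arrays; for a plain Python list, b[:] creates a shallow copy instead of a view.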
