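I've recently been taking a machine learning course and want to keep my notes for the whole learning process here, so that anyone can use them as a quick reference for basic concepts. I'm still learning, so there will inevitably be some mistakes; feel free to point them out in the comments so we can discuss them. NumPy is an essential base library for machine learning, so I've summarized its basic operations in the 19 points below; feel free to look through them.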
1.Creating a numpy array
# Create a one-dimensional array
in:import numpy as np
nparr = np.array([i for i in range(10)])
nparr
out:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Return the array's element type (the default integer width is platform-dependent)
in:nparr.dtype
out:dtype('int32')
# Create an array of all zeros
in:np.zeros(10)
out:array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
# Create a matrix of all zeros
in:np.zeros((3,5))
out:array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
# Create a matrix of all ones
in:np.ones((3,5))
out:array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
# Fill a matrix with a single value
in:np.full((3,5),fill_value=666)
out:array([[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666]])
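The outputs above show floats for np.zeros and np.ones but integers for np.full. A minimal sketch of how the element type can be controlled, assuming the standard dtype parameter (the variable names are mine, just for illustration):

```python
import numpy as np

# np.zeros and np.ones default to float64; pass dtype to get integers.
zeros_int = np.zeros(5, dtype=int)

# np.full infers its dtype from fill_value unless dtype is given.
full_int = np.full((2, 3), fill_value=666)
full_float = np.full((2, 3), fill_value=666.0)

print(zeros_int.dtype, full_int.dtype, full_float.dtype)
```

The exact integer width printed depends on the platform, which is why the dtype shown earlier may differ on your machine.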
2.arange
# Generate the numbers from 0 up to (but not including) 20, with a step of 2
in:np.arange(0,20,2)
out:array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
# Generate 0 through 9 (the stop value 10 is excluded)
in:np.arange(0,10)
out:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
3.linspace
# Generate 10 evenly spaced numbers from 0 to 20, with both 0 and 20 included
in:np.linspace(0,20,10)
out:array([0.,2.22222222,4.44444444,6.66666667,8.88888889,11.11111111, 13.33333333, 15.55555556, 17.77777778, 20.])
# Same as above, but generate 11 numbers
in:np.linspace(0,20,11)
out:array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.])
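The main difference between arange and linspace is how the stop value is treated. Here is a small sketch that puts them side by side; the endpoint=False variant is something I added for comparison:

```python
import numpy as np

a = np.arange(0, 20, 2)                     # stop (20) is excluded
b = np.linspace(0, 20, 11)                  # stop (20) is included by default
c = np.linspace(0, 20, 10, endpoint=False)  # exclude stop, like arange

print(a)
print(b)
print(c)
```

arange takes a step size, while linspace takes the number of points, so linspace is usually the safer choice when you need an exact count of floats.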
4.random
# Generate one random integer in [0, 10)
in:np.random.randint(0,10)
out:2
# Generate 10 random integers in [0, 10)
in:np.random.randint(0,10,size=10)
out:array([9, 2, 2, 4, 0, 8, 6, 7, 6, 6])
# Generate a 3*5 matrix of random integers in [0, 10)
in:np.random.randint(0,10,size=(3,5))
out:array([[4, 7, 9, 3, 6],
[2, 2, 3, 8, 1],
[6, 5, 5, 9, 0]])
# Setting a random seed makes the generated random matrix the same from run to run
in:np.random.seed(666)
np.random.randint(0,10,size=(3,5))
out:array([[2, 6, 9, 4, 3],
[1, 0, 8, 7, 5],
[2, 5, 5, 4, 8]])
# Generate a random float in [0, 1)
in:np.random.random()
out:0.7315955468480113
# Generate a 3*5 matrix drawn from a normal distribution with mean 0 and standard deviation 1
in:np.random.normal(0,1,size=(3,5))
out:array([[-1.62879004, 1.23174866, -0.91360034, -0.27084407, 1.42024914],
[-0.98226439, 0.80976498, 1.85205227, 1.67819021, -0.98076924],
[ 0.47031082, 0.18226991, -0.84388249, 0.20996833, 0.22958666]])
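The calls above use the global np.random functions. Newer NumPy versions (1.17 and later, as far as I know) also offer a Generator object through np.random.default_rng. This is a rough sketch of the same operations with it, not something from the course; the seed 666 is reused only for familiarity:

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng1 = np.random.default_rng(666)
rng2 = np.random.default_rng(666)

m1 = rng1.integers(0, 10, size=(3, 5))   # integers in [0, 10)
m2 = rng2.integers(0, 10, size=(3, 5))
print(np.array_equal(m1, m2))            # True

f = rng1.random()                        # float in [0.0, 1.0)
n = rng1.normal(0, 1, size=(3, 5))       # mean 0, standard deviation 1
```

Keeping the generator in a variable avoids relying on hidden global state, which helps when several parts of a program need reproducible randomness.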
5.reshape
in:x = np.arange(10)
x
out:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
in:X = np.arange(15).reshape(3,5)
X
out:array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
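# reshape accepts -1 for one dimension, and NumPy infers that dimension from the array's size; below, 2 rows are requested, so the 10 elements give 5 columns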
in:x.reshape(2,-1)
out:array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
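A pattern that shows up a lot in machine learning code is using -1 to turn a one-dimensional array into a single column or a single row. A small sketch, which also shows what happens when the size cannot be divided evenly (the variable names are my own):

```python
import numpy as np

x = np.arange(10)

col = x.reshape(-1, 1)   # 10 rows, 1 column
row = x.reshape(1, -1)   # 1 row, 10 columns
print(col.shape, row.shape)

# The inferred dimension must be a whole number, otherwise reshape fails.
error_message = None
try:
    x.reshape(3, -1)     # 10 elements cannot fill 3 equal rows
except ValueError as err:
    error_message = str(err)
print(error_message)
```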
6.Basic attributes
# Return the number of dimensions (a single number)
in:x.ndim
out:1
in:X.ndim
out:2
# Return a tuple with the size of each dimension
in:x.shape
out:(10,)
in:X.shape
out:(3, 5)
# Return the total number of elements
in:x.size
out:10
in:X.size
out:15
7.Accessing data
# First, recall our x and X
x=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
X=array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
# First, accessing a one-dimensional array (x)
in:x[0]
out:0
in:x[-1]
out:9
in:x[0:5]
out:array([0, 1, 2, 3, 4])
in:x[::2]
out:array([0, 2, 4, 6, 8])
in:x[::-1]
out:array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
# These are all fairly basic operations; next, let's look at a two-dimensional matrix (X)
# First two rows, first three columns
in:X[:2,:3]
out:array([[0, 1, 2],
[5, 6, 7]])
# First two rows, every other column
in:X[:2,::2]
out:array([[0, 2, 4],
[5, 7, 9]])
# Reverse both the rows and the columns
in:X[::-1,::-1]
out:array([[14, 13, 12, 11, 10],
[ 9, 8, 7, 6, 5],
[ 4, 3, 2, 1, 0]])
# A single index in the brackets after X selects one row
in:X[0]
out:array([0, 1, 2, 3, 4])
in:X[0,:]
out:array([0, 1, 2, 3, 4])
8.Shallow copy and deep copy
# First, the shallow copy: a slice shares its data with the original array
in:subX = X[:2,:3]
subX
out:array([[0, 1, 2],
[5, 6, 7]])
# When we modify an element
in:subX[0,0]=100
subX
out:array([[100, 1, 2],
[ 5, 6, 7]])
# Looking at X now, we find that its data changed along with subX
in:X
out:array([[100, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[ 10, 11, 12, 13, 14]])
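# How do we make a deep copy?
in:subX = X[:2,:3].copy()
subX
out:array([[100, 1, 2],
[ 5, 6, 7]])
# Now subX is a new matrix, so modifying it no longer affects the original X. (Its first element is 100 because X had already been changed above.)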
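If you are ever unsure whether two arrays share data, np.shares_memory can tell you. Here is a minimal sketch of the slice-versus-copy behaviour described above, starting from a fresh X; the names view and copy are mine:

```python
import numpy as np

X = np.arange(15).reshape(3, 5)

view = X[:2, :3]          # basic slicing returns a view onto X's data
copy = X[:2, :3].copy()   # .copy() allocates new memory

print(np.shares_memory(X, view))  # True
print(np.shares_memory(X, copy))  # False

copy[0, 0] = 100          # X is untouched
view[0, 1] = -1           # X[0, 1] changes as well
print(X[0, :3])
```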
9.Concatenation
in:x = np.array([1,2,3])
y = np.array([3,2,1])
x
y
out:array([1, 2, 3])
array([3, 2, 1])
# First, stack vertically
in:z = np.vstack([x,y])
z
out:array([[1, 2, 3],
[3, 2, 1]])
# Then stack horizontally
in:w = np.hstack([x,y])
w
out:array([1, 2, 3, 3, 2, 1])
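vstack and hstack are convenience wrappers; the more general function is np.concatenate, which takes an axis argument. This is a small sketch with a 2-D array A that I made up for the example:

```python
import numpy as np

x = np.array([1, 2, 3])
A = np.array([[1, 2, 3],
              [4, 5, 6]])

rows = np.concatenate([A, A], axis=0)   # stack along rows, shape (4, 3)
cols = np.concatenate([A, A], axis=1)   # stack along columns, shape (2, 6)

# concatenate needs matching dimensions, so the 1-D x is reshaped to (1, 3).
with_x = np.concatenate([A, x.reshape(1, -1)], axis=0)

# vstack accepts the 1-D x directly and gives the same result.
same = np.vstack([A, x])
print(np.array_equal(with_x, same))     # True
```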
10.Splitting
in:A = np.arange(16).reshape((4,4))
A
out:array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
# First, split A vertically (along the rows)
in:upper,lower = np.vsplit(A,[2])
upper
out:array([[0, 1, 2, 3],
[4, 5, 6, 7]])
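# The 2 passed here is the position of the cut: the split happens before row index 2, that is, between the second and third rows. You can also pass -1 to cut just before the last row
# Then split A horizontally (along the columns)
in:left,right = np.hsplit(A,[2])
left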
out:array([[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13]])
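One place where splitting is handy is separating a data matrix into features and a label column. The sketch below assumes, purely for illustration, that the last column of a made-up array holds the labels; it also shows that several cut points can be passed at once:

```python
import numpy as np

data = np.arange(16).reshape(4, 4)

# Cut before the last column: the first 3 columns are treated as features,
# the last column as the label (a hypothetical layout for this example).
features, label = np.hsplit(data, [-1])
label = label[:, 0]      # flatten shape (4, 1) to (4,)
print(features.shape, label)

# Two cut points give three pieces: rows [0:1], [1:3] and [3:].
top, middle, bottom = np.vsplit(data, [1, 3])
print(top.shape, middle.shape, bottom.shape)
```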
11.Operations on numpy.array (also called Universal Functions)
# First create our X
in:X= np.arange(1,16).reshape((3,5))
X
out:array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]])
# The following are simple element-wise operations that should be clear at a glance
in:X +1
out:array([[ 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16]])
in:X - 1
out:array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
in:X *2
out:array([[ 2, 4, 6, 8, 10],
[12, 14, 16, 18, 20],
[22, 24, 26, 28, 30]])
in:X /2
out:array([[0.5, 1. , 1.5, 2. , 2.5],
[3. , 3.5, 4. , 4.5, 5. ],
[5.5, 6. , 6.5, 7. , 7.5]])
# Floor division
in:X//2
out:array([[0, 1, 1, 2, 2],
[3, 3, 4, 4, 5],
[5, 6, 6, 7, 7]], dtype=int32)
# Power
in:X**2
out:array([[ 1, 4, 9, 16, 25],
[ 36, 49, 64, 81, 100],
[121, 144, 169, 196, 225]], dtype=int32)
# Remainder
in:X%2
out:array([[1, 0, 1, 0, 1],
[0, 1, 0, 1, 0],
[1, 0, 1, 0, 1]], dtype=int32)
# Absolute value
in:np.abs(X)
out:array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]])
# Sine
in:np.sin(X)
out:array([[ 0.84147098, 0.90929743, 0.14112001, -0.7568025 , -0.95892427],
[-0.2794155 , 0.6569866 , 0.98935825, 0.41211849, -0.54402111],
[-0.99999021, -0.53657292, 0.42016704, 0.99060736, 0.65028784]])
# e to the power of X
in:np.exp(X)
out:array([[2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01,
1.48413159e+02],
[4.03428793e+02, 1.09663316e+03, 2.98095799e+03, 8.10308393e+03,
2.20264658e+04],
[5.98741417e+04, 1.62754791e+05, 4.42413392e+05, 1.20260428e+06,
3.26901737e+06]])
# 3 to the power of X
in:np.power(3,X)
out:array([[ 3, 9, 27, 81, 243],
[ 729, 2187, 6561, 19683, 59049],
[ 177147, 531441, 1594323, 4782969, 14348907]], dtype=int32)
# Natural logarithm
in:np.log(X)
out:array([[0. , 0.69314718, 1.09861229, 1.38629436, 1.60943791],
[1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509],
[2.39789527, 2.48490665, 2.56494936, 2.63905733, 2.7080502 ]])
# Logarithm base 10
in:np.log10(X)
out:array([[0. , 0.30103 , 0.47712125, 0.60205999, 0.69897 ],
[0.77815125, 0.84509804, 0.90308999, 0.95424251, 1. ],
[1.04139269, 1.07918125, 1.11394335, 1.14612804, 1.17609126]])
12.Matrix operations
# There are two kinds of multiplication here, and it is important to tell them apart
in:A = np.arange(4).reshape(2,2)
A
out:array([[0, 1],
[2, 3]])
in:B = np.full((2,2),10)
B
out:array([[10, 10],
[10, 10]])
# Element-wise multiplication
in:A * B
out:array([[ 0, 10],
[20, 30]])
# Matrix multiplication as defined in linear algebra
in:A.dot(B)
out:array([[10, 10],
[50, 50]])
# Transpose
in:A.T
out:array([[0, 2],
[1, 3]])
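Python also has the @ operator for matrix multiplication, which many people find easier to read than .dot. A short sketch using the same A and B, with a check that the two agree:

```python
import numpy as np

A = np.arange(4).reshape(2, 2)
B = np.full((2, 2), 10)

elementwise = A * B      # multiplies matching positions
matmul = A @ B           # linear-algebra product, same as A.dot(B)

print(elementwise)
print(matmul)
print(np.array_equal(matmul, A.dot(B)))   # True
```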
13.Operations between vectors and matrices
in:v = np.array([1,2])
A = np.array([[0, 1],
[2, 3]])
np.vstack([v]*A.shape[0])
out:array([[1, 2],
[1, 2]])
in:v *A
out: array([[0, 2],
[2, 6]])
in:v.dot(A)
out:array([4, 7])
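The reason v * A works without stacking v first is broadcasting: NumPy stretches the vector across the rows on its own. A minimal sketch comparing the explicit stacking with the broadcast version, and showing that the order matters for dot (np.tile is my addition here):

```python
import numpy as np

v = np.array([1, 2])
A = np.arange(4).reshape(2, 2)

stacked = np.vstack([v] * A.shape[0])   # one copy of v per row of A
tiled = np.tile(v, (2, 1))              # same result using np.tile

explicit = stacked + A
broadcast = v + A                       # v is broadcast across the rows
print(np.array_equal(explicit, broadcast))   # True

# For dot, v on the left acts as a row vector, on the right as a column.
print(v.dot(A))   # [4 7]
print(A.dot(v))   # [2 8]
```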
14.Matrix inverse
in:A
out:array([[0, 1],
[2, 3]])
in:np.linalg.inv(A)
out:array([[-1.5, 0.5],
[ 1. , 0. ]])
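A quick way to convince yourself the inverse is right is to multiply it back and compare with the identity matrix. The sketch below also shows two things not covered above: a singular matrix has no inverse, and np.linalg.pinv gives a pseudo-inverse for non-square matrices. The singular and wide matrices are made up for the example:

```python
import numpy as np

A = np.arange(4).reshape(2, 2).astype(float)
inv_A = np.linalg.inv(A)

# A times its inverse should be the identity, up to rounding error.
identity_ok = np.allclose(A.dot(inv_A), np.eye(2))
print(identity_ok)

# The second row is twice the first, so this matrix is singular.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print(raised)

# A 2x8 matrix has no ordinary inverse, but it has a pseudo-inverse.
wide = np.arange(16).reshape(2, 8).astype(float)
pinv_wide = np.linalg.pinv(wide)
print(pinv_wide.shape)
```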
15.Aggregation
in:L= np.random.random(100)
sum(L)
np.sum(L)
out:47.51077024043275
47.51077024043275
in:np.min(L)
L.min()
out:1.827632976070248e-06
1.827632976070248e-06
in:np.max(L)
L.max()
out:0.9999993748010294
0.9999993748010294
# Now let's try a two-dimensional matrix
in:X = np.arange(16).reshape(4,-1)
X
out:array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
in:np.sum(X)
out:120
# Sum down each column (axis=0)
in:np.sum(X,axis=0)
out:array([24, 28, 32, 36])
# Sum across each row (axis=1)
in:np.sum(X,axis=1)
out:array([ 6, 22, 38, 54])
# Mean
in:np.mean(X)
out:7.5
# Median
in:np.median(X)
out:7.5
# Variance (of the random array L created above)
in:np.var(L)
out:0.08341610471085403
# Standard deviation
in:np.std(L)
out:0.28881846324439514
out:0.28881846324439514
# An example
in:x = np.random.normal(0,1,size=1000000)
in:np.mean(x)
out:-0.0020024202334790993 # very close to 0
in:np.std(x)
out:1.0000840912045612 # very close to 1
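A common use of mean and std with the axis argument is standardizing each column of a data matrix so that it has mean 0 and standard deviation 1. This is a rough sketch on made-up data; the loc, scale and shape are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))   # 1000 samples, 3 features

col_mean = np.mean(X, axis=0)   # one mean per column, shape (3,)
col_std = np.std(X, axis=0)     # one standard deviation per column

# Broadcasting subtracts and divides column by column.
X_std = (X - col_mean) / col_std

print(X_std.mean(axis=0))   # each value is very close to 0
print(X_std.std(axis=0))    # each value is very close to 1
```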
16.Indices
# Return the index of the minimum value
in:np.argmin(x)
out:610261
# Return the index of the maximum value
in:np.argmax(x)
out:849782
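On a two-dimensional array, argmin and argmax without an axis return a position in the flattened array, which can be confusing at first. A small sketch of how to get back to a (row, column) pair with np.unravel_index, using a small matrix I picked for the example:

```python
import numpy as np

X = np.array([[4, 0, 9, 7],
              [9, 2, 9, 8],
              [7, 5, 6, 8],
              [3, 3, 1, 7]])

flat_idx = np.argmin(X)                          # position in the flattened array
row, col = np.unravel_index(flat_idx, X.shape)   # convert to (row, column)
print(flat_idx, row, col)                        # 1 0 1

per_col = np.argmin(X, axis=0)   # row index of the minimum in each column
per_row = np.argmax(X, axis=1)   # column index of the maximum in each row
print(per_col)                   # [3 0 3 0]
print(per_row)                   # [2 0 3 3]
```

When a value is tied, the index of its first occurrence is returned.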
17.Sorting and using indices
in:x = np.arange(16)
x
out:array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
# Shuffle it
in:np.random.shuffle(x)
x
out:array([ 9, 3, 14, 11, 8, 6, 1, 15, 5, 13, 10, 2, 7, 12, 0, 4])
# Sort it (x.sort() sorts in place and returns nothing, so display x afterwards)
in:x.sort()
x
out:array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
# How do we sort a two-dimensional array?
in:X =np.random.randint(10,size=(4,4))
X
out:array([[4, 0, 9, 7],
[9, 2, 9, 8],
[7, 5, 6, 8],
[3, 3, 1, 7]])
# Sort within each row
in:np.sort(X,axis = 1)
out:array([[0, 4, 7, 9],
[2, 8, 9, 9],
[5, 6, 7, 8],
[1, 3, 3, 7]])
# Sort within each column
in:np.sort(X,axis = 0)
out:array([[3, 0, 1, 7],
[4, 2, 6, 7],
[7, 3, 9, 8],
[9, 5, 9, 8]])
# argsort returns the indices that would sort the array
in:x = np.array([12, 1, 14, 7, 9, 11, 3, 15, 10, 5, 8, 0, 13, 2, 4, 6])
np.argsort(x)
out:array([11, 1, 13, 6, 14, 9, 15, 3, 10, 4, 8, 5, 0, 12, 2, 7],
dtype=int64)
# argsort also works on two-dimensional arrays (here sorting within each row)
in:X = np.array([[4, 0, 9, 7],
[9, 2, 9, 8],
[7, 5, 6, 8],
[3, 3, 1, 7]])
np.argsort(X,axis=1)
out:array([[1, 0, 3, 2],
[1, 3, 0, 2],
[1, 2, 0, 3],
[2, 0, 1, 3]], dtype=int64)
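What makes argsort useful is that the indices can reorder a second, related array. This is a minimal sketch with made-up scores and names, ranking the names by score and picking the top two:

```python
import numpy as np

scores = np.array([0.3, 0.9, 0.1, 0.7])
names = np.array(["a", "b", "c", "d"])

order = np.argsort(scores)        # indices for ascending order
ranked = names[order[::-1]]       # reverse for highest score first
print(ranked)                     # ['b' 'd' 'a' 'c']

top2 = names[order[-2:][::-1]]    # the two highest-scoring names
print(top2)                       # ['b' 'd']
```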
18.Fancy Indexing
# This one is important!
in:x = np.array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
x[3:9:2]
out:array([3, 5, 7])
in:ind = [3,5,8]
x[ind]
out:array([3, 5, 8])
in:X = x.reshape(4,-1)
X
out:array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
in:row = np.array([0,1,2])
col = np.array([1,2,3])
X[row,col]
out:array([ 1, 6, 11])
# See how it works? The row and column indices are paired up, so this takes the elements of X at (0,1), (1,2) and (2,3)
in:col = [True,False,True,True]
X[1:3,col]
out:array([[ 4, 6, 7],
[ 8, 10, 11]])
# Takes rows 1 and 2 of X; for the columns, True keeps the column and False drops it
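Because index arrays are paired up element by element, X[row, col] does not give the sub-matrix formed by those rows and columns. If the full block is what you want, np.ix_ is one way to get it. A small sketch contrasting the two (block and block2 are my own names):

```python
import numpy as np

X = np.arange(16).reshape(4, 4)
row = np.array([0, 1, 2])
col = np.array([1, 2, 3])

paired = X[row, col]             # elements (0,1), (1,2), (2,3)
block = X[np.ix_(row, col)]      # every combination of row and col, shape (3, 3)
block2 = X[row[:, None], col]    # same block using a column-shaped row index

print(paired)                    # [ 1  6 11]
print(block)
```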
19.Comparing numpy.array
in:x=np.array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
x<3
out:array([ True, True, True, False, False, False, False, False, False,
False, False, False, False, False, False, False])
in:x ==3
out:array([False, False, False, True, False, False, False, False, False,
False, False, False, False, False, False, False])
in:X = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
X<6
out:array([[ True, True, True, True],
[ True, True, False, False],
[False, False, False, False],
[False, False, False, False]])
in:x = np.array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
np.sum(x<=3)
out:4
in:np.count_nonzero(x <=3)
out:4
# Returns True if at least one element equals 0
in:np.any(x == 0)
out:True
# Returns True only if every element is greater than 0 (here x contains a 0, so the result is False)
in:np.all(x>0)
out:False
# x greater than 3 and less than 6
in:x = np.array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
np.sum( (x>3) & (x<6))
out:2
# x greater than 3 or x less than 6
in:np.sum( (x>3) | (x<6))
out:16
# x not equal to 0
in:np.sum(~(x==0))
out:15
# Using a boolean array as an index
in:x[x<5]
out:array([0, 1, 2, 3, 4])
in:x[x%2==0]
out:array([ 0, 2, 4, 6, 8, 10, 12, 14])
in:X = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
X[X[:,3]%3==0,:]
out:array([[ 0, 1, 2, 3],
[12, 13, 14, 15]])
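Boolean masks can be used for assignment as well as selection, and np.where offers two related tools: an element-wise choice between two values, and the positions where a condition holds. This is a short sketch of all three; the threshold 10 and the labels are arbitrary choices of mine:

```python
import numpy as np

x = np.arange(16)

# Assign through a mask: cap every value above 10 at 10.
clipped = x.copy()
clipped[clipped > 10] = 10
print(clipped)

# Three-argument np.where picks a value per element based on the condition.
labels = np.where(x % 2 == 0, "even", "odd")
print(labels[:4])                 # ['even' 'odd' 'even' 'odd']

# One-argument np.where returns the indices where the condition is True.
idx = np.where(x > 12)[0]
print(idx)                        # [13 14 15]
```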
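There is far more to NumPy than this, but these notes cover most of what we use day to day. You don't need to memorize all of it; looking things up as you use them is a good way to make them stick.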