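These notes follow Tang Yudi's course "Python Data Analysis and Machine Learning in Practice" (唐宇迪python数据分析与机器学习实战).
They are meant for my own future review and reference; for a detailed treatment of each numpy operation, I recommend taking the course above.
Code: https://github.com/Jingyaozhou/numpy_introduction.git
Basic operations
Defining a matrix and a vector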
import numpy as np
# define a vector
vertor=np.array([5,10,15,20])
print(vertor)
# define a matrix
matrix=np.array([[5,10,15],[20,25,30]])
# output the shape
print(matrix.shape)
# output the data type
print(matrix.dtype)
output
[ 5 10 15 20]
(2, 3)
int64
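np.array() defines a vector or a matrix
a.shape gives the dimensions of a; for a matrix, the number of rows and columns
a.dtype gives the data type of the elements of a
Data structures
Data types
All elements of a numpy array must share the same data type.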
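The single-dtype rule above can be checked with a small sketch: when the inputs mix types, numpy picks one common dtype for the whole array. The variable names are illustrative and not taken from the course.

```python
import numpy as np

# Mixing ints and a float: every element is upcast to float64
mixed_numbers = np.array([1, 2, 3.0])
print(mixed_numbers.dtype)       # float64

# Mixing ints and a string: every element becomes a string
mixed_text = np.array([1, 2, '3'])
print(mixed_text.dtype.kind)     # U (unicode string) on Python 3
```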
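Slicing
Slicing works much like in MATLAB, with one difference: in Python, 0:3 selects indices 0, 1, 2 (indexing starts at 0 and the end index is excluded).
print(matrix[:,[0,2]])
print(matrix[:,0:2])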
output
[[ 5 15]
[20 30]]
[[ 5 10]
[20 25]]
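To make the half-open range concrete, here is a minimal sketch reusing the 2x3 matrix from above: 0:2 covers columns 0 and 1 only, while a list such as [0,2] picks exactly the columns listed. The names first_two and picked are illustrative.

```python
import numpy as np

matrix = np.array([[5, 10, 15], [20, 25, 30]])

# 0:2 is half-open: columns 0 and 1, the end index 2 is excluded
first_two = matrix[:, 0:2]
# a list of indices selects exactly those columns
picked = matrix[:, [0, 2]]

print(first_two.shape)   # (2, 2)
print(picked.tolist())   # [[5, 15], [20, 30]]
```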
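Broadcasting by default (bsxfun)
numpy's elementwise operations broadcast by default, which corresponds to bsxfun in MATLAB.
vertor==10 # this boolean array can be used as an index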
output
array([False, True, False, False], dtype=bool)
matrix==25 # this boolean array can be used as an index
output
array([[False, False, False],
[False, True, False]], dtype=bool)
A boolean array can be used directly as an index.
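A minimal sketch of both ideas, assuming the same vertor and matrix as above: a scalar or a shorter array is broadcast against the larger one, and the resulting boolean mask selects elements or whole rows. The names mask, shifted, row_mask and selected_rows are illustrative.

```python
import numpy as np

vertor = np.array([5, 10, 15, 20])
matrix = np.array([[5, 10, 15], [20, 25, 30]])

# the scalar 10 is broadcast against every element, giving a boolean mask
mask = vertor == 10
print(vertor[mask])          # [10]

# a length-3 row is broadcast across both rows of the 2x3 matrix
shifted = matrix + np.array([1, 2, 3])
print(shifted.tolist())      # [[6, 12, 18], [21, 27, 33]]

# a mask built from column 1 selects whole rows
row_mask = matrix[:, 1] == 25
selected_rows = matrix[row_mask, :]
print(selected_rows)         # [[20 25 30]]
```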
Numpy matrix basics
Logical operations
print((vertor==10) & (vertor==5))
print((vertor==10) | (vertor==5))
output
[False False False False]
[ True True False False]
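Type conversion
vector1=np.array(['1','2','3'])
print(vector1.dtype)  # string dtype: |S1 on Python 2, <U1 on Python 3
print(vector1)
vector1=vector1.astype(float)  # astype returns a new, converted array
print(vector1)
print(vector1.dtype)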
output
|S1
['1' '2' '3']
[ 1. 2. 3.]
float64
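Two properties of astype() are worth checking in a small sketch: it returns a new array instead of converting in place, and it raises an error when a string cannot be parsed as a number. The variable names are illustrative.

```python
import numpy as np

strings = np.array(['1', '2', '3'])
floats = strings.astype(float)   # new array; strings itself is unchanged

print(floats.dtype)              # float64
print(strings.dtype.kind)        # U on Python 3: still a string array

# A value that is not a number cannot be converted
try:
    np.array(['1', 'abc']).astype(float)
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)         # True
```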
Finding extrema
vertor.min()
output
5
Summation
print(matrix.sum(axis=1))  # sum of each row
print(matrix.sum(axis=0))  # sum of each column
output
[30 75]
[25 35 45]
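The axis argument names the dimension that is collapsed. A minimal sketch with the same matrix: axis=1 sums across the columns of each row, axis=0 sums down the rows of each column, and omitting axis sums every element.

```python
import numpy as np

matrix = np.array([[5, 10, 15], [20, 25, 30]])

row_sums = matrix.sum(axis=1)   # one value per row
col_sums = matrix.sum(axis=0)   # one value per column
total = matrix.sum()            # all elements

print(row_sums)   # [30 75]
print(col_sums)   # [25 35 45]
print(total)      # 105
```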
Common numpy functions
Reshaping a matrix or vector: reshape()
print(np.arange(15))
a=np.arange(15).reshape(3,5)
print(a)
output
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14]
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
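reshape() also accepts -1 for one dimension, which numpy then infers from the total number of elements. A short sketch; the names flat and reshape_failed are illustrative.

```python
import numpy as np

flat = np.arange(15)

# -1 asks numpy to infer that dimension from the total size
a = flat.reshape(3, -1)
print(a.shape)            # (3, 5)

# sizes that do not divide evenly are rejected
try:
    flat.reshape(4, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)     # True
```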
Number of dimensions: ndim
a.ndim # number of dimensions (an attribute, not a method)
output
2
Number of elements: size
a.size
output
15
Element type name as a str: dtype.name
a.dtype.name
output
'int64'
Initialization: zeros(), ones(), arange(), random, linspace()
np.zeros((3,4))
output
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
np.ones((2,3,4),dtype=np.int32)
output
array([[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]], dtype=int32)
np.arange(10,30,5)
output
array([10, 15, 20, 25])
np.random.random((2,3))
output
array([[ 0.60180048, 0.17302083, 0.43838994],
[ 0.43108949, 0.0730318 , 0.58841759]])
np.linspace(0,2*np.pi,100)
output
array([ 0. , 0.06346652, 0.12693304, 0.19039955, 0.25386607,
0.31733259, 0.38079911, 0.44426563, 0.50773215, 0.57119866,
0.63466518, 0.6981317 , 0.76159822, 0.82506474, 0.88853126,
0.95199777, 1.01546429, 1.07893081, 1.14239733, 1.20586385,
1.26933037, 1.33279688, 1.3962634 , 1.45972992, 1.52319644,
1.58666296, 1.65012947, 1.71359599, 1.77706251, 1.84052903,
1.90399555, 1.96746207, 2.03092858, 2.0943951 , 2.15786162,
2.22132814, 2.28479466, 2.34826118, 2.41172769, 2.47519421,
2.53866073, 2.60212725, 2.66559377, 2.72906028, 2.7925268 ,
2.85599332, 2.91945984, 2.98292636, 3.04639288, 3.10985939,
3.17332591, 3.23679243, 3.30025895, 3.36372547, 3.42719199,
3.4906585 , 3.55412502, 3.61759154, 3.68105806, 3.74452458,
3.8079911 , 3.87145761, 3.93492413, 3.99839065, 4.06185717,
4.12532369, 4.1887902 , 4.25225672, 4.31572324, 4.37918976,
4.44265628, 4.5061228 , 4.56958931, 4.63305583, 4.69652235,
4.75998887, 4.82345539, 4.88692191, 4.95038842, 5.01385494,
5.07732146, 5.14078798, 5.2042545 , 5.26772102, 5.33118753,
5.39465405, 5.45812057, 5.52158709, 5.58505361, 5.64852012,
5.71198664, 5.77545316, 5.83891968, 5.9023862 , 5.96585272,
6.02931923, 6.09278575, 6.15625227, 6.21971879, 6.28318531])
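arange() and linspace() specify a range differently: arange() takes a step and excludes the stop value, while linspace() takes the number of points and includes the stop value by default. A minimal sketch with the same arguments as above:

```python
import numpy as np

stepped = np.arange(10, 30, 5)            # step of 5, stop value 30 excluded
print(stepped)                            # [10 15 20 25]

spaced = np.linspace(0, 2 * np.pi, 100)   # 100 points, both ends included
print(spaced.size)                        # 100
print(np.isclose(spaced[-1], 2 * np.pi))  # True
```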
Math operations
A=np.array([[1,1],
[1,0]])
B=np.array([[2,3],
[3,2]])
print(A*B)  # elementwise product
print(np.dot(A,B))  # matrix product
output
[[2 3]
[3 0]]
[[5 5]
[2 3]]
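As the output above shows, * multiplies element by element, whereas np.dot() computes the matrix product. A minimal sketch; the @ operator (Python 3.5+) is an equivalent spelling of the matrix product for 2-D arrays and is not used in the original notes.

```python
import numpy as np

A = np.array([[1, 1], [1, 0]])
B = np.array([[2, 3], [3, 2]])

elementwise = A * B        # A[i, j] * B[i, j]
product = np.dot(A, B)     # rows of A times columns of B

print(elementwise.tolist())            # [[2, 3], [3, 0]]
print(product.tolist())                # [[5, 5], [2, 3]]
print(np.array_equal(product, A @ B))  # True
```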
print(np.exp(B))
print(np.sqrt(B))
output
[[ 7.3890561 20.08553692]
[ 20.08553692 7.3890561 ]]
[[ 1.41421356 1.73205081]
[ 1.73205081 1.41421356]]
floor: round down to the nearest integer
a=np.floor(10*np.random.random((3,4)))
print(a)
output
[[ 1. 2. 9. 7.]
[ 3. 7. 6. 9.]
[ 7. 2. 6. 8.]]
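Matrix operations
Concatenating and splitting matrices
a=np.floor(10*np.random.random((2,2)))
b=np.floor(10*np.random.random((2,2)))
print(a)
print(b)
print(np.hstack((a,b)))  # side by side (more columns)
print(np.vstack((a,b)))  # stacked vertically (more rows)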
output
[[ 4. 4.]
[ 3. 1.]]
[[ 6. 4.]
[ 5. 8.]]
[[ 4. 4. 6. 4.]
[ 3. 1. 5. 8.]]
[[ 4. 4.]
[ 3. 1.]
[ 6. 4.]
[ 5. 8.]]
a=np.floor(10*np.random.random([2,12]))
print(a)
print(np.hsplit(a,3))  # three equal parts
print(np.hsplit(a,(3,5)))  # cut before column 3 and before column 5
output
[[ 8. 0. 1. 6. 4. 4. 8. 7. 5. 2. 8. 5.]
[ 7. 8. 3. 7. 8. 3. 2. 9. 2. 5. 2. 0.]]
[array([[ 8., 0., 1., 6.],
[ 7., 8., 3., 7.]]), array([[ 4., 4., 8., 7.],
[ 8., 3., 2., 9.]]), array([[ 5., 2., 8., 5.],
[ 2., 5., 2., 0.]])]
[array([[ 8., 0., 1.],
[ 7., 8., 3.]]), array([[ 6., 4.],
[ 7., 8.]]), array([[ 4., 8., 7., 5., 2., 8., 5.],
[ 3., 2., 9., 2., 5., 2., 0.]])]
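hsplit() accepts either an integer or a tuple. With an integer it cuts the columns into that many equal parts; with a tuple it cuts before the listed column indices. The sketch below uses a deterministic array instead of random values so the shapes are easy to check; the names equal_parts and uneven_parts are illustrative.

```python
import numpy as np

a = np.arange(24).reshape(2, 12)

# integer: 3 equal pieces of 4 columns each
equal_parts = np.hsplit(a, 3)
print([p.shape for p in equal_parts])    # [(2, 4), (2, 4), (2, 4)]

# tuple: cut before column 3 and before column 5
uneven_parts = np.hsplit(a, (3, 5))
print([p.shape for p in uneven_parts])   # [(2, 3), (2, 2), (2, 7)]

# hstack puts the pieces back together
print(np.array_equal(np.hstack(uneven_parts), a))   # True
```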
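Miscellaneous
Python assignment (very important!): =, view(), copy()
a=np.arange(12)
b=a  # no copy: b is another name for the same array
print(b is a)
b.shape=3,4  # also changes the shape of a
print(a.shape)
print(id(a))
print(id(b))
output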
True
(3, 4)
140565407561648
140565407561648
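c=a.view()  # new array object that shares a's data
print(c is a)
c.shape=2,6  # reshaping the view leaves a's shape unchanged
print(a.shape)
c[0,4]=1234  # writing through the view changes a's data
print(a)
print(id(c))
output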
False
(3, 4)
[[ 0 1 2 3]
[1234 5 6 7]
[ 8 9 10 11]]
140565407562448
d=a.copy()  # new array with its own data
d is a  # False
d[0,0]=999  # does not affect a
print(d)
print(a)
output
[[ 999 1 2 3]
[1234 5 6 7]
[ 8 9 10 11]]
[[ 0 1 2 3]
[1234 5 6 7]
[ 8 9 10 11]]
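The three cases above can be summarized in one sketch: = binds a second name to the same object, view() creates a new array object that shares the same data, and copy() creates a new object with its own data. np.shares_memory() is used here only as a checking aid; it does not appear in the original notes.

```python
import numpy as np

a = np.arange(12)

b = a            # same object
c = a.view()     # new object, shared data
d = a.copy()     # new object, separate data

print(b is a, c is a, d is a)      # True False False
print(np.shares_memory(a, c))      # True
print(np.shares_memory(a, d))      # False

c[0] = 1234      # writes through to a
d[1] = 999       # does not touch a
print(a[:3].tolist())              # [1234, 1, 2]
```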
Finding indices
data=np.sin(np.arange(20).reshape(5,-1))
print(data)
ind=data.argmax(axis=0)  # row index of the maximum in each column
print(ind)
data_max=data[ind,range(data.shape[1])]  # pick the maximum of each column
print(data_max)
output
[[ 0. 0.84147098 0.90929743 0.14112001]
[-0.7568025 -0.95892427 -0.2794155 0.6569866 ]
[ 0.98935825 0.41211849 -0.54402111 -0.99999021]
[-0.53657292 0.42016704 0.99060736 0.65028784]
[-0.28790332 -0.96139749 -0.75098725 0.14987721]]
[2 0 3 1]
[ 0.98935825 0.84147098 0.99060736 0.6569866 ]
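The indexing expression data[ind,range(data.shape[1])] pairs each row index in ind with the matching column index, so it picks one element per column. A minimal sketch confirming that this gives the same result as taking the column-wise maximum directly; the name cols is illustrative.

```python
import numpy as np

data = np.sin(np.arange(20).reshape(5, -1))

ind = data.argmax(axis=0)               # row index of the max in each column
cols = np.arange(data.shape[1])         # column indices 0..3

# element k of the result is data[ind[k], cols[k]]
data_max = data[ind, cols]

print(ind.tolist())                                  # [2, 0, 3, 1]
print(np.array_equal(data_max, data.max(axis=0)))    # True
```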
Tiling (repeating an array)
a=np.arange(0,40,10)
print(a)
b=np.tile(a,(2,2))
print(b)
output
[ 0 10 20 30]
[[ 0 10 20 30 0 10 20 30]
[ 0 10 20 30 0 10 20 30]]
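In np.tile(a,(2,2)) the tuple gives the number of repetitions along each axis; the 1-D input is treated as a single row, so the result has 2 rows and 2*4 columns. A short sketch, with c as an illustrative extra case:

```python
import numpy as np

a = np.arange(0, 40, 10)

b = np.tile(a, (2, 2))   # 2 repeats down, 2 repeats across
print(b.shape)           # (2, 8)

c = np.tile(a, 3)        # a single number repeats along the last axis
print(c.tolist())        # [0, 10, 20, 30, 0, 10, 20, 30, 0, 10, 20, 30]
```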
Sorting
a=np.array([4,3,1,2])
j=np.argsort(a)  # indices that would sort a
print(a[j])
output
[1 2 3 4]
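argsort() returns the indices that would sort the array rather than the sorted values, which is useful for reordering a second array in the same way. A sketch; the labels array is a hypothetical example, not from the course.

```python
import numpy as np

a = np.array([4, 3, 1, 2])
labels = np.array(['d', 'c', 'a', 'b'])   # hypothetical companion array

j = np.argsort(a)
print(j.tolist())           # [2, 3, 1, 0]
print(a[j].tolist())        # [1, 2, 3, 4]
print(labels[j].tolist())   # ['a', 'b', 'c', 'd']

# np.sort returns the sorted values directly
print(np.sort(a).tolist())  # [1, 2, 3, 4]
```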