NumPy: a workhorse for numerical data
Import the numpy library
import numpy as np
Check the installed numpy version
np.__version__
'1.21.2'
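The core numpy type is the ndarray (n-dimensional array). An ndarray is not a true matrix type (for example, * multiplies element by element), but this tutorial treats every ndarray as a matrix.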
I. Creating an ndarray
1. From a Python list
l = [1, 2, 3, 4]
n = np.array(l)
n
array([1, 2, 3, 4])
n[0] = 8
n
array([8, 2, 3, 4])
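Note:
- By default, all elements of a numpy ndarray have the same type
- If the list passed in mixes types, every element is converted to one common type, with priority str > float > int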
np.array([1,2,3.9,'4.5']) # str > float > int
array(['1', '2', '3.9', '4.5'], dtype='<U32')
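The conversion priority above can be checked by inspecting dtype. This is a small sketch; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np

ints = np.array([1, 2, 3])              # all ints -> an integer dtype
mixed = np.array([1, 2, 3.9])           # int + float -> float64
texts = np.array([1, 2, 3.9, "4.5"])    # anything + str -> unicode string dtype

print(ints.dtype.kind, mixed.dtype, texts.dtype.kind)

# Getting numbers back out of the string array needs an explicit cast.
numbers = texts.astype(np.float64)
print(numbers)
```

Because a single string silently turns the whole array into text, it is worth checking dtype after building an array from mixed input.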
2. Using np routine functions
ones
Creates an ndarray filled with 1s; the default dtype is float64
- np.ones(shape, dtype=None, order='C')
np.ones(shape=(4,5))
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
zeros
Creates an ndarray filled with 0s
- np.zeros(shape, dtype=float, order='C')
np.zeros(shape=(3,4,5)) # 3 blocks of 4 rows by 5 columns
array([[[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]]])
np.zeros(shape=(8,9)) # 8 rows, 9 columns
array([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.]])
full
Fills the array with a specified value
- np.full(shape, fill_value, dtype=None, order='C')
np.full(shape=(3,4), fill_value=8, dtype=np.float64)
array([[8., 8., 8., 8.],
[8., 8., 8., 8.],
[8., 8., 8., 8.]])
eye
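Creates a 2-D ndarray with 1s on the main diagonal and 0s everywhere else.
N rows by M columns, with M defaulting to N; k shifts the diagonal: a positive k moves it up, a negative k moves it down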
- np.eye(N, M=None, k=0, dtype=float)
np.eye(4, 5)
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.]])
np.eye(8, k=-1)
array([[0., 0., 0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 0.]])
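A matrix with the same number of rows and columns is called a square matrix. A square matrix with 1s on the main diagonal and 0s elsewhere is called the identity matrix. The identity matrix plays the role that 1 plays for real numbers: multiplying any compatible matrix by it returns that matrix unchanged.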
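The identity property can be verified directly. The sketch below uses np.dot, which is introduced later in this tutorial, and a small illustrative matrix a.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)      # a 2x3 matrix
left = np.dot(np.eye(2), a)         # 2x2 identity times 2x3
right = np.dot(a, np.eye(3))        # 2x3 times 3x3 identity

# Both products reproduce a (as floats, because np.eye returns float64).
print(np.array_equal(left, a), np.array_equal(right, a))
```

Note that the size of the identity matrix must match the side it multiplies from: 2x2 on the left, 3x3 on the right.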
linspace
Splits a range into evenly spaced values
- np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
np.linspace(0, 100, 20)
array([ 0. , 5.26315789, 10.52631579, 15.78947368,
21.05263158, 26.31578947, 31.57894737, 36.84210526,
42.10526316, 47.36842105, 52.63157895, 57.89473684,
63.15789474, 68.42105263, 73.68421053, 78.94736842,
84.21052632, 89.47368421, 94.73684211, 100. ])
np.linspace(0, 100, num=20, endpoint=False)
array([ 0., 5., 10., 15., 20., 25., 30., 35., 40., 45., 50., 55., 60.,
65., 70., 75., 80., 85., 90., 95.])
np.linspace(0, 100, num=20, endpoint=False, retstep=True)
(array([ 0., 5., 10., 15., 20., 25., 30., 35., 40., 45., 50., 55., 60.,
65., 70., 75., 80., 85., 90., 95.]),
5.0)
arange
- np.arange([start, ]stop, [step, ]dtype=None)
# Works like Python's range, but returns an ndarray
np.arange(0, 100)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
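When step is not an integer, prefer np.linspace: with a floating-point step, rounding error can change the number of elements that arange returns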
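A short comparison makes the advice concrete. The endpoints 1 and 1.3 are chosen only for illustration; the length of the arange result depends on floating-point rounding, so it is printed here rather than stated.

```python
import numpy as np

# With a fractional step, rounding decides whether the last value is
# included, so the length is hard to predict.
by_step = np.arange(1, 1.3, 0.1)

# linspace takes the number of points directly, so the length is exact
# and the last point is exactly the stop value.
by_count = np.linspace(1, 1.3, num=4)

print(len(by_step), by_step)
print(len(by_count), by_count)
```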
random
- np.random.randint(low, high=None, size=None, dtype=int)
# ndarray of random integers drawn from the half-open interval [low, high)
np.random.randint(0, 150, size=(4,6,3) )
array([[[ 67, 26, 146],
[ 34, 42, 78],
[ 38, 137, 36],
[ 11, 20, 128],
[126, 104, 8],
[ 2, 71, 59]],
[[146, 68, 139],
[126, 3, 120],
[ 77, 130, 145],
[119, 88, 70],
[ 34, 104, 92],
[ 60, 86, 24]],
[[ 25, 84, 121],
[ 39, 89, 144],
[ 97, 49, 138],
[ 69, 125, 99],
[ 60, 74, 9],
[ 62, 36, 27]],
[[120, 7, 103],
[132, 113, 67],
[ 18, 65, 9],
[116, 63, 81],
[ 24, 84, 80],
[133, 117, 117]]])
- np.random.randn(d0, d1, …, dn)
Standard normal distribution
# A normal distribution with mean 0 and variance 1 is called the standard normal distribution
np.random.randn(4, 5, 3,6)
array([[[[ 3.60896057e-01, 1.23980560e-01, 1.37519592e+00,
1.06533135e+00, -8.35433298e-01, -1.52779526e+00],
[-2.09415980e+00, 2.30864644e+00, -2.59418955e-01,
1.46912849e+00, 7.24885652e-01, -5.93176814e-01],
[ 8.22369584e-01, -3.94825830e-01, 7.73009631e-01,
-1.02187472e+00, 1.56807963e+00, -1.09462136e+00]],
[[ 1.26279597e+00, 2.48090559e-01, -1.70231285e+00,
-6.47952721e-01, 7.77604101e-01, 3.85913462e-01],
[-7.93193217e-01, -4.95194049e-01, -3.89481018e-01,
3.94739260e-01, -1.34159405e+00, -4.94484424e-01],
[-1.64441353e-01, -3.91084313e-01, -8.61262242e-01,
-3.38360993e-01, 1.68766962e+00, 4.14999509e-01]],
[[ 1.06238939e+00, 4.73559653e-01, 8.92111604e-01,
-6.69835308e-01, -1.30740571e-01, 2.92519486e-01],
[-1.54260281e+00, 1.04506055e+00, -1.52782308e+00,
5.43280242e-01, 7.25304358e-01, 1.08721075e+00],
[-6.95519417e-01, 1.33071212e+00, -5.23520898e-01,
1.08361779e-01, -5.07076415e-01, -5.62254578e-01]],
[[-1.67084570e+00, -4.27674323e-01, 3.66569193e-01,
-5.31608548e-01, 1.34716890e+00, -1.45857954e+00],
[-1.33056590e+00, -8.88246047e-01, 7.45925119e-01,
-8.61176518e-01, -3.59061524e-02, 1.41474295e+00],
[-5.86663731e-01, 2.09535207e-01, -1.94074953e+00,
1.44544072e+00, 1.03266146e+00, -6.86201237e-01]],
[[ 4.53487702e-01, -8.77835365e-02, -1.68401428e+00,
-4.00786578e-01, -1.17030967e+00, -1.04548773e+00],
[ 4.45880286e-01, 2.03892073e+00, -6.20225131e-01,
1.06125730e+00, -1.92812916e+00, -3.53081752e-01],
[ 3.83867909e-01, -1.35043486e+00, 9.55760858e-02,
2.36644094e-01, 8.94760214e-01, 1.32577371e+00]]],
[[[ 3.41774768e-01, -5.07251722e-01, 6.47109094e-01,
1.49127934e+00, 5.86152309e-01, 4.72069219e-01],
[ 7.00466672e-01, 3.64796859e-01, 1.39140325e+00,
-6.92112895e-01, -1.64790762e+00, -7.84703538e-01],
[ 4.20582525e-01, 1.74007064e+00, -5.47389642e-01,
7.45009069e-01, 7.85491916e-01, -5.65628010e-02]],
[[-1.09574102e+00, -9.89377263e-02, -5.60805984e-01,
3.19714353e-01, 6.54734867e-01, 5.88010175e-01],
[ 7.67646148e-01, -1.48458591e+00, 2.53820658e-01,
-1.55131464e-01, -4.66670151e-01, -9.74345388e-01],
[-6.30116378e-01, -1.16017674e-01, -2.57644874e-01,
-1.05609469e+00, 7.15304230e-01, -2.36875019e-03]],
[[-1.68874591e+00, -9.25072607e-02, 2.64519122e-01,
-4.95177820e-01, 4.75458971e-01, -1.52298033e+00],
[ 2.04787909e-01, 1.00570251e+00, -8.33600974e-01,
8.87137367e-01, 2.06950466e-01, -1.58276796e+00],
[ 2.25577683e-02, -8.88912336e-01, -8.14031401e-02,
-1.87347545e-01, -9.13539778e-01, 6.49910092e-01]],
[[-3.90855074e-01, -7.54551181e-01, -4.09166816e-01,
-3.20466207e-01, -1.71886024e-01, 1.98676750e+00],
[-1.13942835e+00, 2.17108400e-01, 1.43636425e+00,
7.30069588e-02, -4.01371010e-01, 3.42695061e-02],
[-3.08407225e-01, -1.34765866e+00, 7.62957514e-01,
1.02028976e+00, -5.17721304e-01, 2.05465465e+00]],
[[ 2.88877684e-01, -1.23720613e+00, 1.35743090e+00,
7.56117379e-01, 1.00806657e+00, -1.99802096e-01],
[ 2.37915448e-02, -8.16784340e-01, -8.01664376e-01,
3.43102836e-01, -5.10517741e-01, -1.11172049e+00],
[-8.82476315e-01, 5.45329811e-01, 8.11862987e-01,
-2.20628912e+00, 7.31520974e-01, 1.65758152e-01]]],
[[[-7.83560920e-01, -1.07341259e+00, -1.43857260e+00,
-4.99204048e-01, 5.90535792e-01, -9.38809511e-02],
[ 1.53623097e-01, 6.98306624e-02, 1.16393476e+00,
1.11688799e+00, -4.60762440e-01, -7.83346388e-01],
[ 2.39507032e+00, 2.22173265e+00, 1.46713573e+00,
1.52938356e-01, 1.11735263e+00, -1.62251923e+00]],
[[ 2.03821895e-01, -6.48150614e-01, -5.63266733e-01,
1.26326842e+00, 1.99513121e+00, 3.63142761e-01],
[-1.46342808e+00, 3.37923031e-01, 9.34881130e-01,
4.02548825e-01, -2.23021532e-01, -4.10635526e-01],
[-6.73256314e-01, 7.41047855e-01, 4.46527638e-01,
2.53437353e-02, 1.42617770e+00, 6.92066347e-03]],
[[-6.20812478e-01, -1.01847913e+00, 1.20138769e+00,
-1.25566447e-01, 5.33581158e-01, 2.70359161e-01],
[ 2.94938800e-01, 2.45388819e-01, 6.79539342e-01,
-7.43358208e-01, 1.02354308e-01, 2.05836865e-01],
[-8.16649205e-01, 7.04232499e-01, 4.89938147e-01,
4.49427494e-03, 8.80719114e-01, -7.19515454e-01]],
[[-7.95980534e-01, -4.10690579e-01, 1.05298217e-02,
1.52497635e-01, -5.34085341e-01, 1.40078164e+00],
[-7.67609740e-01, -3.84409263e-01, -5.36989131e-02,
1.21021983e+00, 4.53144185e-01, 9.87691721e-01],
[-8.24691074e-01, 3.55140028e-01, -1.65582483e+00,
-1.39084153e+00, -1.05462704e+00, -2.46805400e+00]],
[[-1.14399395e-01, 9.38177343e-01, 9.91415690e-02,
-1.55338969e+00, -1.96468812e-01, 6.47261682e-01],
[-3.64097218e-01, -5.56977543e-01, 7.64683972e-01,
-4.13289450e-01, 3.78971775e-01, -1.06952959e-01],
[-2.22163641e+00, -1.91052790e+00, -7.86750104e-01,
3.78258457e-01, 4.39593652e-01, -2.52363189e-01]]],
[[[ 1.37491656e+00, 4.14083479e-01, 1.57700423e-01,
1.13358678e+00, 5.19422856e-01, 1.27082329e+00],
[ 5.23277884e-01, -1.88789860e+00, 2.14678064e-01,
-1.06686611e+00, 3.30591333e-01, 3.35493178e-01],
[-1.59169189e+00, 1.57069283e-01, -1.36289299e+00,
-1.11746383e+00, -3.11081649e-01, 5.61100542e-02]],
[[ 1.28080013e-01, 6.27032443e-01, 1.79467114e+00,
-1.57764026e-01, 2.85818555e-02, 1.88858364e-01],
[ 1.11731485e+00, 5.98713700e-01, 1.74348920e+00,
1.16849350e+00, 4.96488519e-01, -9.03236350e-01],
[-2.03886998e+00, 1.77337731e+00, -2.47554042e-01,
-1.34062951e-01, 6.77553690e-01, -8.51806299e-01]],
[[ 1.65154731e-01, 1.58660420e+00, -5.96123417e-01,
4.14781612e-01, -2.86112082e-01, -4.58491723e-01],
[ 2.10942628e-01, 4.49036485e-01, -5.81807008e-01,
-8.88261103e-01, 1.10540004e+00, -1.25490150e-01],
[ 1.00156719e+00, 1.43448123e+00, 5.26386450e-01,
1.07722843e-01, 9.33027244e-01, 2.94267852e-01]],
[[-8.86839814e-01, 1.71618150e+00, 9.59817415e-02,
-8.11726469e-01, 1.58823091e+00, 7.83397973e-01],
[-6.11941171e-01, -2.15122257e-01, 1.35006738e+00,
9.02740628e-01, 1.78067992e+00, 5.92182215e-01],
[-5.49519655e-02, -2.28827718e+00, -3.04005850e-02,
1.74072099e-02, -3.10340198e-01, 9.40272954e-01]],
[[ 2.14724894e-01, 5.78167971e-02, 1.75518357e-01,
1.63220992e+00, 1.43465118e+00, 3.25344887e-01],
[ 5.26788696e-01, -4.05292566e-01, -5.76770605e-01,
1.01405120e+00, -5.77664768e-01, -6.61999762e-01],
[ 1.33467234e+00, -1.12586169e-01, 6.19579892e-01,
1.54839041e+00, -4.48624740e-01, 7.62879741e-01]]]])
- np.random.normal(loc=0.0, scale=1.0, size=None)
# Normal distribution with mean loc and standard deviation scale
np.random.normal(loc=10, scale=3, size=(10,10))
array([[ 5.39542569, 8.02174761, 10.7922298 , 9.2532619 , 3.54380751,
10.13521091, 8.38260486, 6.46633704, 7.34526588, 8.47834818],
[17.5342105 , 13.64568401, 5.53525061, 4.72069144, 12.18728018,
11.83398128, 11.68685692, 11.83339777, 8.96921638, 7.88365318],
[13.97898041, 15.51185662, 10.45825015, 9.01183143, 7.22653811,
13.81526409, 7.98361056, 11.16795462, 10.49929 , 12.36235959],
[14.16381173, 12.9396122 , 8.68977235, 10.9182213 , 10.3120272 ,
6.8344999 , 11.6365478 , 7.93762906, 8.44988529, 8.63018719],
[ 5.84298216, 7.6348048 , 8.87944357, 12.84379245, 9.34660486,
11.78931125, 7.97930487, 8.84437005, 10.94180668, 10.09415282],
[ 9.7141852 , 7.20038966, 12.70328782, 9.70684455, 10.26733188,
8.07187687, 7.93802441, 11.7795841 , 5.53032143, 6.41289191],
[14.8277533 , 7.68244359, 7.18880075, 15.60258849, 6.53135817,
8.04664035, 8.63569145, 11.64859642, 14.62059213, 9.22200238],
[ 9.3758629 , 7.01491878, 5.57777986, 10.29400291, 6.30047501,
12.41720938, 6.55191212, 11.78087737, 10.64869112, 16.65177931],
[ 9.33821451, 8.65681327, 13.70652152, 11.40453116, 10.1199998 ,
9.83212157, 8.49217612, 7.53348624, 11.87535779, 11.70440834],
[ 9.99855802, 14.84375394, 7.1265845 , 14.09313878, 9.28822027,
10.36981941, 12.23054785, 10.02317439, 6.94238958, 7.11836409]])
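Random output differs on every run, which makes it hard to compare against the values printed above. A possible way to get repeatable draws is to set a seed first; the seed value 42 and the sample size below are arbitrary choices for this sketch.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, chosen only for this illustration
sample = np.random.normal(loc=10, scale=3, size=100_000)

# With many samples, the mean and standard deviation approach loc and scale.
print(sample.mean(), sample.std())

np.random.seed(42)  # the same seed reproduces the same numbers
again = np.random.normal(loc=10, scale=3, size=100_000)
print(np.array_equal(sample, again))
```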
- np.random.random(size=None)
Generates random floats in the half-open interval [0, 1)
np.random.random(size=(4,5,6))
array([[[0.99217627, 0.26793299, 0.99394364, 0.22054 , 0.32580602,
0.30809971],
[0.33071785, 0.95429916, 0.01986428, 0.17855457, 0.21304299,
0.86083271],
[0.89792383, 0.32497107, 0.98978694, 0.08083149, 0.24647395,
0.34451748],
[0.88247103, 0.98851495, 0.89866648, 0.9657081 , 0.25398618,
0.13208125],
[0.41249406, 0.00770894, 0.70389634, 0.5631016 , 0.42096299,
0.01012447]],
[[0.61508379, 0.23403205, 0.77943065, 0.92925391, 0.82884435,
0.04831943],
[0.7920922 , 0.52853936, 0.93270148, 0.68324617, 0.09686606,
0.47118347],
[0.31923095, 0.19785272, 0.24865662, 0.51980656, 0.94891209,
0.90589559],
[0.54273565, 0.16386834, 0.32463571, 0.55039185, 0.97763211,
0.38729463],
[0.11051292, 0.76734635, 0.58407042, 0.14009032, 0.8001273 ,
0.56089232]],
[[0.88793242, 0.03228628, 0.41404253, 0.11743264, 0.07310751,
0.41365707],
[0.18977164, 0.43740961, 0.86130624, 0.13060349, 0.25103944,
0.29168853],
[0.36567411, 0.84418391, 0.25043585, 0.99249934, 0.37624629,
0.30919204],
[0.76225959, 0.38748963, 0.1118659 , 0.4372513 , 0.0449969 ,
0.71026372],
[0.87326398, 0.3929973 , 0.05067703, 0.90708559, 0.62357069,
0.61490996]],
[[0.22115718, 0.73752451, 0.14201089, 0.37996379, 0.30722891,
0.48798515],
[0.99082144, 0.98968165, 0.08893685, 0.67823797, 0.55184288,
0.95447591],
[0.85650586, 0.57701979, 0.13420377, 0.39845529, 0.38013019,
0.51578965],
[0.03731075, 0.45582031, 0.87441774, 0.06283706, 0.09403936,
0.59231378],
[0.71727284, 0.07507806, 0.82967436, 0.97591238, 0.5106871 ,
0.97020674]]])
# Same as np.random.random, except that the shape is passed as separate arguments instead of a tuple
np.random.rand(4,5,6)
array([[[0.84920815, 0.06115048, 0.52784074, 0.11631164, 0.0794115 ,
0.75116528],
[0.75402408, 0.48729073, 0.13375631, 0.86670798, 0.96985989,
0.9794771 ],
[0.57543737, 0.25564423, 0.89411921, 0.94581208, 0.17600705,
0.69545504],
[0.94338681, 0.72406681, 0.28872745, 0.68223263, 0.00306791,
0.66942894],
[0.38889156, 0.70442923, 0.28233602, 0.92436187, 0.18564818,
0.25364427]],
[[0.6616453 , 0.49884097, 0.8439916 , 0.47896449, 0.75189562,
0.26884102],
[0.40731118, 0.12380244, 0.98558809, 0.96132313, 0.30471305,
0.94259421],
[0.17892769, 0.07971804, 0.95436809, 0.01361903, 0.74291155,
0.2379787 ],
[0.5704826 , 0.14287269, 0.48120187, 0.82985308, 0.41479308,
0.86155285],
[0.3372706 , 0.55285293, 0.29899224, 0.99641843, 0.94941544,
0.63249762]],
[[0.27739159, 0.18493944, 0.78269893, 0.67627045, 0.86402007,
0.06197273],
[0.8377913 , 0.87775834, 0.33398251, 0.0789392 , 0.22013998,
0.54699295],
[0.05834604, 0.86079236, 0.88953863, 0.39126594, 0.78479682,
0.73787922],
[0.25746483, 0.88599802, 0.02801758, 0.24717032, 0.96160383,
0.01207161],
[0.43439863, 0.93908126, 0.43533969, 0.0953498 , 0.82138681,
0.69955872]],
[[0.36159885, 0.1267902 , 0.35807791, 0.23496393, 0.8358155 ,
0.49169261],
[0.36695394, 0.89923166, 0.80990249, 0.79827866, 0.32504165,
0.6796608 ],
[0.75830213, 0.70748715, 0.74876156, 0.30852763, 0.64613042,
0.62289855],
[0.50987922, 0.25487361, 0.67154354, 0.97570581, 0.15335018,
0.56862394],
[0.81221987, 0.92852551, 0.15420147, 0.9900146 , 0.8083351 ,
0.96861128]]])
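II. ndarray attributes
Four attributes worth memorizing:
- ndim: number of dimensions
- shape: the shape (length of each dimension)
- size: total number of elements
- dtype: element type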
n = np.random.randint(0,150,size=(4,6))
n
array([[127, 119, 139, 125, 100, 87],
[ 93, 34, 147, 6, 21, 11],
[ 45, 1, 111, 93, 98, 113],
[ 25, 128, 61, 58, 119, 99]])
n.ndim
2
n.shape
(4, 6)
n.size # number of elements
24
n.dtype
dtype('int32')
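The four attributes are related: size is always the product of the entries in shape, and ndim is the length of shape. A minimal sketch with an illustrative 3-D array (the dtype is set explicitly because the default integer type differs between platforms):

```python
import numpy as np

cube = np.zeros((3, 4, 5), dtype=np.int64)

print(cube.ndim)    # number of axes
print(cube.shape)   # length of each axis
print(cube.size)    # total number of elements
print(cube.dtype)   # element type

# ndim is len(shape); size is the product of the shape entries.
consistent = cube.ndim == len(cube.shape) and cube.size == 3 * 4 * 5
print(consistent)
```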
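III. Basic ndarray operations
1. Indexing
In one dimension, indexing works the same way as for a list
Multiple dimensions follow the same pattern
One-dimensional indexing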
l = [1,2,3,4,5]
n = np.array(l)
n
array([1, 2, 3, 4, 5])
n[0]
1
Multi-dimensional indexing
n = np.random.randint(0,150, size=(4,5))
n
array([[ 4, 116, 57, 144, 123],
[ 73, 7, 31, 40, 4],
[ 58, 73, 119, 41, 139],
[144, 43, 97, 7, 82]])
n[0][0]
4
# Recommended form
n[0, 0]
4
Modifying data by index
n = np.random.randint(0,100, size=(4,5,3))
n
array([[[ 5, 5, 18],
[ 5, 69, 94],
[76, 18, 21],
[61, 93, 89],
[13, 52, 17]],
[[ 8, 77, 58],
[59, 70, 31],
[15, 29, 3],
[89, 94, 43],
[99, 96, 27]],
[[78, 73, 55],
[24, 85, 87],
[51, 38, 21],
[34, 47, 38],
[69, 34, 65]],
[[66, 76, 7],
[ 2, 25, 50],
[22, 38, 16],
[41, 22, 70],
[29, 68, 12]]])
n[2,2,1] = 8
n[2,2,1]
8
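2. Slicing
In one dimension, slicing works the same way as for a list
Multiple dimensions follow the same pattern
One-dimensional slicing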
l = [1,2,3,4,5,6]
n = np.array(l)
n
array([1, 2, 3, 4, 5, 6])
n[1:4]
array([2, 3, 4])
Multi-dimensional slicing
n = np.random.randint(0,100, size=(4,6))
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
n[1:3, 2:4]
array([[25, 67],
[18, 30]])
Reversing data
Reversing a string
s = 'abcdef'
s[::-1]
'fedcba'
Reversing a two-dimensional matrix
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
n[::-1, ::-1]
array([[27, 3, 44, 14, 89, 10],
[39, 9, 30, 18, 71, 46],
[89, 80, 67, 25, 95, 68],
[49, 43, 91, 8, 21, 20]])
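Each axis can also be reversed on its own, since every axis gets its own slice. The sketch below uses a small illustrative array so the effect is easy to follow by eye.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

rows_flipped = grid[::-1]           # reverse axis 0 only (row order)
cols_flipped = grid[:, ::-1]        # reverse axis 1 only (column order)
both_flipped = grid[::-1, ::-1]     # reverse both axes, as in the text

print(rows_flipped)
print(cols_flipped)
print(both_flipped)
```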
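3. reshape
Use reshape to change the shape. np.reshape takes the new shape as a tuple; the ndarray.reshape method accepts either a tuple or separate integers.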
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
n.shape
(4, 6)
# reshape returns an array with the new shape; n itself keeps its shape
n.reshape((6,4))
array([[20, 21, 8, 91],
[43, 49, 68, 95],
[25, 67, 80, 89],
[46, 71, 18, 30],
[ 9, 39, 10, 89],
[14, 44, 3, 27]])
n.reshape(3,8)
array([[20, 21, 8, 91, 43, 49, 68, 95],
[25, 67, 80, 89, 46, 71, 18, 30],
[ 9, 39, 10, 89, 14, 44, 3, 27]])
np.reshape(n, (3,8))
array([[20, 21, 8, 91, 43, 49, 68, 95],
[25, 67, 80, 89, 46, 71, 18, 30],
[ 9, 39, 10, 89, 14, 44, 3, 27]])
n.shape
(4, 6)
Note: reshape must keep the total number of elements unchanged
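The element-count rule can be seen in both directions: passing -1 for one axis asks numpy to infer its length from the count, and a shape with the wrong count raises ValueError. A small sketch with an illustrative 24-element array:

```python
import numpy as np

flat = np.arange(24)

# -1 lets numpy infer this axis from the element count: 24 / 4 = 6.
auto = flat.reshape(4, -1)
print(auto.shape)

try:
    flat.reshape(5, 5)              # 25 != 24, so this must fail
except ValueError as err:
    failed = True
    print("reshape failed:", err)
else:
    failed = False
```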
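4. concatenate
- np.concatenate()
Points to note when concatenating:
- The arrays are passed as one sequence, so wrap them in square brackets or parentheses
- The arrays must have the same number of dimensions
- The shapes must match on every axis except the one being joined
- [Important] By default, concatenation runs along the axis given by the first value of the shape tuple (axis=0)
- The axis parameter changes the concatenation direction
# axis is the singular form; the plural is axes
# axis=0 joins along the rows (vertical concatenation); axis=1 joins along the columns (horizontal concatenation)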
n1 = np.random.randint(0,100, size=(4,5))
n2 = np.random.randint(0,100, size=(4,5))
display(n1,n2)
np.concatenate((n1,n2), axis=0)
array([[65, 37, 41, 43, 80],
[77, 28, 44, 32, 14],
[25, 54, 17, 13, 25],
[31, 89, 69, 24, 5]])
array([[81, 98, 70, 3, 24],
[56, 54, 41, 5, 41],
[46, 5, 34, 37, 51],
[51, 4, 74, 69, 70]])
array([[65, 37, 41, 43, 80],
[77, 28, 44, 32, 14],
[25, 54, 17, 13, 25],
[31, 89, 69, 24, 5],
[81, 98, 70, 3, 24],
[56, 54, 41, 5, 41],
[46, 5, 34, 37, 51],
[51, 4, 74, 69, 70]])
# Horizontal concatenation
n1 = np.random.randint(0,100, size=(4,5))
n2 = np.random.randint(0,100, size=(4,5))
display(n1,n2)
np.concatenate((n1,n2), axis=1)
array([[22, 2, 5, 59, 99],
[49, 90, 57, 59, 78],
[47, 47, 41, 40, 47],
[72, 43, 5, 45, 5]])
array([[67, 27, 70, 4, 59],
[74, 86, 89, 88, 47],
[46, 13, 19, 11, 28],
[64, 83, 61, 74, 45]])
array([[22, 2, 5, 59, 99, 67, 27, 70, 4, 59],
[49, 90, 57, 59, 78, 74, 86, 89, 88, 47],
[47, 47, 41, 40, 47, 46, 13, 19, 11, 28],
[72, 43, 5, 45, 5, 64, 83, 61, 74, 45]])
- np.hstack and np.vstack
Horizontal and vertical concatenation; these functions choose the axis themselves
# h = horizontal
np.hstack((n1, n2))
array([[22, 2, 5, 59, 99, 67, 27, 70, 4, 59],
[49, 90, 57, 59, 78, 74, 86, 89, 88, 47],
[47, 47, 41, 40, 47, 46, 13, 19, 11, 28],
[72, 43, 5, 45, 5, 64, 83, 61, 74, 45]])
# v=vertical
np.vstack((n1,n2))
array([[22, 2, 5, 59, 99],
[49, 90, 57, 59, 78],
[47, 47, 41, 40, 47],
[72, 43, 5, 45, 5],
[67, 27, 70, 4, 59],
[74, 86, 89, 88, 47],
[46, 13, 19, 11, 28],
[64, 83, 61, 74, 45]])
n1 = np.random.randint(0,100, size=(4,3))
n2 = np.random.randint(0,100, size=(4,6))
display(n1,n2)
array([[ 7, 35, 98],
[14, 45, 43],
[ 4, 96, 29],
[17, 69, 34]])
array([[49, 96, 54, 10, 94, 69],
[68, 5, 98, 11, 17, 11],
[20, 41, 61, 90, 19, 6],
[35, 12, 81, 27, 13, 79]])
np.vstack((n1,n2))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Users\ADMINI~1\AppData\Local\Temp\1/ipykernel_5300/2786370306.py in <module>
----> 1 np.vstack((n1,n2))
<__array_function__ internals> in vstack(*args, **kwargs)
c:\program files\python39\lib\site-packages\numpy\core\shape_base.py in vstack(tup)
280 if not isinstance(arrs, list):
281 arrs = [arrs]
--> 282 return _nx.concatenate(arrs, 0)
283
284
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 6
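Basic requirement for concatenation: arrays stacked vertically must have the same number of columns, and arrays joined horizontally must have the same number of rows.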
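The two arrays from the failing example have shapes (4, 3) and (4, 6): the row counts agree but the column counts do not. The sketch below uses stand-in arrays with the same shapes to show that joining them horizontally works while stacking them vertically fails.

```python
import numpy as np

narrow = np.zeros((4, 3))
wide = np.ones((4, 6))

# Same number of rows, so joining side by side works.
side_by_side = np.hstack((narrow, wide))
print(side_by_side.shape)

# Different number of columns, so stacking vertically raises ValueError.
try:
    np.vstack((narrow, wide))
    stacked = True
except ValueError:
    stacked = False
print(stacked)
```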
5. Splitting
As with concatenation, three functions do the splitting:
- np.split
- np.vsplit
- np.hsplit
n = np.random.randint(0,100, size=(6,6))
n
# Each piece is a half-open range of indices; axis=0 cuts along the rows
np.split(n, [2, 4], axis=0)
np.split(n, [2, 4], axis=1)
# axis=1
np.hsplit(n, [2,4])
# axis=0: cuts along the rows
np.vsplit(n, [2,4])
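The calls above print nothing in this export, so here is a sketch of what the pieces look like. It uses np.arange(36) instead of random values so the result is predictable; the variable names are illustrative.

```python
import numpy as np

table = np.arange(36).reshape(6, 6)

# Indices [2, 4] cut before row 2 and before row 4: rows 0-1, 2-3, 4-5.
top, middle, bottom = np.split(table, [2, 4], axis=0)
# hsplit applies the same indices to the columns (axis=1).
left, centre, right = np.hsplit(table, [2, 4])

print(top.shape, left.shape)
# Concatenating the pieces restores the original array.
print(np.array_equal(np.vstack((top, middle, bottom)), table))
```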
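6. Copies
Assignment never copies any element of an ndarray: the new name refers to the same object, so changes made through it also affect the original.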
n
n2 = n
display(n, n2)
n2[0,0] = 88
n2
n
id(n)
2382274309744
id(n2)
2382274309744
Use copy() to create an independent copy
n3 = n.copy()
id(n3)
2382274565392
id(n)
2382274309744
n3[0,0] = 108
n3
array([[108, 21, 8, 91, 43, 49],
[ 68, 95, 25, 67, 80, 89],
[ 46, 71, 18, 30, 9, 39],
[ 10, 89, 14, 44, 3, 27]])
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
n.view()
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
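Plain assignment, view() and copy() behave differently, and the difference is easier to see side by side. A minimal sketch with illustrative names: an alias is the same object, a view is a new array object over the same data, and a copy owns its own data.

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])
alias = original            # same object, nothing is copied
window = original.view()    # new array object, same underlying data
clone = original.copy()     # new array object, new data

alias[0, 0] = 88            # visible through original and window
clone[0, 1] = 99            # visible only in clone

print(alias is original, window is original)
print(np.shares_memory(window, original), np.shares_memory(clone, original))
print(original)
```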
IV. ndarray aggregation
1. Sum: np.sum
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
n.sum() # axis=None aggregates over every dimension, down to a scalar
1056
# axis=0 aggregates across the rows: the row axis disappears and one value per column remains.
n.sum(axis=0)
array([144, 276, 65, 232, 135, 204])
# axis=1 aggregates across the columns: the column axis disappears and one value per row remains.
n.sum(axis=1)
array([232, 424, 213, 187])
np.sum(n, axis=0)
array([144, 276, 65, 232, 135, 204])
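The rule "the aggregated axis disappears" generalizes beyond two dimensions. The sketch below checks the result shapes on an illustrative 3-D array; keepdims=True is an optional argument that keeps the collapsed axis with length 1.

```python
import numpy as np

stack = np.ones((2, 3, 4))

print(stack.sum(axis=0).shape)   # axis 0 collapses, leaving (3, 4)
print(stack.sum(axis=1).shape)   # axis 1 collapses, leaving (2, 4)
print(stack.sum(axis=2).shape)   # axis 2 collapses, leaving (2, 3)

# keepdims=True keeps the collapsed axis with length 1.
kept = stack.sum(axis=1, keepdims=True)
print(kept.shape)
```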
2. Maximum and minimum: np.max / np.min
n.max(axis=0)
array([68, 95, 25, 91, 80, 89])
n.min(axis=1)
array([ 8, 25, 9, 3])
3. Other aggregation functions
Function Name    NaN-safe Version    Description
np.sum           np.nansum           Compute sum of elements
np.prod          np.nanprod          Compute product of elements
np.mean          np.nanmean          Compute mean of elements
np.std           np.nanstd           Compute standard deviation
np.var           np.nanvar           Compute variance
np.min           np.nanmin           Find minimum value
np.max           np.nanmax           Find maximum value
np.argmin        np.nanargmin        Find index of minimum value
np.argmax        np.nanargmax        Find index of maximum value
np.median        np.nanmedian        Compute median of elements
np.percentile    np.nanpercentile    Compute rank-based statistics of elements
np.any           N/A                 Evaluate whether any elements are true
np.all           N/A                 Evaluate whether all elements are true
np.power computes element-wise powers; it is not an aggregation
n
array([[20, 21, 8, 91, 43, 49],
[68, 95, 25, 67, 80, 89],
[46, 71, 18, 30, 9, 39],
[10, 89, 14, 44, 3, 27]])
# Returns the index of the minimum value
n.argmin(axis=0)
array([3, 0, 0, 2, 3, 3], dtype=int64)
any([' ', [], (), {}])
True
all([' ', [], (), {}])
False
# Returns True if at least one element is truthy
np.any([1,2,3,4])
True
# Returns False if at least one element is falsy
np.all([0,2,3,4])
False
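np.nan is generally used to represent missing data in numpy and pandas.
The difference between np.sum and np.nansum: np.sum returns nan as soon as any element is nan, while np.nansum treats nan as zero
nan stands for "not a number"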
n3 = np.array([1,2,3,4, np.nan])
n3.sum()
# n3.nansum() # ndarray has no nansum method
np.nansum(n3)
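The two calls above print nothing in this export. The sketch below repeats them on the same five values and stores the results so the difference is visible.

```python
import numpy as np

values = np.array([1, 2, 3, 4, np.nan])

plain = values.sum()         # nan propagates through the ordinary sum
safe = np.nansum(values)     # nan entries are treated as zero
print(plain, safe)

# np.isnan locates the missing entries.
print(np.isnan(values))
```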
4. Working with files
Open the file president_heights.csv with pandas
and read the data from it
import pandas as pd
heights = pd.read_csv('./president_heights.csv')
# DataFrame
heights
data = heights['height(cm)'].values
# Mean height of all US presidents
data.mean()
# Maximum height
data.max()
data.min()
data.std()
V. ndarray matrix operations
1. Basic matrix operations
- Arithmetic operators:
- addition, subtraction, multiplication, division
n = np.random.randint(10, 60, size=(6, 6))
n2 = np.random.randint(10, 60, size=(6, 6))
n + 1 # adds 1 to every element
array([[21, 22, 9, 92, 44, 50],
[69, 96, 26, 68, 81, 90],
[47, 72, 19, 31, 10, 40],
[11, 90, 15, 45, 4, 28]])
n + n2 # operates on corresponding elements
array([[ 69, 117, 62, 101, 137, 118],
[136, 100, 123, 78, 97, 100],
[ 66, 112, 79, 120, 28, 45],
[ 45, 101, 95, 71, 16, 106]])
n1 = np.random.randint(0,10, size=(4,5))
n2 = np.random.randint(0,10, size=(5,4))
n1 + n2
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Users\ADMINI~1\AppData\Local\Temp\1/ipykernel_5300/3564642266.py in <module>
2 n2 = np.random.randint(0,10, size=(5,4))
3
----> 4 n1 + n2
ValueError: operands could not be broadcast together with shapes (4,5) (5,4)
- Matrix product: np.dot()
n1 = np.random.randint(10, 60, size=(6, 6))
n2 = np.random.randint(10, 60, size=(6, 6))
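n1 * n2 # ordinary multiplication: corresponding elements are multiplied
The matrix product (dot product) requires the number of columns of the first matrix to equal the number of rows of the second. The matrix product depends on the order of its operands; it does not obey the commutative law of multiplication.
For scalars, a * b = b * a
For matrices, the commutative law does not hold in general.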
n1 = np.random.randint(0,10, size=(4,5))
n2 = np.random.randint(0,10, size=(5, 6))
np.dot(n1, n2)
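Both points, the shape requirement and the loss of commutativity, can be checked on small matrices. The 2x2 matrices below are illustrative and chosen so the products are easy to verify by hand.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

ab = np.dot(a, b)   # multiplying by b on the right swaps the columns of a
ba = np.dot(b, a)   # multiplying by b on the left swaps the rows of a
print(ab)
print(ba)

# Shapes: (4, 5) dot (5, 6) gives (4, 6); the inner sizes (5 and 5) must agree.
product = np.dot(np.ones((4, 5)), np.ones((5, 6)))
print(product.shape)
```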
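2. Broadcasting
When the ndarrays in an operation have different shapes, numpy applies broadcasting, whose purpose is to make the shapes of the two ndarrays the same. If the shapes cannot be made compatible, the operation raises a ValueError.
[Important] The two rules of ndarray broadcasting
- Rule 1: pad missing dimensions with 1 (on the left of the shape)
- Rule 2: fill in the missing elements by repeating the existing values (only along dimensions of length 1)
Example 1: m = np.ones((2, 3)), a = np.arange(3); find m + a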
m = np.ones((2, 3))
a = np.arange(3)
display(m, a)
array([[1., 1., 1.],
[1., 1., 1.]])
array([0, 1, 2])
m + a
array([[1., 2., 3.],
[1., 2., 3.]])
Example 2:
a = np.arange(3).reshape((3, 1)), b = np.arange(3); find a + b
a = np.arange(3).reshape((3, 1))
a
array([[0],
[1],
[2]])
b = np.arange(3)
b
array([0, 1, 2])
a + b
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
Exercise: a = np.ones((4, 1)), b = np.arange(4); find a + b
a = np.ones((4, 1))
a
array([[1.],
[1.],
[1.],
[1.]])
b = np.arange(4)
b
array([0, 1, 2, 3])
a + b
array([[1., 2., 3., 4.],
[1., 2., 3., 4.],
[1., 2., 3., 4.],
[1., 2., 3., 4.]])
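The two broadcasting rules can be traced by hand for the exercise above. This sketch makes each step explicit with reshape and np.broadcast_to; the intermediate names are illustrative, and numpy does not actually materialize these expanded arrays internally.

```python
import numpy as np

a = np.ones((4, 1))
b = np.arange(4)                    # shape (4,)

# Rule 1: b has fewer dimensions, so its shape is padded on the left: (1, 4).
b_padded = b.reshape(1, 4)

# Rule 2: every length-1 axis is stretched by repeating the existing values.
a_full = np.broadcast_to(a, (4, 4))
b_full = np.broadcast_to(b_padded, (4, 4))

manual = a_full + b_full
print(np.array_equal(manual, a + b))
```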
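VI. Sorting ndarrays
1. Quick sort
Both np.sort() and ndarray.sort() work, but they differ:
- np.sort(): returns a sorted copy and leaves the original array unchanged
- ndarray.sort(): sorts in place, using no extra space for a copy, but modifies the original array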
n = np.array([6,2,1,4,3])
np.sort(n)
array([1, 2, 3, 4, 6])
n
array([6, 2, 1, 4, 3])
n.sort() # modifies the original data
n
array([1, 2, 3, 4, 6])
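Two common follow-ups are sorting in descending order and finding the order without moving the data. The sketch below reuses the same five values; np.argsort is an addition not covered in the text above.

```python
import numpy as np

scores = np.array([6, 2, 1, 4, 3])

ascending = np.sort(scores)          # sorted copy, scores untouched
descending = np.sort(scores)[::-1]   # reverse the sorted copy
order = np.argsort(scores)           # indices that would sort scores

print(ascending, descending, order)
print(scores[order])                 # indexing with order gives the sorted values
```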
2. Partial sorting
n = np.random.randint(0,100, size=100)
n
array([32, 88, 91, 91, 47, 8, 44, 23, 67, 67, 98, 55, 95, 63, 72, 24, 70,
8, 13, 10, 53, 25, 10, 56, 74, 89, 91, 48, 88, 48, 42, 46, 86, 80,
56, 17, 91, 90, 30, 7, 81, 92, 66, 52, 50, 7, 32, 90, 90, 81, 16,
69, 12, 18, 97, 47, 26, 2, 89, 84, 83, 70, 19, 53, 0, 78, 58, 0,
70, 82, 59, 7, 58, 75, 10, 94, 9, 87, 30, 97, 53, 50, 65, 37, 13,
57, 29, 33, 20, 73, 62, 98, 16, 90, 26, 74, 77, 74, 79, 81])
# The 5 largest values end up in the last 5 positions (in no particular order)
np.partition(n, kth=-5)
array([75, 32, 79, 74, 47, 8, 44, 23, 67, 67, 77, 55, 74, 63, 72, 24, 70,
8, 13, 10, 53, 25, 10, 56, 74, 26, 16, 48, 62, 48, 42, 46, 73, 80,
56, 17, 20, 33, 30, 7, 29, 57, 66, 52, 50, 7, 32, 13, 37, 65, 16,
69, 12, 18, 50, 47, 26, 2, 53, 30, 9, 70, 19, 53, 0, 78, 58, 0,
70, 10, 59, 7, 58, 81, 81, 81, 82, 83, 87, 84, 89, 89, 88, 90, 88,
86, 90, 91, 90, 91, 90, 91, 91, 92, 94, 95, 97, 98, 97, 98])
# The 5 smallest values end up in the first 5 positions (in no particular order)
np.partition(n, kth=5)
array([ 0, 0, 2, 7, 7, 7, 8, 12, 8, 13, 10, 16, 10, 10, 9, 13, 16,
30, 24, 19, 53, 25, 53, 56, 53, 20, 55, 48, 58, 48, 42, 46, 26, 47,
56, 17, 32, 33, 30, 23, 29, 57, 50, 52, 50, 44, 32, 47, 37, 18, 26,
58, 69, 65, 66, 80, 73, 62, 74, 70, 72, 70, 63, 74, 77, 78, 67, 67,
70, 74, 59, 79, 75, 81, 82, 94, 83, 87, 84, 97, 89, 97, 88, 90, 90,
92, 81, 90, 91, 86, 88, 98, 91, 90, 89, 95, 98, 91, 91, 81])
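np.partition returns the whole array, so the values of interest still have to be sliced out. A possible way to extract the top and bottom five is sketched below; the seed is arbitrary and only makes the run repeatable.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, chosen only for this illustration
data = np.random.randint(0, 100, size=100)

top5 = np.partition(data, kth=-5)[-5:]     # 5 largest, unordered
bottom5 = np.partition(data, kth=5)[:5]    # 5 smallest, unordered

# Sorting only the 5 selected values is cheaper than sorting all 100.
print(np.sort(top5), np.sort(bottom5))
```

The slices agree with what a full sort would give, which is a handy way to check the result on small inputs.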