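2018/7/7 NumPy Study Notes

NumPy Basics: Arrays and Vectorized Computation

Creating N-dimensional Arrays

The simplest way to create an array is to use the array function.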

import numpy as np
data1=[6,22,3.3,2]
arr1=np.array(data1)
data2=[[1,2,3,4],[5,6,7,8]]
arr2=np.array(data2)
In [8]: arr1
Out[8]: array([  6. ,  22. ,   3.3,   2. ])
In [9]: arr2
Out[9]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])          # a nested sequence is converted into a multidimensional array
Unless a type is specified explicitly, np.array infers a suitable data type for the new array. The type is stored in the array's dtype object.
In [11]: arr1.dtype
Out[11]: dtype('float64')
In [12]: arr2.dtype
Out[12]: dtype('int32')
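
The int32 result above comes from the machine the session ran on; the default integer type depends on the platform and the NumPy version. As a minimal sketch (the variable names are illustrative only), the dtype can be requested explicitly instead of being inferred:

```python
import numpy as np

# Let NumPy infer the dtype from the data.
inferred = np.array([1, 2, 3])

# Request a dtype explicitly instead of relying on inference.
as_float = np.array([1, 2, 3], dtype=np.float64)
as_int32 = np.array([1, 2, 3], dtype=np.int32)

print(inferred.dtype)  # an integer type; the width depends on platform and version
print(as_float.dtype)  # float64
print(as_int32.dtype)  # int32
```

Passing dtype explicitly is one way to keep results identical across machines.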
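You can also create arrays of all zeros or all ones with np.zeros or np.ones. np.empty creates an array without initializing its values, so its contents are arbitrary; the zeros shown below are not guaranteed.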
In [13]: np.zeros(5)
Out[13]: array([ 0.,  0.,  0.,  0.,  0.])
In [15]: np.ones((3,3))
Out[15]:
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])
In [16]: np.empty((2,3,2))
Out[16]:
array([[[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.]],

       [[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.]]])
Use astype to convert an array to another dtype explicitly:
In [17]: arr1.dtype
Out[17]: dtype('float64')

In [18]: int_arr1=arr1.astype(np.int64)

In [19]: int_arr1
Out[19]: array([ 6, 22,  3,  2], dtype=int64)  # the fractional part is truncated
astype can also convert an array of strings that represent numbers into a numeric array.
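
The string-to-number conversion mentioned above might look like the following sketch. The array contents are made up for illustration, and every string is assumed to be a valid number (astype raises ValueError otherwise):

```python
import numpy as np

# An array of strings that all represent numbers.
numeric_strings = np.array(['1.25', '-9.6', '42'])

# astype parses each string and returns a new float64 array.
as_floats = numeric_strings.astype(np.float64)

print(as_floats.dtype)  # float64
```

astype always returns a new array, so numeric_strings itself is left unchanged.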


Arithmetic Between Arrays and Scalars

Arrays let you run batch operations on data without writing loops. This is usually called vectorization.

In [20]: arr2
Out[20]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [21]: arr2 * arr2
Out[21]:
array([[ 1,  4,  9, 16],
       [25, 36, 49, 64]])

In [22]: arr2 +1
Out[22]:
array([[2, 3, 4, 5],
       [6, 7, 8, 9]])

In [23]: 1 / arr2
Out[23]:
array([[ 1.        ,  0.5       ,  0.33333333,  0.25      ],
       [ 0.2       ,  0.16666667,  0.14285714,  0.125     ]])

In [24]: arr2 ** 0.5
Out[24]:
array([[ 1.        ,  1.41421356,  1.73205081,  2.        ],
       [ 2.23606798,  2.44948974,  2.64575131,  2.82842712]])
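
To make the idea of vectorization concrete, here is a small sketch that computes the same element-wise square twice, once with explicit Python loops and once with a single array expression. The names values, looped, and vectorized are illustrative only:

```python
import numpy as np

values = np.arange(1, 9).reshape(2, 4)

# Loop version: square each element one at a time.
looped = np.empty_like(values)
for i in range(values.shape[0]):
    for j in range(values.shape[1]):
        looped[i, j] = values[i, j] * values[i, j]

# Vectorized version: one expression applied to every element.
vectorized = values * values

print(np.array_equal(looped, vectorized))  # True
```

Both versions give the same result; the vectorized form is shorter and is normally much faster on large arrays.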

Basic Indexing and Slicing

Indexing is straightforward and works much like indexing in Python itself.

Slicing

In [25]: arr=np.arange(10)

In [26]: arr
Out[26]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [27]: arr[5:8]=12

In [28]: arr
Out[28]: array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

An array slice is a view of the original array, so any change made through the slice modifies the source data directly.

In [29]: arr_slice=arr[5:8]

In [30]: arr_slice[1]=111

In [31]: arr
Out[31]: array([  0,   1,   2,   3,   4,  12, 111,  12,   8,   9])

In [32]: arr_slice[:]=4444

In [33]: arr
Out[33]: array([   0,    1,    2,    3,    4, 4444, 4444, 4444,    8,    9])

If you want a copy rather than a view, copy the slice explicitly, for example:

arr[5:8].copy()
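
One way to check whether two arrays refer to the same data is np.shares_memory. The sketch below is only an illustration of the view/copy distinction described above; the variable names are made up:

```python
import numpy as np

arr = np.arange(10)
view = arr[5:8]          # a slice is a view onto arr
copy = arr[5:8].copy()   # an independent copy of the same elements

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False

copy[:] = -1   # does not touch arr
view[:] = 12   # writes through to arr

print(arr)
```

After these assignments arr holds 12 at positions 5 to 7, while the copy holds -1 and has had no effect on arr.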

Indexing multidimensional arrays, starting with a two-dimensional array:

In [34]: arr2d=np.array([[1,2,3],[4,5,6],[7,8,9]])

In [35]: arr2d[2]
Out[35]: array([7, 8, 9])

In [36]: arr2d[0][2]
Out[36]: 3

In [37]: arr2d[0,2]
Out[37]: 3

Three-dimensional arrays

In [38]: arr3d=np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])

In [39]: arr3d[0]
Out[39]:
array([[1, 2, 3],
       [4, 5, 6]])

Both scalar values and arrays can be assigned to arr3d[0].

In [40]: old_values=arr3d[0].copy()

In [41]: arr3d[0]=45

In [42]: arr3d
Out[42]:
array([[[45, 45, 45],
        [45, 45, 45]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [43]: arr3d[0]=old_values

In [44]: arr3d
Out[44]:
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

Slice indexing: a higher-dimensional object can be sliced along one or more axes, and slices can be combined with integer indexes.

In [45]: arr2d
Out[45]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [46]: arr2d[:2]
Out[46]:
array([[1, 2, 3],
       [4, 5, 6]])

In [47]: arr2d[:2,1:]
Out[47]:
array([[2, 3],
       [5, 6]])

In [48]: arr2d[1,:2]
Out[48]: array([4, 5])

In [49]: arr2d[2,:1]
Out[49]: array([7])

In [50]: arr2d[:,:1]        # a bare colon selects the entire axis
Out[50]:
array([[1],
       [4],
       [7]])

In [51]: arr2d[:2,1:]=0     # assigning to a slice expression

In [52]: arr2d
Out[52]:
array([[1, 0, 0],
       [4, 0, 0],
       [7, 8, 9]])
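
The session above shows that mixing an integer index with a slice returns a lower-dimensional result (In [48] gives a 1-D array). The sketch below compares that with using a one-element slice in the same position, which keeps the axis. The names row_1d and row_2d are illustrative only:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index removes its axis: the result is 1-D.
row_1d = arr2d[1, :2]

# A one-element slice keeps the axis: the result is 2-D.
row_2d = arr2d[1:2, :2]

print(row_1d.shape)  # (2,)
print(row_2d.shape)  # (1, 2)
```

Both results contain the values 4 and 5; only the number of dimensions differs.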

Boolean Indexing

In [54]: data=np.random.randn(5,5)

In [55]: data
Out[55]:
array([[ 0.26224992,  0.97018499,  0.22580213, -1.21175716, -1.41655148],
       [-0.91801291,  0.9588066 , -1.4228044 , -0.93916245,  0.50487793],
       [ 1.26572253, -0.31677449, -0.04173863,  0.28175939,  0.36777067],
       [-0.85381682,  0.39739235,  0.23002012, -0.08400604, -0.61019238],
       [-0.06159692, -0.67428044,  0.2520452 , -0.52615204, -0.26562721]])

In [56]: data[data<0]=0

In [57]: data
Out[57]:
array([[ 0.26224992,  0.97018499,  0.22580213,  0.        ,  0.        ],
       [ 0.        ,  0.9588066 ,  0.        ,  0.        ,  0.50487793],
       [ 1.26572253,  0.        ,  0.        ,  0.28175939,  0.36777067],
       [ 0.        ,  0.39739235,  0.23002012,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.2520452 ,  0.        ,  0.        ]])
In [58]: names = np.array(['Bob','Joe','Will','Bob','Will'])  # one name per row of data

In [59]: names == 'Bob'
Out[59]: array([ True, False, False,  True, False], dtype=bool)
In [61]: data[names == 'Bob']
Out[61]:
array([[ 0.26224992,  0.97018499,  0.22580213,  0.        ,  0.        ],
       [ 0.        ,  0.39739235,  0.23002012,  0.        ,  0.        ]])
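
Boolean masks can be negated with ~ and combined with & and |, and they can be mixed with slices. The following sketch assumes a mask with exactly one entry per row, which current NumPy versions require; it uses a small integer array in place of the random data so the results are easy to read:

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will'])
data = np.arange(20).reshape(5, 4)   # row i holds 4*i .. 4*i+3

# Rows whose name is not 'Bob' (rows 1, 2, 4).
not_bob = data[~(names == 'Bob')]

# Rows whose name is 'Bob' or 'Will' (rows 0, 2, 3, 4).
bob_or_will = data[(names == 'Bob') | (names == 'Will')]

# Rows for 'Bob', restricted to the last two columns.
bob_cols = data[names == 'Bob', 2:]

print(bob_cols)
```

The Python keywords and and or do not work element-wise on arrays, so the operators & and | are used, with each comparison in parentheses.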

Fancy Indexing

A list of integers selects rows in the given order; negative values count from the end.

In [63]: arr = np.empty((8,4))

In [64]: for i in range(8):
    ...:     arr[i]=i
    ...:

In [65]: arr
Out[65]:
array([[ 0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.],
       [ 5.,  5.,  5.,  5.],
       [ 6.,  6.,  6.,  6.],
       [ 7.,  7.,  7.,  7.]])

In [66]: arr[[4,3,0,4]]
Out[66]:
array([[ 4.,  4.,  4.,  4.],
       [ 3.,  3.,  3.,  3.],
       [ 0.,  0.,  0.,  0.],
       [ 4.,  4.,  4.,  4.]])

In [67]: arr[[-3,-4,-1,-7]]
Out[67]:
array([[ 5.,  5.,  5.,  5.],
       [ 4.,  4.,  4.,  4.],
       [ 7.,  7.,  7.,  7.],
       [ 1.,  1.,  1.,  1.]])
In [68]: arr = np.arange(32).reshape((8,4))

In [69]: arr
Out[69]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [70]: arr[[1,4,5,5],[2,2,1,0]]  # selects the elements at (1,2), (4,2), (5,1), (5,0)
Out[70]: array([ 6, 18, 21, 20])
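
Passing two index lists pairs them up element by element, as In [70] shows. If the goal is instead the rectangular block formed by the chosen rows and columns, one option is np.ix_. This is a sketch with arbitrarily chosen indexes:

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Paired indexing: one element per (row, col) pair.
pairs = arr[rows, cols]

# np.ix_ builds an open mesh, selecting every row/column combination.
block = arr[np.ix_(rows, cols)]

print(pairs)        # [ 4 23 29 10]
print(block.shape)  # (4, 4)
```

Unlike slicing, fancy indexing copies the selected data into a new array.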

Transposing Arrays and Swapping Axes

In [71]: arr
Out[71]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [72]: arr.T     # simple transpose of the array
Out[72]:
array([[ 0,  4,  8, 12, 16, 20, 24, 28],
       [ 1,  5,  9, 13, 17, 21, 25, 29],
       [ 2,  6, 10, 14, 18, 22, 26, 30],
       [ 3,  7, 11, 15, 19, 23, 27, 31]])

In [73]: np.dot(arr.T,arr)
Out[73]:
array([[2240, 2352, 2464, 2576],
       [2352, 2472, 2592, 2712],
       [2464, 2592, 2720, 2848],
       [2576, 2712, 2848, 2984]])

In [74]: np.dot(arr,arr.T)
Out[74]:
array([[  14,   38,   62,   86,  110,  134,  158,  182],
       [  38,  126,  214,  302,  390,  478,  566,  654],
       [  62,  214,  366,  518,  670,  822,  974, 1126],
       [  86,  302,  518,  734,  950, 1166, 1382, 1598],
       [ 110,  390,  670,  950, 1230, 1510, 1790, 2070],
       [ 134,  478,  822, 1166, 1510, 1854, 2198, 2542],
       [ 158,  566,  974, 1382, 1790, 2198, 2606, 3014],
       [ 182,  654, 1126, 1598, 2070, 2542, 3014, 3486]])

For higher-dimensional arrays, transpose takes a tuple of axis numbers that specifies how the axes are permuted.

In [76]: arr = np.arange(12).reshape((2,2,3))

In [77]: arr
Out[77]:
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])

In [78]: arr.transpose((1,0,2))
Out[78]:
array([[[ 0,  1,  2],
        [ 6,  7,  8]],

       [[ 3,  4,  5],
        [ 9, 10, 11]]])
In [79]: arr.swapaxes(1,2)
Out[79]:
array([[[ 0,  3],
        [ 1,  4],
        [ 2,  5]],

       [[ 6,  9],
        [ 7, 10],
        [ 8, 11]]])
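
A quick way to read these permutations is to compare individual elements before and after. The sketch below checks the index relationship for the same array and also shows that both operations return views rather than copies; the names t and s are illustrative:

```python
import numpy as np

arr = np.arange(12).reshape((2, 2, 3))

# transpose((1, 0, 2)) swaps the first two axes: t[i, j, k] == arr[j, i, k]
t = arr.transpose((1, 0, 2))

# swapaxes(1, 2) swaps the last two axes: s[i, j, k] == arr[i, k, j]
s = arr.swapaxes(1, 2)

print(t.shape)  # (2, 2, 3)
print(s.shape)  # (2, 3, 2)
print(np.shares_memory(arr, t), np.shares_memory(arr, s))  # True True
```

Because both results are views, writing into t or s also changes arr.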

Universal Functions: Fast Element-wise Array Functions

In [80]: arr = np.arange(10)

In [81]: np.sqrt(arr)
Out[81]:
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])

In [82]: np.exp(arr)
Out[82]:
array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,
         8.10308393e+03])

In [83]: x = np.random.randn(8)

In [84]: y = np.random.randn(8)

In [85]: x
Out[85]:
array([-0.64050216,  0.4058439 ,  0.53655964, -0.76862822, -0.16882124,
        0.52559669,  0.38989637, -0.43821311])

In [86]: y
Out[86]:
array([-0.18182022,  1.74568738,  0.70178628, -1.01851544,  0.73568589,
       -0.2059226 , -0.16270816,  1.057713  ])

In [87]: np.maximum(x,y)  # element-wise maximum of the two arrays
Out[87]:
array([-0.18182022,  1.74568738,  0.70178628, -0.76862822,  0.73568589,
        0.52559669,  0.38989637,  1.057713  ])
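
np.sqrt and np.exp take one array (unary), while np.maximum takes two (binary). A few universal functions return more than one array; np.modf is one such case. The sketch below uses small fixed inputs so the results can be checked by hand:

```python
import numpy as np

a = np.array([1, 5])
b = np.array([4, 2])

# Binary ufunc: compares the two arrays position by position.
larger = np.maximum(a, b)

# np.modf returns the fractional and integral parts as two arrays.
values = np.array([3.5, -1.25, 7.0])
frac, whole = np.modf(values)

print(larger)  # [4 5]
print(frac)    # [ 0.5  -0.25  0.  ]
print(whole)   # [ 3. -1.  7.]
```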

Data Processing Using Arrays

In [93]: points = np.arange(-5,5,0.01)  # points from -5 up to (not including) 5, spaced 0.01 apart

In [94]: xs,ys = np.meshgrid(points,points)  # meshgrid expands two 1-D arrays into two 2-D coordinate arrays

In [95]: xs
Out[95]:
array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       ...,
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])
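
The large grid above is hard to inspect, so here is the same call on two tiny inputs (chosen only for illustration). With the default settings, the first output repeats the x values along each row and the second repeats the y values down each column:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

xs, ys = np.meshgrid(x, y)

print(xs)
# [[0 1 2]
#  [0 1 2]]
print(ys)
# [[10 10 10]
#  [20 20 20]]
```

Each position (i, j) then pairs xs[i, j] with ys[i, j], which is what lets an expression over xs and ys evaluate a function on every grid point at once.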

In [96]: import matplotlib.pyplot as plt

In [97]: z = np.sqrt(xs ** 2 + ys **2)

In [98]: z
Out[98]:
array([[ 7.07106781,  7.06400028,  7.05693985, ...,  7.04988652,
         7.05693985,  7.06400028],
       [ 7.06400028,  7.05692568,  7.04985815, ...,  7.04279774,
         7.04985815,  7.05692568],
       [ 7.05693985,  7.04985815,  7.04278354, ...,  7.03571603,
         7.04278354,  7.04985815],
       ...,
       [ 7.04988652,  7.04279774,  7.03571603, ...,  7.0286414 ,
         7.03571603,  7.04279774],
       [ 7.05693985,  7.04985815,  7.04278354, ...,  7.03571603,
         7.04278354,  7.04985815],
       [ 7.06400028,  7.05692568,  7.04985815, ...,  7.04279774,
         7.04985815,  7.05692568]])

In [99]: plt.imshow(z,cmap=plt.cm.gray); plt.colorbar()

Out[99]: <matplotlib.colorbar.Colorbar at 0x28d7dab65c0>
In [101]: plt.title(r'Image plot of $\sqrt{x^2 + y^2}$ for a grid of values')  # raw string keeps \s literal
Out[101]: <matplotlib.text.Text at 0x28d7d26eeb8>

In [102]: plt.show()

Expressing Conditional Logic as Array Operations

In [103]: xarr = np.array([1.1,1.2,1.3,1.4,1.5])

In [104]: yarr = np.array([2.1,2.2,2.3,2.4,2.5])

In [105]: cond = np.array([True,False,True,True,False])

np.where takes the value from xarr where cond is True and from yarr where it is False.

In [106]: result = np.where(cond,xarr,yarr)

In [107]: result
Out[107]: array([ 1.1,  2.2,  1.3,  1.4,  2.5])

The second and third arguments to np.where do not have to be arrays; either can be a scalar.

In [108]: arr = np.random.randn(4,4)

In [109]: arr
Out[109]:
array([[-0.43521077,  1.41782551, -0.97362101,  1.08447685],
       [ 2.68892549, -1.30362208, -1.08288557,  0.35985212],
       [ 1.10480412, -0.80542523,  0.48892358, -1.07925725],
       [-1.34552789,  1.132726  , -2.3198594 , -0.51442034]])

In [110]: np.where(arr>0,2,-2)
Out[110]:
array([[-2,  2, -2,  2],
       [ 2, -2, -2,  2],
       [ 2, -2,  2, -2],
       [-2,  2, -2, -2]])
In [111]: np.where(arr>0,2,arr)
Out[111]:
array([[-0.43521077,  2.        , -0.97362101,  2.        ],
       [ 2.        , -1.30362208, -1.08288557,  2.        ],
       [ 2.        , -0.80542523,  2.        , -1.07925725],
       [-1.34552789,  2.        , -2.3198594 , -0.51442034]])

where can also express more complex logic by nesting calls.
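
As a sketch of such nesting, suppose there are two boolean arrays and each of the four True/False combinations should map to a different value. The arrays cond1 and cond2 and the values 0 to 3 are made up for illustration:

```python
import numpy as np

cond1 = np.array([True, True, False, False])
cond2 = np.array([True, False, True, False])

# Both True -> 0, only cond1 -> 1, only cond2 -> 2, neither -> 3.
result = np.where(cond1 & cond2, 0,
                  np.where(cond1, 1,
                           np.where(cond2, 2, 3)))

print(result)  # [0 1 2 3]
```

The calls are evaluated from the inside out, but the conditions read from the outside in, like an if / elif / else chain.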


Methods for Boolean Arrays

In [112]: arr = np.random.randn(100)

In [113]: (arr > 0).sum()   # number of elements greater than 0
Out[113]: 54

In [114]: bools = np.array([False,False,True,False])

In [115]: bools.any()  # True if at least one value is True
Out[115]: True

In [116]: bools.all()  # True only if every value is True
Out[116]: False
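
These methods also accept an axis argument, and because True counts as 1, mean gives the fraction of True values. A small sketch with a hand-written boolean array (the name flags is illustrative):

```python
import numpy as np

flags = np.array([[True, False, True],
                  [False, False, True]])

print(flags.sum())        # 3 True values in total
print(flags.sum(axis=0))  # [1 0 2] True values per column
print(flags.mean())       # 0.5, the fraction of True values
print(flags.any(axis=1))  # [ True  True] each row has at least one True
print(flags.all(axis=0))  # [False False  True] only the last column is all True
```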

Sorting

In [118]: arr = np.random.randn(8)

In [119]: arr
Out[119]:
array([-0.50050381, -0.47721016, -0.30869937, -1.43030168,  0.00459887,
       -1.65491773,  1.22161368, -0.77993317])

In [120]: arr.sort()  # sorts in place, in ascending order

In [121]: arr
Out[121]:
array([-1.65491773, -1.43030168, -0.77993317, -0.50050381, -0.47721016,
       -0.30869937,  0.00459887,  1.22161368])

In [122]: arr = np.random.randn(5,3)

In [123]: arr
Out[123]:
array([[ 2.57065162, -0.96012742, -1.11512802],
       [-0.89160886,  0.06382505, -0.8871275 ],
       [-0.71819144, -1.57579496, -0.27975377],
       [ 0.09348711, -0.01115059,  0.18504493],
       [-0.75319897,  0.46313174, -2.02176903]])
In [125]: arr.sort(1)  # sort along the given axis (here, within each row)

In [126]: arr
Out[126]:
array([[-1.11512802, -0.96012742,  2.57065162],
       [-0.89160886, -0.8871275 ,  0.06382505],
       [-1.57579496, -0.71819144, -0.27975377],
       [-0.01115059,  0.09348711,  0.18504493],
       [-2.02176903, -0.75319897,  0.46313174]])
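
The sort method used above modifies the array in place. The top-level function np.sort returns a sorted copy instead, and reversing that copy is one simple way to get descending order. A minimal sketch with fixed values:

```python
import numpy as np

arr = np.array([3, 1, 2])

# np.sort returns a new sorted array and leaves arr untouched.
sorted_copy = np.sort(arr)
unchanged = arr.tolist()

# Reversing the sorted copy gives descending order.
descending = np.sort(arr)[::-1]

# The method form sorts arr itself and returns None.
arr.sort()

print(sorted_copy, descending, arr)
```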

Unique and Other Set Logic

In [127]: ints = np.array([3,3,3,3,2,2,1,1,4])

In [128]: np.unique(ints)  # returns the unique elements, sorted
Out[128]: array([1, 2, 3, 4])
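
The heading mentions other set logic, for which the session gives no example. The sketch below shows a few of NumPy's one-dimensional set functions on two made-up arrays:

```python
import numpy as np

a = np.array([3, 3, 2, 1, 4])
b = np.array([2, 4, 6])

common = np.intersect1d(a, b)   # sorted values present in both
combined = np.union1d(a, b)     # sorted values present in either
only_a = np.setdiff1d(a, b)     # sorted values in a but not in b
membership = np.isin(a, b)      # for each element of a: is it in b?

print(common)      # [2 4]
print(combined)    # [1 2 3 4 6]
print(only_a)      # [1 3]
print(membership)  # [False False  True False  True]
```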


