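An earlier post covered the basics of NumPy. This one looks at some more advanced usage, along with a few everyday operations.
Let's start with the internal layout of ndarray, the most important object in NumPy: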
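1) a pointer to the data, that is, the block of memory that holds the array's elements
2) the data type (dtype)
3) a tuple describing the array's shape (shape)
4) a tuple of strides (strides): for each dimension, the number of bytes to step over to get from one element to the next element along that dimension

In [5]: np.ones((3,4,5),dtype=np.float64).strides
Out[5]: (160, 40, 8)
# float64 takes 8 bytes. First dimension: 8 * 4 * 5 = 160; second dimension: 8 * 5 = 40; third dimension: 8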
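To make the stride arithmetic concrete, here is a minimal sketch that rebuilds the strides of a C-ordered array from its shape and item size. The variable names are only for illustration, and the formula assumes a freshly created, contiguous, row-major array.

```python
import numpy as np

a = np.ones((3, 4, 5), dtype=np.float64)

# For a C-ordered array, the stride of axis i is the item size times the
# number of elements in all the axes that come after i.
expected = tuple(
    a.itemsize * int(np.prod(a.shape[i + 1:])) for i in range(a.ndim)
)
print(a.strides, expected)   # (160, 40, 8) (160, 40, 8)

# Transposing only reverses the strides; no data is moved.
print(a.T.strides)           # (8, 40, 160)

# A 4-byte dtype halves every stride.
small = np.ones((3, 4, 5), dtype=np.float32)
print(small.strides)         # (80, 20, 4)
```

Views such as transposes and slices reuse the same memory with different strides, which is why they are cheap to create.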
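The NumPy dtype hierarchy:
dtypes also form an inheritance hierarchy. np.issubdtype tells you whether a dtype belongs to a given parent class, and calling mro() on a scalar type such as np.float64 lists all of its parent classes: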
In [6]: ints = np.ones(8,dtype=np.uint16)

In [7]: floats = np.ones(8,dtype=np.float32)

In [8]: np.issubdtype(ints.dtype,np.integer)
Out[8]: True

In [9]: np.issubdtype(floats.dtype,np.floating)
Out[9]: True

In [10]: np.float64.mro()
Out[10]:
[numpy.float64,
 numpy.floating,
 numpy.inexact,
 numpy.number,
 numpy.generic,
 float,
 object]
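A typical use of this hierarchy is to branch on the general kind of data without listing every concrete dtype. The sketch below is one way to do that; describe is a hypothetical helper written for this example, not part of NumPy.

```python
import numpy as np


def describe(arr):
    """Hypothetical helper: classify an array by its abstract dtype family."""
    if np.issubdtype(arr.dtype, np.integer):
        return "integer"
    if np.issubdtype(arr.dtype, np.floating):
        return "floating"
    return "other"


kinds = [
    describe(np.ones(8, dtype=np.uint16)),   # unsigned ints are integers too
    describe(np.ones(8, dtype=np.float32)),
    describe(np.array(["a", "b"])),          # strings fall through to "other"
]
print(kinds)  # ['integer', 'floating', 'other']

# Both families sit under np.number further up the hierarchy.
print(np.issubdtype(np.uint16, np.number), np.issubdtype(np.float32, np.number))
```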
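Reshaping arrays
reshape was covered earlier, so here we look at the opposite operation: flattening, or raveling:
ravel and flatten do the same job (collapse a multidimensional array down to one dimension). The difference is that ndarray.flatten() always returns a copy of the data, so changes to the copy do not affect the original array, whereas ndarray.ravel() returns a view whenever it can, so changes to the result can affect the original array.
When reshaping or flattening, you can choose the order in which elements are laid out, row-major (C) or column-major (Fortran), by passing order='C' or order='F':
arr = np.arange(12).reshape((3, 4), order=?)   # ? is 'C' or 'F'
arr.ravel('F')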
In [19]: arr = np.arange(12).reshape((3, 4))

In [22]: arr_ravel = arr.ravel()

In [23]: arr_ravel
Out[23]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [24]: arr_ravel[0] = 100

In [25]: arr
Out[25]:
array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

In [26]: arr_flatten = arr.flatten()

In [27]: arr_flatten[1] = 110

In [28]: arr
Out[28]:
array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

In [29]: arr_flatten
Out[29]: array([100, 110,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11])
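The session above does not show what the order argument does, or when ravel has to fall back to a copy. The following sketch covers both, using np.shares_memory to check whether a result still points at the original buffer. The variable names are just for illustration.

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

c_order = arr.ravel()      # row-major: walk along each row in turn
f_order = arr.ravel("F")   # column-major: walk down each column in turn
print(c_order)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]
print(f_order)  # [ 0  4  8  1  5  9  2  6 10  3  7 11]

# ravel on a contiguous array gives a view; flatten always copies.
ravel_shares = np.shares_memory(arr, arr.ravel())       # True
flatten_shares = np.shares_memory(arr, arr.flatten())   # False

# The transpose is not C-contiguous, so ravel has to copy here.
transposed_shares = np.shares_memory(arr, arr.T.ravel())  # False
print(ravel_shares, flatten_shares, transposed_shares)
```

So code that relies on ravel writing through to the original should check the memory layout first, or assign through reshape or indexing instead.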
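Concatenating and splitting arrays:
numpy.concatenate joins arrays along an existing axis: with axis=0 the two arrays are joined vertically (rows are appended), and with axis=1 they are joined horizontally (columns are appended).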
In [31]: arr1 = np.ones(12).reshape((3,4))

In [32]: arr1
Out[32]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [33]: arr2 = np.random.randn(12).reshape(arr1.shape)

In [34]: arr2
Out[34]:
array([[-0.93824641,  0.77716465, -0.04041834,  0.63624918],
       [-0.59330384, -0.81196015, -0.03286048,  0.42626197],
       [-1.19563235,  0.3565425 ,  1.78502194,  0.96864218]])

In [35]: np.concatenate([arr1,arr2],axis=0) # join vertically; same as np.vstack([arr1,arr2]) (row_stack is an alias)
Out[35]:
array([[ 1.        ,  1.        ,  1.        ,  1.        ],
       [ 1.        ,  1.        ,  1.        ,  1.        ],
       [ 1.        ,  1.        ,  1.        ,  1.        ],
       [-0.93824641,  0.77716465, -0.04041834,  0.63624918],
       [-0.59330384, -0.81196015, -0.03286048,  0.42626197],
       [-1.19563235,  0.3565425 ,  1.78502194,  0.96864218]])

In [36]: np.concatenate([arr1,arr2],axis=1) # join horizontally; same as np.hstack([arr1,arr2])
Out[36]:
array([[ 1.        ,  1.        ,  1.        ,  1.        , -0.93824641,
         0.77716465, -0.04041834,  0.63624918],
       [ 1.        ,  1.        ,  1.        ,  1.        , -0.59330384,
        -0.81196015, -0.03286048,  0.42626197],
       [ 1.        ,  1.        ,  1.        ,  1.        , -1.19563235,
         0.3565425 ,  1.78502194,  0.96864218]])
column_stack is similar to hstack, except that it first converts 1-D arrays into 2-D column vectors.
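The difference only shows up with 1-D inputs, so here is a small sketch comparing the three stacking helpers on two 1-D arrays. The array names a and b are made up for the example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

h = np.hstack([a, b])        # 1-D inputs are simply joined end to end
c = np.column_stack([a, b])  # each 1-D input becomes one column
v = np.vstack([a, b])        # each 1-D input becomes one row

print(h)        # [1 2 3 4 5 6]
print(c)        # [[1 4] [2 5] [3 6]] as a 3 x 2 array
print(v.shape)  # (2, 3)
```

For inputs that are already 2-D, column_stack and hstack give the same result.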
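Splitting arrays:
split divides an array into several arrays along a given axis. There are convenience functions as well: vsplit (along axis 0), hsplit (along axis 1), and dsplit (along axis 2).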
In [44]: arr = np.arange(40).reshape((10,4))

In [45]: arr
Out[45]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31],
       [32, 33, 34, 35],
       [36, 37, 38, 39]])

In [47]: np.split(arr,5) # the array must divide evenly, otherwise an error is raised
Out[47]:
[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11],
        [12, 13, 14, 15]]),
 array([[16, 17, 18, 19],
        [20, 21, 22, 23]]),
 array([[24, 25, 26, 27],
        [28, 29, 30, 31]]),
 array([[32, 33, 34, 35],
        [36, 37, 38, 39]])]

In [48]: np.split(arr,[2,5,8]) # split points: each piece includes its start index and excludes its end index
Out[48]:
[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]]),
 array([[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]),
 array([[32, 33, 34, 35],
        [36, 37, 38, 39]])]

In [50]: np.vsplit(arr,2)
Out[50]:
[array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]]),
 array([[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35],
        [36, 37, 38, 39]])]

In [51]: np.hsplit(arr,2)
Out[51]:
[array([[ 0,  1],
        [ 4,  5],
        [ 8,  9],
        [12, 13],
        [16, 17],
        [20, 21],
        [24, 25],
        [28, 29],
        [32, 33],
        [36, 37]]),
 array([[ 2,  3],
        [ 6,  7],
        [10, 11],
        [14, 15],
        [18, 19],
        [22, 23],
        [26, 27],
        [30, 31],
        [34, 35],
        [38, 39]])]
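The session above only uses a 2-D array, so the third axis never comes into play. As a rough sketch, the example below runs vsplit, hsplit, and dsplit on a small 3-D array and checks which axis each one cuts along. The array shape is chosen just for illustration.

```python
import numpy as np

arr = np.arange(24).reshape((2, 3, 4))

v = np.vsplit(arr, 2)  # cuts axis 0: two pieces of shape (1, 3, 4)
h = np.hsplit(arr, 3)  # cuts axis 1: three pieces of shape (2, 1, 4)
d = np.dsplit(arr, 2)  # cuts axis 2: two pieces of shape (2, 3, 2)

print([p.shape for p in v])
print([p.shape for p in h])
print([p.shape for p in d])

# Each helper matches np.split with an explicit axis argument.
same_h = all(
    np.array_equal(x, y) for x, y in zip(h, np.split(arr, 3, axis=1))
)
same_d = all(
    np.array_equal(x, y) for x, y in zip(d, np.split(arr, 2, axis=2))
)
print(same_h, same_d)
```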
r_ and c_ in NumPy: these two objects offer a more compact way to stack arrays, r_ along rows and c_ along columns.
In [53]: arr = np.arange(12).reshape((3,4))

In [54]: arr1 = np.random.randn(3,4)

In [55]: np.r_[arr,arr1]
Out[55]:
array([[ 0.        ,  1.        ,  2.        ,  3.        ],
       [ 4.        ,  5.        ,  6.        ,  7.        ],
       [ 8.        ,  9.        , 10.        , 11.        ],
       [-0.13844049, -1.37260966,  0.75079996,  0.9258705 ],
       [-0.31965182, -1.72714649, -0.23686849,  1.21305948],
       [ 0.70673488, -0.52744984, -0.32275471,  0.28726788]])

In [56]: np.c_[arr,arr1]
Out[56]:
array([[ 0.        ,  1.        ,  2.        ,  3.        , -0.13844049,
        -1.37260966,  0.75079996,  0.9258705 ],
       [ 4.        ,  5.        ,  6.        ,  7.        , -0.31965182,
        -1.72714649, -0.23686849,  1.21305948],
       [ 8.        ,  9.        , 10.        , 11.        ,  0.70673488,
        -0.52744984, -0.32275471,  0.28726788]])
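Beyond stacking 2-D arrays, r_ and c_ also accept slices, scalars, and 1-D arrays. The short sketch below shows a few such uses and checks the 2-D case against concatenate. The values are arbitrary.

```python
import numpy as np

# r_ turns slice notation into a range and joins everything end to end.
row = np.r_[1:4, 0, np.array([7, 8])]
print(row)  # [1 2 3 0 7 8]

# c_ treats 1-D inputs as columns.
cols = np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]
print(cols)  # 3 x 2 array: [[1 4] [2 5] [3 6]]

# For 2-D inputs, r_ and c_ agree with concatenate on axis 0 and axis 1.
a = np.arange(6).reshape((2, 3))
b = a + 10
r_matches = np.array_equal(np.r_[a, b], np.concatenate([a, b], axis=0))
c_matches = np.array_equal(np.c_[a, b], np.concatenate([a, b], axis=1))
print(r_matches, c_matches)
```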
Repeating elements: tile and repeat
repeat: repeats each element individually
In [63]: arr1 = np.arange(4)

In [64]: arr1
Out[64]: array([0, 1, 2, 3])

In [65]: arr1.repeat(2)
Out[65]: array([0, 0, 1, 1, 2, 2, 3, 3])

In [66]: arr1.repeat([1,2,3,4])
Out[66]: array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3])

In [67]: arr = np.random.randn(2,2)

In [68]: arr.repeat(2,axis=0)
Out[68]:
array([[ 2.12601262,  1.02939747],
       [ 2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063],
       [ 1.01998255, -0.95367063]])

In [69]: arr.repeat(2,axis=1)
Out[69]:
array([[ 2.12601262,  2.12601262,  1.02939747,  1.02939747],
       [ 1.01998255,  1.01998255, -0.95367063, -0.95367063]])

In [70]: arr.repeat([2,3],axis=1)
Out[70]:
array([[ 2.12601262,  2.12601262,  1.02939747,  1.02939747,  1.02939747],
       [ 1.01998255,  1.01998255, -0.95367063, -0.95367063, -0.95367063]])
tile: repeats the array as a whole
In [71]: np.tile(arr,2)
Out[71]:
array([[ 2.12601262,  1.02939747,  2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063,  1.01998255, -0.95367063]])

In [72]: arr
Out[72]:
array([[ 2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063]])

In [74]: np.tile(arr,(3,2))
Out[74]:
array([[ 2.12601262,  1.02939747,  2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063,  1.01998255, -0.95367063],
       [ 2.12601262,  1.02939747,  2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063,  1.01998255, -0.95367063],
       [ 2.12601262,  1.02939747,  2.12601262,  1.02939747],
       [ 1.01998255, -0.95367063,  1.01998255, -0.95367063]])
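With random floats it can be hard to see at a glance how the two functions differ, so here is the same comparison on a tiny integer array. It is only a side-by-side sketch, and the array is made up for the purpose.

```python
import numpy as np

arr = np.array([[1, 2],
                [3, 4]])

# repeat duplicates individual elements (or rows/columns when axis is given).
rep_rows = arr.repeat(2, axis=0)   # rows become 1 2 / 1 2 / 3 4 / 3 4
rep_flat = arr.repeat(2)           # no axis: the array is flattened first

# tile lays down whole copies of the array side by side or on top of each other.
tile_rows = np.tile(arr, (2, 1))   # rows become 1 2 / 3 4 / 1 2 / 3 4
tile_cols = np.tile(arr, 2)        # rows become 1 2 1 2 / 3 4 3 4

print(rep_rows)
print(rep_flat)
print(tile_rows)
print(tile_cols)
```

Note the row order: repeat keeps copies of the same row adjacent, while tile cycles through the whole array again.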
take and put in NumPy do the same job as fancy indexing:
numpy.take(a, indices, axis=None, out=None, mode='raise')
Returns elements from an array along the given axis at the given indices.
Parameters:
a : array_like, the input array
indices : array_like, the indices of the values to fetch
axis : [int, optional] the axis along which to fetch elements; by default (axis=None) the flattened input is used
out : [ndarray, optional] an array in which to place the result
mode : [{'raise', 'wrap', 'clip'}, optional] how out-of-bounds indices behave
raise : [default] raise an error
wrap : wrap around
clip : clip to the valid range
In [80]: arr = np.arange(10).reshape(2, 5)

In [81]: arr
Out[81]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [82]: np.take(arr,[0,4])
Out[82]: array([0, 4])

In [83]: np.take(arr,[0,4],axis=1)
Out[83]:
array([[0, 4],
       [5, 9]])
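The mode parameter is easiest to understand by feeding take an index that is out of range. The sketch below does that for all three modes and also checks the equivalence with fancy indexing. The index 7 is chosen arbitrarily as something past the end of a length-5 axis.

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)

# take along an axis matches fancy indexing on that axis.
cols = np.take(arr, [0, 4], axis=1)
same_as_fancy = np.array_equal(cols, arr[:, [0, 4]])

# Without axis, the indices address the flattened array.
flat = np.take(arr, [0, 7])
same_as_flat = np.array_equal(flat, arr.ravel()[[0, 7]])

# Index 7 is out of bounds for axis 1, which has length 5.
wrapped = np.take(arr, [7], axis=1, mode="wrap")  # 7 % 5 == 2, so column 2
clipped = np.take(arr, [7], axis=1, mode="clip")  # clipped to 4, the last column

try:
    np.take(arr, [7], axis=1)  # default mode="raise"
    raised = False
except IndexError:
    raised = True

print(wrapped.ravel(), clipped.ravel(), raised)  # [2 7] [4 9] True
```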
numpy.put(a, ind, v, mode='raise')
Replaces the specified elements of an array with the given values. The indices refer to positions in the flattened array, and the array is modified in place.
Parameters:
a : ndarray, the target array
ind : array_like, the indices of the elements to replace
v : array_like, the values to place in the target array
mode : [{'raise', 'wrap', 'clip'}, optional] how out-of-bounds indices behave
raise : [default] raise an error
wrap : wrap around
clip : clip to the valid range
In [85]: arr = np.arange(5)

In [86]: arr
Out[86]: array([0, 1, 2, 3, 4])

In [88]: np.put(arr, [0, 2], 888)

In [89]: arr
Out[89]: array([888,   1, 888,   3,   4])

In [90]: np.put(arr, [1, 3], [13,31])

In [91]: arr
Out[91]: array([888,  13, 888,  31,   4])
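Since put always works on flattened positions, it helps to see it on a 2-D array and next to the equivalent assignment through arr.flat. This is a small sketch with made-up values, including one out-of-range index handled by mode='clip'.

```python
import numpy as np

# put on a 2-D array: indices count through the array in row-major order.
grid = np.zeros((2, 3), dtype=int)
np.put(grid, [1, 4], [10, 20])   # flat position 4 is row 1, column 1
print(grid)                      # rows: 0 10 0 / 0 20 0

# The same assignment written with fancy indexing on the flat iterator.
grid2 = np.zeros((2, 3), dtype=int)
grid2.flat[[1, 4]] = [10, 20]
same = np.array_equal(grid, grid2)

# An out-of-range index with mode="clip" lands on the last element.
arr = np.arange(5)
np.put(arr, 22, -5, mode="clip")
print(arr, same)                 # [ 0  1  2  3 -5] True
```

Unlike take, put returns nothing and changes its first argument in place, so pass a copy if the original must be preserved.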