[NumPy Study 11] Sorting, Searching, and Counting

Latest recommended article published on 2021-05-11 10:15:55

李博清

Category: NumPy Study

Article link: https://blog.csdn.net/weixin_44454670/article/details/112913282

Contents

1. Sorting
2. Searching
3. Counting
- [Example 3-1] numpy.count_nonzero(a, axis=None)

Note: the main body of the article follows; the examples below are provided for reference.

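1. Sorting

[Example 1-1] numpy.sort(a[, axis=-1, kind='quicksort', order=None])

Return a sorted copy of an array.
a. axis: the axis along which to sort. 0 sorts down each column (along the rows), 1 sorts across each row, and None flattens the array before sorting. The default is -1, which sorts along the last axis.
b. kind: the sorting algorithm: 'quicksort', 'mergesort', or 'heapsort' (recent NumPy versions also accept 'stable'). The default is 'quicksort'.
c. order: for structured arrays, the field name(s) to sort by. The default is None.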

import numpy as np

x = np.random.rand(3, 4) * 10
x = np.around(x, 2)
print(x)
# Example output (no seed is set, so the values differ from run to run):
# [[0.29 6.39 5.79 6.27]
#  [8.81 5.21 2.74 6.04]
#  [7.06 6.98 0.57 0.17]]

y = np.sort(x)
print(y)
#[[0.29 5.79 6.27 6.39]
#[2.74 5.21 6.04 8.81]
#[0.17 0.57 6.98 7.06]]

y = np.sort(x,axis=0)
print(y)
#[[0.29 5.21 0.57 0.17]
# [7.06 6.39 2.74 6.04]
# [8.81 6.98 5.79 6.27]]

y = np.sort(x,axis=1)
print(y)

#[[0.29 5.79 6.27 6.39]
# [2.74 5.21 6.04 8.81]
# [0.17 0.57 6.98 7.06]]

import numpy as np

# np.int was removed in NumPy 1.24; the built-in int is used instead.
dt = np.dtype([('name', 'S10'), ('age', int)])
a = np.array([("Mike", 21), ("Nancy", 25), ("Bob", 17), ("Jane", 27)], dtype=dt)
b = np.sort(a, order='name')
print(b)
# [(b'Bob', 17) (b'Jane', 27) (b'Mike', 21) (b'Nancy', 25)]
b = np.sort(a, order='age')
print(b)
# [(b'Bob', 17) (b'Mike', 21) (b'Nancy', 25) (b'Jane', 27)]
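
The examples above do not cover axis=None or descending order. The following is a minimal sketch of both, using a small hand-written array (the array values are illustrative assumptions, not data from the article). np.sort has no descending option, so reversing the sorted axis is one common workaround.

```python
import numpy as np

x = np.array([[3, 1, 2],
              [9, 7, 8]])

# axis=None flattens the array before sorting.
flat_sorted = np.sort(x, axis=None)
print(flat_sorted)
# [1 2 3 7 8 9]

# Sort each row ascending, then reverse the columns to get descending rows.
row_desc = np.sort(x, axis=1)[:, ::-1]
print(row_desc)
# [[3 2 1]
#  [9 8 7]]
```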

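[Example 1-2] numpy.argsort: returns the index positions of the sorted order instead of the sorted values

numpy.argsort(a[, axis=-1, kind='quicksort', order=None]) Returns the indices that would sort an array.

Performs an indirect sort along the given axis with the specified sorting algorithm and returns an array of indices. Indexing the original array with this index array produces the sorted array.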

import numpy as np
np.random.seed(20200612)
x = np.random.randint(0, 10, 10)
print(x)
# [6 1 8 5 5 4 1 2 9 1]
y = np.argsort(x)
print(y)
# [1 6 9 7 5 3 4 0 2 8]
print(x[y])
# [1 1 1 2 4 5 5 6 8 9]

y = np.argsort(-x)
print(y)
# [8 2 0 3 4 5 7 1 6 9]
print(x[y])
# [9 8 6 5 5 4 2 1 1 1]

The two-dimensional case:

x = np.random.rand(3,4) * 10
x = np.around(x, 2)
print(x)
#array([[ 8.93,  6.53,  2.55,  9.86],
#       [10.  ,  6.22,  5.53,  5.8 ],
#       [ 4.85,  2.08,  4.96,  0.02]])

y = np.argsort(x)
print(y)
#array([[2, 1, 0, 3],
#       [2, 3, 1, 0],
#       [3, 1, 0, 2]], dtype=int64)

y = np.argsort(x, axis=0)
print(y)
# [[2 2 0 2]
#  [0 1 2 1]
#  [1 0 1 0]]

y = np.argsort(x, axis=1)
print(y)
# [[2 1 0 3]
#  [2 3 1 0]
#  [3 1 0 2]]


y = np.array([np.take(x[i], np.argsort(x[i])) for i in range(3)])
# numpy.take(a, indices, axis=None, out=None, mode='raise') takes elements from an array along an axis.
print(y)

#[[ 2.55  6.53  8.93  9.86]
# [ 5.53  5.8   6.22 10.  ]
# [ 0.02  2.08  4.85  4.96]]
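
The row-by-row list comprehension above can also be written without a Python loop. The sketch below assumes NumPy 1.15 or later, where np.take_along_axis is available, and reuses the sample values printed earlier as a fixed array.

```python
import numpy as np

x = np.array([[8.93, 6.53, 2.55, 9.86],
              [10.0, 6.22, 5.53, 5.80],
              [4.85, 2.08, 4.96, 0.02]])

# Indices that sort each row, then apply them along the same axis.
idx = np.argsort(x, axis=1)
y = np.take_along_axis(x, idx, axis=1)
print(y)

# The result matches a direct sort along axis 1.
same = np.array_equal(y, np.sort(x, axis=1))
print(same)
# True
```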

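[Example 1-3] numpy.lexsort: sorting by one or more keys

numpy.lexsort(keys[, axis=-1]) Perform an indirect stable sort using a sequence of keys.

Given multiple sorting keys, which can be thought of as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is the primary sort key, the second-to-last key is the secondary sort key, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2D array is provided for keys, its rows are interpreted as the sorting keys, and sorting is done according to the last row, then the second-to-last row, and so on.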

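# Sort whole rows by the first column, ascending or descending.
x = np.random.rand(3, 3) * 10
x = np.around(x, 2)
print(x)
# Example output:
# [[3.41 1.23 8.99]
#  [9.31 3.03 3.63]
#  [0.78 0.85 4.22]]

index = np.lexsort([x[:, 0]])  # ascending order of the first column
print(index)
# [2 0 1]
y = x[index]
print(y)
# [[0.78 0.85 4.22]
#  [3.41 1.23 8.99]
#  [9.31 3.03 3.63]]

index = np.lexsort([-1 * x[:, 0]])  # descending order of the first column
print(index)
# [1 0 2]
y = x[index]
print(y)
# [[9.31 3.03 3.63]
#  [3.41 1.23 8.99]
#  [0.78 0.85 4.22]]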

When keys is given as a 2D array (a sequence of several keys):

import numpy as np
x = np.array([1, 5, 1, 4, 3, 4, 4])
y = np.array([9, 4, 0, 4, 0, 2, 1])

z = np.lexsort([y, x])
print(z)
# [2 0 4 6 5 3 1]
print(x[z])
# [1 1 3 4 4 4 5]

z = np.lexsort([x, y])
print(z)
# [2 4 6 5 3 1 0]
print(y[z])
# [0 0 1 2 4 4 9]
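
As a further illustration of the "last key is primary" rule, here is a small sketch that sorts the rows of a table by two columns. The table and its column meanings (group id, score) are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical table: column 0 is a group id, column 1 is a score.
table = np.array([[2, 30],
                  [1, 20],
                  [2, 10],
                  [1, 40]])

# The last key is the primary one: sort by group id, then by score within a group.
order = np.lexsort((table[:, 1], table[:, 0]))
print(order)
# [1 3 2 0]
print(table[order])
# [[ 1 20]
#  [ 1 40]
#  [ 2 10]
#  [ 2 30]]

# Passing a 2D array: its rows are the keys, so transpose and reverse the
# row order to make column 0 the last (primary) key.
order_2d = np.lexsort(table.T[::-1])
print(order_2d)
# [1 3 2 0]
```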

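[Example 1-4] numpy.partition(a, kth, axis=-1, kind='introselect', order=None)

Uses the element that would sit at index kth in the sorted array as a pivot and splits the elements into two parts: elements smaller than the pivot are placed before it, and elements equal to or greater than it are placed after it. The order within each part is not defined. This is similar to the partition step of quicksort.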

np.random.seed(100)
x = np.random.randint(1, 30, [8, 3])
print(x)
# [[ 9 25 4]
# [ 8 24 16]
# [17 11 21]
# [ 3 22 3]
# [ 3 15 3]
# [18 17 25]
# [16 5 12]
# [29 27 17]]

y = np.sort(x, axis=0)  # sort along axis 0 (each column)
print(y)
# [[ 3 5 3]
# [ 3 11 3]
# [ 8 15 4]
# [ 9 17 12]
# [16 22 16]
# [17 24 17]
# [18 25 21]
# [29 27 25]]

z = np.partition(x, kth=2, axis=0)
print(z)
# [[ 3 5 3]
# [ 3 11 3]
# [ 8 15 4]
# [ 9 22 21]
# [17 24 16]
# [18 17 25]
# [16 25 12]
# [29 27 17]]

Select the third-smallest value in each column:

np.random.seed(100)
x = np.random.randint(1, 30, [8, 3])
print(x)
# [[ 9 25 4]
# [ 8 24 16]
# [17 11 21]
# [ 3 22 3]
# [ 3 15 3]
# [18 17 25]
# [16 5 12]
# [29 27 17]]
z = np.partition(x, kth=2, axis=0)
print(z[2])
# [ 8 15 4]

Select the third-largest value in each column:

import numpy as np
np.random.seed(100)
x = np.random.randint(1, 30, [8, 3])
print(x)
# [[ 9 25 4]
# [ 8 24 16]
# [17 11 21]
# [ 3 22 3]
# [ 3 15 3]
# [18 17 25]
# [16 5 12]
# [29 27 17]]
z = np.partition(x, kth=-3, axis=0)
print(z[-3])
# [17 24 17]
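
The guarantee that partition gives can be checked directly. The sketch below uses a small 1-D array with distinct values (chosen here for illustration) and verifies that only position k is certain to hold its final sorted value, while everything before it is no larger and everything after it is no smaller.

```python
import numpy as np

a = np.array([7, 2, 9, 4, 1, 8, 3])
k = 3
p = np.partition(a, k)

# Position k holds the value it would have in the fully sorted array.
print(p[k])
# 4

# Elements before k are <= p[k]; elements from k onward are >= p[k].
left_ok = bool(np.all(p[:k] <= p[k]))
right_ok = bool(np.all(p[k:] >= p[k]))
print(left_ok, right_ok)
# True True

# The k smallest values are in p[:k], but unordered; sort them if order matters.
smallest = np.sort(p[:k])
print(smallest)
# [1 2 3]
```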

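[Example 1-5] numpy.argpartition(a, kth, axis=-1, kind='introselect', order=None)

Works like numpy.partition, but returns the indices of the elements instead of their values.

np.random.seed(100)
x = np.random.randint(1, 30, [8, 3])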
print(x)
# [[ 9 25 4]
# [ 8 24 16]
# [17 11 21]
# [ 3 22 3]
# [ 3 15 3]
# [18 17 25]
# [16 5 12]
# [29 27 17]]

y = np.argsort(x, axis=0)
print(y)
# [[3 6 3]
# [4 2 4]
# [1 4 0]
# [0 5 6]
# [6 3 1]
# [2 1 7]
# [5 0 2]
# [7 7 5]]
z = np.argpartition(x, kth=2, axis=0)
print(z)
# [[3 6 3]
# [4 2 4]
# [1 4 0]
# [0 3 2]
# [2 1 1]
# [5 5 5]
# [6 0 6]
# [7 7 7]]

Select the index of the third-smallest value in each column:

np.random.seed(100)
x = np.random.randint(1, 30, [8, 3])
print(x)
# [[ 9 25 4]
# [ 8 24 16]
# [17 11 21]
# [ 3 22 3]
# [ 3 15 3]
# [18 17 25]
# [16 5 12]
# [29 27 17]]
z = np.argpartition(x, kth=2, axis=0)
print(z[2])
# [1 4 0]
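
A common use of argpartition is finding the indices of the k largest values without sorting the whole array. The following is a minimal sketch under the assumption of a 1-D array with distinct values; the scores array is made up for illustration.

```python
import numpy as np

scores = np.array([0.3, 0.9, 0.1, 0.7, 0.5])
k = 2

# The last k positions hold the indices of the k largest values, in no particular order.
top = np.argpartition(scores, -k)[-k:]

# Order those k indices by descending score.
top = top[np.argsort(scores[top])[::-1]]
print(top)
# [1 3]
print(scores[top])
# [0.9 0.7]
```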

2. Searching

[Example 2-1] numpy.argmax(a[, axis=None, out=None])

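import numpy as np
a = np.array([[1, 5, 5, 2],
              [9, 6, 2, 8],
              [3, 7, 9, 1]])
b = np.argmax(a, axis=0)  # axis=0 is the first index in a[i][j], so the maximum is searched down each column
# Column 1 of a is 1, 9, 3: the maximum is 9, at position 1.
# Column 2 of a is 5, 6, 7: the maximum is 7, at position 2.
# And so on; a has 4 columns, so b has 4 entries.
print(b)  # [1 2 2 1]

c = np.argmax(a, axis=1)  # axis=1 is the second index in a[i][j], so the maximum is searched along each row
# Row 1 of a is 1, 5, 5, 2: the maximum is 5 (it occurs twice; the first occurrence is used), at index 1.
# Row 2 of a is 9, 6, 2, 8: the maximum is 9, at index 0.
# a has 3 rows, so c has 3 entries.
print(c)  # [1 0 2]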

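[Example 2-2] numpy.argmin(a[, axis=None, out=None])

Returns the index of the minimum value.

import numpy as np
x = np.random.rand(5, 5) * 10  # no seed is set here, so the values below are one sample run
x = np.around(x, 2)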
print(x)
# [[2.32 7.54 9.78 1.73 6.22]
# [6.93 5.17 9.28 9.76 8.25]
# [0.01 4.23 0.19 1.73 9.27]
# [7.99 4.97 0.88 7.32 4.29]
# [9.05 0.07 8.95 7.9 6.99]]

y = np.argmin(x)
print(y) # 10
y = np.argmin(x, axis=0)
print(y)
# [2 4 2 0 3]
y = np.argmin(x, axis=1)
print(y)
# [3 1 0 2 1]
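
Without an axis argument, argmin (and argmax) returns an index into the flattened array, which is why the result above is a single integer. A small sketch of converting such a flat index back to row and column coordinates with np.unravel_index follows; the 3x3 array is an illustrative subset of the values shown above.

```python
import numpy as np

x = np.array([[2.32, 7.54, 9.78],
              [6.93, 5.17, 9.28],
              [0.01, 4.23, 0.19]])

# Index of the minimum in the flattened array.
flat = int(np.argmin(x))
print(flat)
# 6

# Convert the flat index to (row, column) coordinates for this shape.
row, col = np.unravel_index(flat, x.shape)
print(int(row), int(col))
# 2 0
```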

[Example 2-3] numpy.where(condition, [x=None, y=None])

Where condition holds, the output takes the value from x; otherwise it takes the value from y.

import numpy as np
x = np.arange(10)
print(x)
# [0 1 2 3 4 5 6 7 8 9]
y = np.where(x < 5, x, 10 * x)
print(y)
# [ 0 1 2 3 4 50 60 70 80 90]
x = np.array([[0, 1, 2],
			[0, 2, 4],
			[0, 3, 6]])
y = np.where(x < 4, x, -1)
print(y)
# [[ 0 1 2]
# [ 0 2 -1]
# [ 0 3 -1]]

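With only condition and no x or y, np.where returns the coordinates of the elements that satisfy the condition (that is, the nonzero elements), which is equivalent to numpy.nonzero. The coordinates are returned as a tuple: it contains one array per dimension of the input, each holding the coordinates of the matching elements along that dimension.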

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.where(x > 5)
print(y)
# (array([5, 6, 7], dtype=int64),)
print(x[y])
# [6 7 8]

x = np.array([[11, 12, 13, 14, 15],
			[16, 17, 18, 19, 20],
			[21, 22, 23, 24, 25],
			[26, 27, 28, 29, 30],
			[31, 32, 33, 34, 35]])
y = np.where(x > 25)
print(y)
# (array([3, 3, 3, 3, 3, 4, 4, 4, 4, 4], dtype=int64), array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4], dtype=int64))
print(x[y])
# [26 27 28 29 30 31 32 33 34 35]
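
To make the equivalence with numpy.nonzero concrete, the sketch below compares the two on a small array and also shows np.argwhere, which returns the same coordinates grouped as one (row, column) pair per match. The array is a made-up example.

```python
import numpy as np

x = np.array([[11, 12, 13],
              [26, 27, 10]])

# One index array per dimension.
rows, cols = np.where(x > 12)
print(rows)
# [0 1 1]
print(cols)
# [2 0 1]

# np.nonzero on the boolean mask gives the same tuple.
r2, c2 = np.nonzero(x > 12)

# np.argwhere groups the coordinates by element instead of by dimension.
pairs = np.argwhere(x > 12)
print(pairs)
# [[0 2]
#  [1 0]
#  [1 1]]
```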

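[Example 2-4] numpy.searchsorted(a, v[, side='left', sorter=None])

a. a: a 1-D input array. When sorter is None, a must be sorted in ascending order; otherwise sorter must be provided and hold the indices that sort a into ascending order.
b. v: the value(s) to insert into a; may be a single element, a list, or an ndarray.
c. side: the search direction. With 'left', the returned index i satisfies a[i-1] < v <= a[i] (the first suitable position); with 'right', it satisfies a[i-1] <= v < a[i] (the position just after the last element equal to v).
d. sorter: a 1-D array of indices into a such that a[sorter] is in ascending order (typically the result of argsort).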

x = np.array([0, 1, 5, 9, 11, 18, 26, 33])
y = np.searchsorted(x, 15)
print(y) # 5  (insertion index)
y = np.searchsorted(x, 15, side='right')
print(y) # 5
y = np.searchsorted(x, 0, side='right')
print(y) # 1
y = np.searchsorted(x, 33)
print(y) # 7
y = np.searchsorted(x, 33, side='right')
print(y) # 8

# Inserting an array of values
x = np.array([0, 1, 5, 9, 11, 18, 26, 33])
y = np.searchsorted(x, [-1, 0, 11, 15, 33, 35])
print(y) # [0 0 4 5 7 8]  (insertion indices)
y = np.searchsorted(x, [-1, 0, 11, 15, 33, 35], side='right')
print(y) # [0 1 5 5 8 8]


x = np.array([0, 1, 5, 9, 11, 18, 26, 33])
np.random.shuffle(x)
print(x) # [33  0  5 11  1  9 18 26]
x_sort = np.argsort(x)  # indices that sort x into ascending order
print(x_sort) # [1 4 2 5 3 6 7 0]
y = np.searchsorted(x, [-1, 0, 11, 15, 33, 35], sorter=x_sort)
print(y) # [0 0 4 5 7 8]
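
The difference between side='left' and side='right' matters most when the array contains duplicates. As a small sketch (the array is made up for illustration), the gap between the two insertion points gives the number of occurrences of a value in a sorted array.

```python
import numpy as np

a = np.array([1, 2, 2, 2, 5, 7])  # must already be sorted in ascending order

left = int(np.searchsorted(a, 2, side='left'))    # before the first 2
right = int(np.searchsorted(a, 2, side='right'))  # after the last 2
print(left, right)
# 1 4
count = right - left
print(count)
# 3

# For a value that is absent, both sides agree and the count is 0.
missing = int(np.searchsorted(a, 3, side='right') - np.searchsorted(a, 3, side='left'))
print(missing)
# 0
```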

3. Counting

[Example 3-1] numpy.count_nonzero(a, axis=None)

Returns the number of nonzero elements in the array.

x = np.count_nonzero(np.eye(4))
print(x) # 4
x = np.count_nonzero([[0, 1, 7, 0, 0], [3, 0, 0, 2, 19]])
print(x) # 5
x = np.count_nonzero([[0, 1, 7, 0, 0], [3, 0, 0, 2, 19]], axis=0)
print(x) # [1 1 1 1 1]
x = np.count_nonzero([[0, 1, 7, 0, 0], [3, 0, 0, 2, 19]], axis=1)
print(x) # [2 3]
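
Because True counts as nonzero, count_nonzero can also count the elements that satisfy a condition. The sketch below reuses the array from the example above; the threshold of 2 is an arbitrary choice for illustration.

```python
import numpy as np

x = np.array([[0, 1, 7, 0, 0],
              [3, 0, 0, 2, 19]])

# Count the elements greater than 2 in the whole array.
n_big = int(np.count_nonzero(x > 2))
print(n_big)
# 3

# The same count per row.
per_row = np.count_nonzero(x > 2, axis=1)
print(per_row)
# [1 2]

# Number of zeros: total size minus the nonzero count.
n_zero = int(x.size - np.count_nonzero(x))
print(n_zero)
# 5
```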