numpy TASK5: Sorting, Searching, and Counting

m0_49190544

Article link: https://blog.csdn.net/m0_49190544/article/details/109404572

TASK5: Sorting, Searching, and Counting

5.1 Sorting
- 5.1.1 numpy.sort()
- 5.1.2 numpy.argsort()
5.2 Searching
5.3 Counting
- 5.3.1 numpy.count_nonzero()
5.4 Set operations

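5.1 Sorting

5.1.1 numpy.sort()

numpy.sort(a[, axis=-1, kind='quicksort', order=None]) returns a sorted copy of the array.

axis: the axis along which to sort. For a 2-D array, axis=0 sorts each column (moving down the rows) and axis=1 sorts each row (moving across the columns); None flattens the array before sorting. The default is -1, which sorts along the last axis.

kind: the sorting algorithm. The options include 'quicksort', 'mergesort' (merge sort), and 'heapsort'; current NumPy also accepts 'stable'. The default is 'quicksort'.

order: for arrays with named fields, the field name (or list of field names) to sort by. The default is None.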

import numpy as np

np.random.seed(20200612)
x = np.random.rand(5, 5) * 10
x = np.around(x, 2)
print(x)
[[2.32 7.54 9.78 1.73 6.22]
[6.93 5.17 9.28 9.76 8.25]
[0.01 4.23 0.19 1.73 9.27]
[7.99 4.97 0.88 7.32 4.29]
[9.05 0.07 8.95 7.9 6.99]]
y = np.sort(x)
print(y)
[[1.73 2.32 6.22 7.54 9.78]
[5.17 6.93 8.25 9.28 9.76]
[0.01 0.19 1.73 4.23 9.27]
[0.88 4.29 4.97 7.32 7.99]
[0.07 6.99 7.9 8.95 9.05]]
y = np.sort(x, axis=0)
print(y)
[[0.01 0.07 0.19 1.73 4.29]
[2.32 4.23 0.88 1.73 6.22]
[6.93 4.97 8.95 7.32 6.99]
[7.99 5.17 9.28 7.9 8.25]
[9.05 7.54 9.78 9.76 9.27]]
y = np.sort(x, axis=1)
print(y)
[[1.73 2.32 6.22 7.54 9.78]
[5.17 6.93 8.25 9.28 9.76]
[0.01 0.19 1.73 4.23 9.27]
[0.88 4.29 4.97 7.32 7.99]
[0.07 6.99 7.9 8.95 9.05]]
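
The effect of axis is easier to see on a small array. The following is a minimal sketch on a made-up 2x3 array (the values are illustrative and not from the example above); it also shows one common way to get a descending sort, which numpy.sort does not offer directly.

```python
import numpy as np

x = np.array([[3, 1, 8],
              [9, 0, 2]])

by_row = np.sort(x, axis=1)         # each row sorted left to right
by_col = np.sort(x, axis=0)         # each column sorted top to bottom
flat = np.sort(x, axis=None)        # flattened first, then sorted
desc = np.sort(x, axis=1)[:, ::-1]  # descending: sort ascending, then reverse

print(by_row)  # [[1 3 8] [0 2 9]]
print(by_col)  # [[3 0 2] [9 1 8]]
print(flat)    # [0 1 2 3 8 9]
print(desc)    # [[8 3 1] [9 2 0]]
```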

Sorting by a specified field

# np.int was removed in NumPy 1.24; the built-in int is used instead
dt = np.dtype([('name', 'S10'), ('age', int)])
a = np.array([("Mike", 21), ("Nancy", 25), ("Bob", 17), ("Jane", 27)], dtype=dt)
print('-'*10+'Sorted by name'+'-'*10)
b = np.sort(a, order='name')
print(b)
print('-'*10+'Sorted by age'+'-'*10)
b = np.sort(a, order='age')
print(b)
----------Sorted by name----------
[(b'Bob', 17) (b'Jane', 27) (b'Mike', 21) (b'Nancy', 25)]
----------Sorted by age----------
[(b'Bob', 17) (b'Mike', 21) (b'Nancy', 25) (b'Jane', 27)]
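
order also accepts a list of field names, in which case later fields break ties in earlier ones. A small sketch, assuming a Unicode 'U10' name field and made-up ages chosen so that ties occur:

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', int)])
people = np.array(
    [("Mike", 21), ("Nancy", 25), ("Bob", 21), ("Jane", 25)], dtype=dt
)

# Sort by age first; people with the same age are ordered by name.
by_age_then_name = np.sort(people, order=['age', 'name'])

print(by_age_then_name['name'])  # ['Bob' 'Mike' 'Jane' 'Nancy']
print(by_age_then_name['age'])   # [21 21 25 25]
```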

5.1.2 numpy.argsort()

numpy.argsort(a[, axis=-1, kind='quicksort', order=None]) returns the indices that would sort the array.

x = np.random.randint(0, 10, 10)
print('Before sorting')
print(x)
print('Indices returned by argsort')
y = np.argsort(x)
print(y)
print('Values taken in index order')
print(x[y])
Before sorting
[2 6 1 9 8 4 5 0 5 5]
Indices returned by argsort
[7 2 0 5 6 8 9 1 4 3]
Values taken in index order
[0 1 2 4 5 5 5 6 8 9]
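
A typical use of argsort is reordering one array by the values of another. The sketch below uses hypothetical names and scores arrays (not part of the original example) and reverses the index array to rank from highest to lowest.

```python
import numpy as np

names = np.array(['ann', 'ben', 'cat', 'dan'])
scores = np.array([72, 95, 60, 88])

# argsort gives ascending order; reversing it ranks from highest to lowest.
order = np.argsort(scores)[::-1]

print(order)          # [1 3 0 2]
print(names[order])   # ['ben' 'dan' 'ann' 'cat']
print(scores[order])  # [95 88 72 60]
```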

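numpy.lexsort() performs an indirect stable sort using a sequence of keys. The last key in the sequence is the primary sort key.

# Sort the rows by the values in the first column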
x = np.random.rand(5, 5) * 10
x = np.around(x, 2)
print(x)
[[9.14 8.2 7.39 4.22 0.2 ]
[4.48 8.68 5.55 2.62 1.92]
[6.09 4.34 9.97 3.2 1.4 ]
[0.51 0.9 3.08 3.75 0.84]
[6.37 7.01 5.07 2.91 0.64]]
index = np.lexsort([x[:, 0]])
print(index)
[3 1 2 4 0]
y = x[index]
print(y)
[[0.51 0.9 3.08 3.75 0.84]
[4.48 8.68 5.55 2.62 1.92]
[6.09 4.34 9.97 3.2 1.4 ]
[6.37 7.01 5.07 2.91 0.64]
[9.14 8.2 7.39 4.22 0.2 ]]
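
With a single key, lexsort behaves like a stable argsort. Its real use is sorting by several keys at once. A minimal sketch with two made-up key arrays, group and value; note that the primary key goes last in the tuple:

```python
import numpy as np

group = np.array([1, 0, 1, 0])
value = np.array([3, 5, 2, 4])

# Sort by group first, then by value within each group.
# The LAST key passed to lexsort is the primary key.
idx = np.lexsort((value, group))

print(idx)         # [3 1 2 0]
print(group[idx])  # [0 0 1 1]
print(value[idx])  # [4 5 2 3]
```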

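numpy.partition(a, kth, axis=-1, kind='introselect', order=None)
Rearranges the array so that the element at index kth is in the position it would occupy in a sorted array: all smaller elements are moved before it and all equal or larger elements after it. The order within each of the two parts is undefined. This resembles the partition step of quicksort.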

x = np.random.randint(1, 30, [8, 3])
print(x)
[[19 21 26]
[14 7 12]
[21 23 7]
[27 14 12]
[13 1 8]
[11 26 17]
[13 20 8]
[ 7 22 9]]
y = np.sort(x, axis=0)
print(y)
[[ 7 1 7]
[11 7 8]
[13 14 8]
[13 20 9]
[14 21 12]
[19 22 12]
[21 23 17]
[27 26 26]]
z = np.partition(x, kth=2, axis=0)
print(z)
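
Because the order inside each part is not guaranteed, it is safer to check the properties of a partition than to rely on a specific printed result. A small sketch on a made-up 1-D array:

```python
import numpy as np

x = np.array([7, 2, 9, 4, 1, 8])
k = 2

p = np.partition(x, k)

# The element at index k is where a full sort would put it ...
assert p[k] == np.sort(x)[k]
# ... everything before it is no larger, everything after it is no smaller.
assert (p[:k] <= p[k]).all()
assert (p[k:] >= p[k]).all()

# The k + 1 smallest values, sorted afterwards only for display.
smallest = np.sort(p[:k + 1])
print(smallest)  # [1 2 4]
```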

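numpy.argpartition(a, kth, axis=-1, kind='introselect', order=None)
Performs an indirect partition along the given axis using the algorithm specified by the kind keyword. It returns an array of indices with the same shape as a; indexing the data with it along the given axis yields the data in partitioned order.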

x = np.random.randint(1, 30, [8, 3])
print(x)
[[ 9 25 4]
[ 8 24 16]
[17 11 21]
[ 3 22 3]
[ 3 15 3]
[18 17 25]
[16 5 12]
[29 27 17]]
y = np.argsort(x, axis=0)
print(y)
[[3 6 3]
[4 2 4]
[1 4 0]
[0 5 6]
[6 3 1]
[2 1 7]
[5 0 2]
[7 7 5]]
z = np.argpartition(x, kth=2, axis=0)
print(z)
[[3 6 3]
[4 2 4]
[1 4 0]
[0 3 2]
[2 1 1]
[5 5 5]
[6 0 6]
[7 7 7]]
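
A common application of argpartition is finding the k largest elements without sorting the whole array. The sketch below is one way to do it on a made-up 1-D array; only the k selected elements are sorted at the end.

```python
import numpy as np

x = np.array([7, 2, 9, 4, 1, 8])
k = 3

# Indices of the k largest values, in no particular order.
top_idx = np.argpartition(x, -k)[-k:]

# Order just those k indices from largest to smallest value.
top_sorted = top_idx[np.argsort(x[top_idx])[::-1]]

print(top_sorted)     # [2 5 0]
print(x[top_sorted])  # [9 8 7]
```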

5.2 Searching

5.2.1 numpy.argmax()

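numpy.argmax(a[, axis=None, out=None]) returns the index of the maximum value along an axis. With axis=None (the default), the index refers to the flattened array.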

x = np.random.rand(5, 5) * 10
x = np.around(x, 2)
print(x)
[[5.62 0.06 3.07 9.5 1.27]
[0.79 3.11 6.32 6.99 6.42]
[9.2 2.99 5.69 1.79 5.33]
[6.47 1.42 5.81 4.79 3.86]
[4.4 4.05 4.42 0.3 7.76]]
# Index of the maximum value in the flattened array
y = np.argmax(x)
print(y)
3   # the maximum is 9.5, which sits at flattened index 3

5.2.2 numpy.argmin()

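numpy.argmin(a[, axis=None, out=None]) returns the index of the minimum value along an axis. With axis=None (the default), the index refers to the flattened array.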

x = np.random.rand(5, 5) * 10
x = np.around(x, 2)
print(x)
[[9.41 2.61 7.99 6.16 7.66]
[9.72 2.25 5.94 7.64 0.71]
[7.98 7.66 0.93 1.27 6.04]
[9.64 3.52 6.69 2.85 6.47]
[2.82 8.18 5.46 5.61 4.54]]
y = np.argmin(x)
print(y)
9    # the minimum is 0.71, the 10th element in row-major order, so the returned index is 9
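
Both functions return a flattened index by default. The sketch below, on a made-up 2x3 array, shows how np.unravel_index converts that into a (row, column) pair and how the axis argument gives per-column or per-row results.

```python
import numpy as np

x = np.array([[5, 0, 3],
              [1, 9, 2]])

flat_idx = np.argmax(x)                        # index into the flattened array
row, col = np.unravel_index(flat_idx, x.shape)  # convert to 2-D coordinates

col_max = np.argmax(x, axis=0)  # row index of the maximum in each column
row_min = np.argmin(x, axis=1)  # column index of the minimum in each row

print(flat_idx)  # 4
print(row, col)  # 1 1
print(col_max)   # [0 1 0]
print(row_min)   # [1 0]
```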

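5.2.3 numpy.nonzero()

numpy.nonzero(a) returns the indices of the non-zero elements, as one index array per dimension of a.

1. Only the non-zero elements of a contribute indices; zero elements do not appear in the result.
2. The result is a tuple of length a.ndim, and each item of the tuple is an integer array.
3. Each array gives the indices along one dimension. For example, if a is a 2-D array, the tuple contains two arrays: the first holds the row indices and the second holds the column indices.
4. np.transpose(np.nonzero(x)) regroups the result by element, giving one row of coordinates for each non-zero element.
5. a[np.nonzero(a)] returns all the non-zero values in a.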

x = np.array([0, 2, 3])
print(x)
[0 2 3]
y = np.nonzero(x)
print(y)
(array([1, 2], dtype=int64),)  # 2 and 3 are non-zero, so the returned indices are 1 and 2
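
The points in the numbered list above are easier to follow on a 2-D array. A minimal sketch with a made-up 3x3 matrix:

```python
import numpy as np

x = np.array([[1, 0, 0],
              [0, 2, 0],
              [1, 1, 0]])

rows, cols = np.nonzero(x)             # one index array per dimension
coords = np.transpose(np.nonzero(x))   # one (row, col) pair per non-zero element
values = x[np.nonzero(x)]              # the non-zero values themselves

print(rows)    # [0 1 2 2]
print(cols)    # [0 1 0 1]
print(coords)  # [[0 0] [1 1] [2 0] [2 1]]
print(values)  # [1 2 1 1]
```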

5.2.4 numpy.where()

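numpy.where(condition, [x=None, y=None]) returns elements chosen from x or y depending on condition.

# Where the condition holds, take the value from x; otherwise take it from y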
x = np.arange(10)
print(x)
[0 1 2 3 4 5 6 7 8 9]
# Example: keep values below 5 unchanged and multiply the rest (5 and above) by 10
y = np.where(x < 5, x, 10 * x)
print(y)
[ 0 1 2 3 4 50 60 70 80 90]
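
When only condition is passed, numpy.where returns the indices where it holds, in the same tuple form as numpy.nonzero. A short sketch on a made-up array, which also shows that x and y may be scalars:

```python
import numpy as np

x = np.array([3, 8, 1, 9])

idx = np.where(x > 5)          # condition only: a tuple of index arrays
flags = np.where(x > 5, 1, 0)  # scalars are broadcast against the condition

print(idx[0])  # [1 3]
print(flags)   # [0 1 0 1]
```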

5.2.5 numpy.searchsorted()

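numpy.searchsorted(a, v[, side='left', sorter=None]) finds the indices at which elements should be inserted to keep the array sorted.
1. a: a 1-D input array. When sorter is None, a must be sorted in ascending order; otherwise sorter must be supplied as an array of indices that sort a into ascending order.
2. v: the value or values to insert into a; it can be a single element, a list, or an ndarray.
3. side: which insertion point to return when v equals existing elements. 'left' returns the first suitable index i, with a[i-1] < v <= a[i]; 'right' returns the last suitable index i, with a[i-1] <= v < a[i].
4. sorter: a 1-D array of indices that sort a into ascending order (typically the result of np.argsort(a)).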

x = np.array([0, 1, 5, 9, 11, 18, 26, 33])
# Inserting 15 while keeping the order means placing it between 11 and 18, so the insertion index is 5
y = np.searchsorted(x, 15)
print(y)
5
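
The side and sorter parameters only matter in specific cases: side when v equals values already in a, and sorter when a itself is not sorted. A small sketch with made-up arrays covering both:

```python
import numpy as np

a = np.array([1, 2, 2, 2, 5])

left = np.searchsorted(a, 2)                 # before the run of 2s
right = np.searchsorted(a, 2, side='right')  # after the run of 2s
many = np.searchsorted(a, [0, 2, 6])         # several values at once

# For an unsorted array, pass the indices that would sort it.
unsorted = np.array([5, 1, 2])
sorter = np.argsort(unsorted)
pos = np.searchsorted(unsorted, 3, sorter=sorter)  # position within sorted order

print(left, right)  # 1 4
print(many)         # [0 1 5]
print(pos)          # 2
```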

5.3 Counting

5.3.1 numpy.count_nonzero()

numpy.count_nonzero(a, axis=None) counts the number of non-zero values in the array a.

x = np.count_nonzero(np.eye(4))
print(x)
4
x = np.count_nonzero([[0, 1, 7, 0, 0], [3, 0, 0, 2, 19]])
print(x) 
5
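
The axis argument gives per-column or per-row counts, and because True counts as non-zero, the same function can count elements that satisfy a condition. A sketch reusing the 2x5 array from above:

```python
import numpy as np

x = np.array([[0, 1, 7, 0, 0],
              [3, 0, 0, 2, 19]])

per_col = np.count_nonzero(x, axis=0)  # non-zero entries in each column
per_row = np.count_nonzero(x, axis=1)  # non-zero entries in each row
above_2 = np.count_nonzero(x > 2)      # how many elements are greater than 2

print(per_col)  # [1 1 1 1 1]
print(per_row)  # [2 3]
print(above_2)  # 3
```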

5.4 Set operations

5.4.1 numpy.unique()

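numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None) finds the unique elements of an array and returns them sorted.
1. return_index=True also returns, for each unique value, the index of its first occurrence in the original array.
2. return_inverse=True also returns, for each element of the original array, its index in the unique array, so the original can be reconstructed.
3. return_counts=True also returns the number of times each unique value occurs in the original array.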

x = np.unique([1, 1, 3, 2, 3, 3])
print(x)
[1 2 3]
x = np.array([[1, 1], [2, 3]])
u = np.unique(x)
print(u)
[1 2 3]
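
The examples above use only the default return value. The following sketch requests all three optional outputs on the same 1-D input, to show what each one contains:

```python
import numpy as np

x = np.array([1, 1, 3, 2, 3, 3])

u, index, inverse, counts = np.unique(
    x, return_index=True, return_inverse=True, return_counts=True
)

print(u)        # [1 2 3]
print(index)    # [0 3 2]        first position of each unique value in x
print(inverse)  # [0 0 2 1 2 2]  position of each x element within u
print(counts)   # [2 1 3]        occurrences of each unique value

# The inverse indices rebuild the original array from the unique values.
rebuilt = u[inverse]
print(rebuilt)  # [1 1 3 2 3 3]
```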

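5.4.2 numpy.intersect1d()

Computes the intersection of two sets. With return_indices=True, it also returns the indices of the first occurrences of the common values in each input.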

x = np.array([1, 1, 2, 3, 4])
y = np.array([2, 1, 4, 6])
xy, x_ind, y_ind = np.intersect1d(x, y, return_indices=True)
print(x_ind)  
print(y_ind) 
print(xy)  
print(x[x_ind])  
print(y[y_ind])  
[0 2 4]
[1 0 2]
[1 2 4]
[1 2 4]
[1 2 4]

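5.4.3 numpy.union1d()

Computes the union of two sets. To combine more than two arrays, apply it repeatedly with functools.reduce.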

import numpy as np
from functools import reduce
x = np.union1d([-1, 0, 1], [-2, 0, 2])
print(x) 
x = reduce(np.union1d, ([1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]))
print(x)
[-2 -1 0 1 2]
[1 2 3 4 6]

5.4.4 numpy.setdiff1d()

Computes the set difference of two sets: the unique values in the first array that are not in the second.

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])
x = np.setdiff1d(a, b)
print(x)
[1 2]

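5.4.5 numpy.setxor1d()

Computes the symmetric difference (exclusive or) of two sets: the values that appear in exactly one of the two arrays.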

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])
x = np.setxor1d(a, b)
print(x)
[1 2 5 6]

5.4.6 numpy.in1d()

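numpy.in1d(ar1, ar2, assume_unique=False, invert=False) tests whether each element of a 1-D array is also present in a second array. It returns a boolean array of the same length as ar1 that is True where the element of ar1 is in ar2 and False otherwise. Recent NumPy releases recommend numpy.isin for the same purpose.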

test = np.array([0, 1, 2, 5, 0])
states = [0, 2]
mask = np.in1d(test, states)
print(mask) # [ True False True False True]
print(test[mask]) # [0 2 0]
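
For newer code, the same membership test can be written with np.isin, which takes the same two arrays and, unlike in1d, keeps the shape of the first argument. A minimal sketch repeating the example above and adding the invert option:

```python
import numpy as np

test = np.array([0, 1, 2, 5, 0])
states = [0, 2]

mask = np.isin(test, states)                   # True where the element is in states
not_mask = np.isin(test, states, invert=True)  # the complementary test

print(mask)            # [ True False  True False  True]
print(test[mask])      # [0 2 0]
print(test[not_mask])  # [1 5]

# isin preserves the shape of a multi-dimensional input.
grid = np.array([[0, 1], [2, 5]])
grid_mask = np.isin(grid, states)
print(grid_mask.shape)  # (2, 2)
```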