Reading notes: RANDOMIZED-SELECT (the quickselect algorithm) from "Introduction to Algorithms"

Latest recommended article published on 2020-05-12 12:41:05

QilongPan

Category: Java basics. Tags: java


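Motivation

I needed this method for the following problem:

Given n integers, output the k smallest.
For example, given the 8 numbers 1, 2, 3, 4, 5, 6, 7 and 8, the 4 smallest are 1, 2, 3 and 4.

The RANDOMIZED-SELECT algorithm finds the k-th smallest element in expected O(n) time, which is enough to solve this problem in expected O(n) time.
I will not prove this here; for the proof, see:

程序员编程艺术：第三章、寻找最小的k个数 (The Art of Programming, Chapter 3: Finding the k smallest numbers)

For other solutions to this problem, see:

总结：来自"v_JULY_v"的微软面试100题（2010年）的第五题 (Summary: problem 5 of the 100 Microsoft interview questions (2010) from "v_JULY_v")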

Principle

The book is Introduction to Algorithms, Third Edition (uploaded to a network drive); the Chinese translation of its second edition is widely known as 算法导论.

Quicksort

Quicksort has to come first, because RANDOMIZED-SELECT (a selection algorithm, not a sort) reuses quicksort's partitioning step.
Here is how the book describes quicksort:
(Excerpted from Section 7.1, Description of quicksort)

Quicksort, like merge sort, applies the divide-and-conquer paradigm introduced in Section 2.3.1. Here is the three-step divide-and-conquer process for sorting a typical subarray A[p..r]:

Divide: Partition (rearrange) the array A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is less than or equal to A[q], which is, in turn, less than or equal to each element of A[q+1..r]. Compute the index q as part of this partitioning procedure.
Conquer: Sort the two subarrays A[p..q-1] and A[q+1..r] by recursive calls to quicksort.
Combine: Because the subarrays are already sorted, no work is needed to combine them: the entire array A[p..r] is now sorted.

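A few points I think are worth emphasizing:

The basic idea is to pick one element as the pivot, move the elements no larger than it to its left and the larger ones to its right, and then recursively apply quicksort to the subarrays on the left and on the right of the pivot.
Once the elements on both sides of the pivot have been settled, the pivot is already in its final position.

In the book's pseudocode, QUICKSORT(A, p, r) does nothing unless p < r; otherwise it computes q = PARTITION(A, p, r) and then calls QUICKSORT(A, p, q-1) and QUICKSORT(A, q+1, r). The arguments are the array, the index of the first element, and the index of the last element, so the initial call is QUICKSORT(A, 1, A.length) (note that arrays in the pseudocode are indexed from 1).

Additional notes:

The procedure is a good illustration of recursion.
q is an index.

The process is abstract and hard to picture, so here is the book's worked example of PARTITION: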

The operation of PARTITION on a sample array. Array entry A[r] becomes the pivot element x. Lightly shaded array elements are all in the first partition with values no greater than x. Heavily shaded elements are in the second partition with values greater than x. The unshaded elements have not yet been put in one of the first two partitions, and the final white element is the pivot x.

(a) The initial array and variable settings. None of the elements have been placed in either of the first two partitions.
(b) The value 2 is “swapped with itself” and put in the partition of smaller values.
(c)–(d) The values 8 and 7 are added to the partition of larger values.
(e) The values 1 and 8 are swapped, and the smaller partition grows.
(f) The values 3 and 7 are swapped, and the smaller partition grows.
(g)–(h) The larger partition grows to include 5 and 6, and the loop terminates.
(i) In lines 7–8, the pivot element is swapped so that it lies between the two partitions.
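The steps (a) through (i) can be replayed in code. The following is a minimal sketch, assuming the sample array is 2 8 7 1 3 5 6 4 (the values named in the walkthrough) and using 0-based indices instead of the book's 1-based ones. The class name PartitionTrace is made up for this illustration.

```java
import java.util.Arrays;

public class PartitionTrace {
    // Partition a[p..r] around the pivot a[r]; returns the pivot's final index.
    static int partition(int[] a, int p, int r) {
        int x = a[r];   // pivot value
        int i = p - 1;  // last index of the "no greater than x" region
        for (int j = p; j < r; j++) {
            if (a[j] <= x) {
                i++;
                swap(a, i, j); // grow the smaller partition
            }
            // Show the state after each loop iteration.
            System.out.println("j=" + j + " i=" + i + " " + Arrays.toString(a));
        }
        swap(a, i + 1, r); // place the pivot between the two partitions
        return i + 1;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {2, 8, 7, 1, 3, 5, 6, 4};
        int q = partition(a, 0, a.length - 1);
        // Last line printed: q=3 [2, 1, 3, 4, 7, 5, 6, 8]
        System.out.println("q=" + q + " " + Arrays.toString(a));
    }
}
```

The pivot 4 ends at index 3 (position 4 in the book's 1-based numbering), with 2, 1, 3 on its left and 7, 5, 6, 8 on its right.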

RANDOMIZED PARTITION

(Excerpted from Section 7.3, A randomized version of quicksort)

In exploring the average-case behavior of quicksort, we have made an assumption that all permutations of the input numbers are equally likely. In an engineering situation, however, we cannot always expect this assumption to hold. (See Exercise 7.2-4.) As we saw in Section 5.3, we can sometimes add randomization to an algorithm in order to obtain good expected performance over all inputs. Many people regard the resulting randomized version of quicksort as the sorting algorithm of choice for large enough inputs.

In Section 5.3, we randomized our algorithm by explicitly permuting the input. We could do so for quicksort also, but a different randomization technique, called random sampling, yields a simpler analysis. Instead of always using A[r] as the pivot, we will select a randomly chosen element from the subarray A[p..r]. We do so by first exchanging element A[r] with an element chosen at random from A[p..r]. By randomly sampling the range p..r, we ensure that the pivot element x = A[r] is equally likely to be any of the r-p+1 elements in the subarray. Because we randomly choose the pivot element, we expect the split of the input array to be reasonably well balanced on average.

The changes to PARTITION and QUICKSORT are small. In the new partition procedure, we simply implement the swap before actually partitioning:

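In the book's pseudocode, RANDOMIZED-PARTITION(A, p, r) sets i = RANDOM(p, r), exchanges A[r] with A[i], and returns PARTITION(A, p, r).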
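A minimal Java sketch of this randomized partition step, assuming 0-based indices. The class name RandomizedPartitionSketch is made up for this illustration, and ThreadLocalRandom is just one convenient way to draw a uniform index in [p, r]; the book only requires some RANDOM(p, r).

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class RandomizedPartitionSketch {
    // Swap a uniformly random element of a[p..r] into a[r], then partition as usual.
    static int randomizedPartition(int[] a, int p, int r) {
        // nextInt(origin, bound) returns a value in [origin, bound), so the bound is r + 1.
        int i = ThreadLocalRandom.current().nextInt(p, r + 1);
        swap(a, i, r);
        return partition(a, p, r);
    }

    // Deterministic partition around a[r]; returns the pivot's final index.
    static int partition(int[] a, int p, int r) {
        int x = a[r];
        int i = p - 1;
        for (int j = p; j < r; j++) {
            if (a[j] <= x) {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, r);
        return i + 1;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {2, 8, 7, 1, 3, 5, 6, 4};
        int q = randomizedPartition(a, 0, a.length - 1);
        // The output varies from run to run because the pivot is random.
        System.out.println("pivot index " + q + ": " + Arrays.toString(a));
    }
}
```

Whatever pivot is drawn, every element left of the returned index is no greater than the pivot and every element to its right is greater.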

RANDOMIZED-SELECT

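This algorithm exists to handle one task: finding the k-th smallest (or largest) element of an unsorted array. Solving that task also solves the problem at the top of this post, because one more pass over the array then collects the k smallest elements.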
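The extra pass can be sketched as follows. This is an illustration under assumptions: the value kth is taken as given (it would come from a selection routine such as RANDOMIZED-SELECT), and the class and method names SmallestK and collectSmallest are made up here. Duplicates of the k-th smallest value are handled by copying only as many of them as are needed.

```java
import java.util.Arrays;

public class SmallestK {
    // Given kth = the k-th smallest value of a, return the k smallest values of a.
    static int[] collectSmallest(int[] a, int k, int kth) {
        int[] out = new int[k];
        int n = 0;
        // At most k-1 elements are strictly smaller than the k-th smallest value.
        for (int v : a) {
            if (v < kth) {
                out[n++] = v;
            }
        }
        // Fill the remaining slots with copies of kth (covers duplicated values).
        while (n < k) {
            out[n++] = kth;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        int kth = 4; // assumed to have been returned by a selection routine for k = 4
        System.out.println(Arrays.toString(collectSmallest(a, 4, kth))); // prints [1, 2, 3, 4]
    }
}
```

The result is not sorted in general; it lists the strictly smaller elements in their input order, followed by copies of the k-th smallest value.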

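Key idea: after one partitioning pass of quicksort, the pivot is in its final position, and we know that everything to its left is no larger than it and everything to its right is larger. In other words, we know the pivot's rank: it is the k-th smallest element for a known k. That is exactly what the task above needs.

(Pseudocode excerpted from Section 9.2, Selection in expected linear time)

A is the array, p is the index of its first element, r is the index of its last element, and i is the rank sought (the i-th smallest element).

Line 4 (k = q - p + 1): k is the rank of A[q] within A[p..r], i.e. A[q] is the k-th smallest element of the subarray.
Line 7 (the test i < k) is the crux: if i < k, the element we want lies to the left of q; if i > k, it lies to the right. RANDOMIZED-SELECT is then called recursively on that side only.
Note line 9: since i > k, the target is in the right subarray, where it is no longer the i-th smallest element but the (i-k)-th smallest.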
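The three notes above map directly onto code. Below is a sketch that replaces the recursion with a loop, which is possible because only one side is ever pursued. It is an illustrative variant, not the book's pseudocode; the class name IterativeSelectSketch is made up, indices are 0-based, and the rank i is 1-based with 1 <= i <= r-p+1 assumed.

```java
import java.util.concurrent.ThreadLocalRandom;

public class IterativeSelectSketch {
    // Returns the i-th smallest element of a[p..r]; rearranges a in the process.
    static int select(int[] a, int p, int r, int i) {
        while (p < r) {
            int q = randomizedPartition(a, p, r);
            int k = q - p + 1;      // rank of the pivot within a[p..r]
            if (i == k) {
                return a[q];        // the pivot is the answer
            } else if (i < k) {
                r = q - 1;          // continue in the left part, same rank
            } else {
                p = q + 1;          // continue in the right part...
                i -= k;             // ...where the target has rank i - k
            }
        }
        return a[p];                // a single element remains
    }

    static int randomizedPartition(int[] a, int p, int r) {
        swap(a, ThreadLocalRandom.current().nextInt(p, r + 1), r);
        int x = a[r];
        int i = p - 1;
        for (int j = p; j < r; j++) {
            if (a[j] <= x) {
                swap(a, ++i, j);
            }
        }
        swap(a, i + 1, r);
        return i + 1;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 3, 0, 2, 3, 0, 3};
        System.out.println(select(a, 0, a.length - 1, 3)); // prints 2
    }
}
```

The array does not need to be sorted beforehand; in sorted order it would read 0 0 2 2 3 3 3 5, so the third smallest value is 2 regardless of which pivots are drawn.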

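Characteristics:

RANDOMIZED-SELECT only needs to recurse on one side.
Quicksort runs in expected O(n log n) time, whereas RANDOMIZED-SELECT runs in expected O(n) time (both degrade to Θ(n²) in the worst case). For the proof, see page 216 of "Introduction to Algorithms Third Edition".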
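As rough intuition for the gap between the two bounds (this is not the book's proof, which handles random splits through expectations), suppose every partition step happened to split its subarray exactly in half and that partitioning n elements costs cn. Quicksort must continue in both halves, while RANDOMIZED-SELECT continues in only one:

```latex
\begin{aligned}
T_{\text{sort}}(n)   &= 2\,T_{\text{sort}}(n/2) + cn = \Theta(n \log n),\\
T_{\text{select}}(n) &= T_{\text{select}}(n/2) + cn
  \le cn\left(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right) + T_{\text{select}}(1)
  \le 2cn + T_{\text{select}}(1) = O(n).
\end{aligned}
```

The geometric series is what keeps the total linear: each round touches half as many elements as the one before.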

Code

    /**
     * Quicksort and quickselect, written from the pseudocode in Introduction to Algorithms.
     * @author zy
     */
    public class RandomizedSelect {
        // One shared generator instead of a new Random per call.
        private static final java.util.Random RANDOM = new java.util.Random();

        public static void main(String[] args) {
            int[] a = {2, 5, 3, 0, 2, 3, 0, 3};
            a = quickSort(a, 0, a.length - 1);
            System.out.print("Sorted result: ");
            for (int i = 0; i < a.length; i++) {
                System.out.print(a[i] + " ");
            }
            int result = randomizedSelect(a, 0, a.length - 1, 3); // the 3rd smallest element
            System.out.print("\n" + result);
        }

        // Sorts a[p..r] in place and returns the same array.
        private static int[] quickSort(int[] a, int p, int r) {
            if (p < r) {
                int q = partition(a, p, r);
                quickSort(a, p, q - 1);
                quickSort(a, q + 1, r);
            }
            return a;
        }

        // Partitions a[p..r] around the pivot a[r]; returns the pivot's final index.
        private static int partition(int[] a, int p, int r) {
            int x = a[r];
            int i = p - 1;
            for (int j = p; j < r; j++) {
                if (a[j] <= x) {
                    i = i + 1;
                    swap(a, i, j);
                }
            }
            swap(a, i + 1, r);
            return i + 1;
        }

        // Quicksort with a random pivot; the recursive calls must sort, not just partition.
        private static int[] randomizedQuickSort(int[] a, int p, int r) {
            if (p < r) {
                int q = randomizedPartition(a, p, r);
                randomizedQuickSort(a, p, q - 1);
                randomizedQuickSort(a, q + 1, r);
            }
            return a;
        }

        private static int randomizedPartition(int[] a, int p, int r) {
            // Uniform index in [p, r]; nextInt(bound) is never negative, unlike nextInt() % bound.
            int i = p + RANDOM.nextInt(r - p + 1);
            swap(a, i, r);
            return partition(a, p, r);
        }

        /**
         * @param a the array
         * @param p index of the first element of the subarray
         * @param r index of the last element of the subarray
         * @param i rank sought (1-based): return the i-th smallest element
         * @return the i-th smallest element of a[p..r]
         */
        private static int randomizedSelect(int[] a, int p, int r, int i) {
            if (p == r) {
                return a[p]; // the subarray holds a single element
            }
            int q = randomizedPartition(a, p, r);
            int k = q - p + 1; // rank of the pivot within a[p..r]
            if (i == k) {
                return a[q];
            } else if (i < k) {
                return randomizedSelect(a, p, q - 1, i);
            } else {
                return randomizedSelect(a, q + 1, r, i - k);
            }
        }

        private static void swap(int[] a, int i, int j) {
            int temp = a[i];
            a[i] = a[j];
            a[j] = temp;
        }
    }

Output:

Sorted result: 0 0 2 2 3 3 3 5
2

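For another Java implementation, see:

《算法导论的Java实现》 10 中位数和顺序统计学 ("A Java implementation of Introduction to Algorithms", 10: Medians and order statistics)

Recommended blogs

chen09的专栏: Java code written from the book's pseudocode.
简单_快速选择算法(RANDOMIZED-SELECT): this one helped me understand the problem; what it actually explains is the principle behind the BFPRT algorithm.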

