Quicksort
- Shuffle the array.
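- Partition so that, for some j, entry a[j] is in place, no larger entry lies to the left of j, and no smaller entry lies to the right of j.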
- Sort each piece recursively.
Phase I. Repeat until i and j pointers cross.
- Scan i from left to right so long as (a[i] < a[lo])
- Scan j from right to left so long as (a[j] > a[lo])
- Exchange a[i] with a[j]
Phase II. When pointers cross.
Exchange a[lo] with a[j], because j now points to the first entry, scanning from right to left, that is not greater than the partitioning item.
private static int partition(Comparable[] a, int lo, int hi)
{
    int i = lo, j = hi+1;   // note the initial values
    while (true)
    {
        while (less(a[++i], a[lo]))
            if (i == hi) break;
        while (less(a[lo], a[--j]))
            if (j == lo) break;   // This test is redundant.
        if (i >= j) break;
        exch(a, i, j);
    }
    exch(a, lo, j);
    return j;
}

public class Quick
{
    private static int partition(Comparable[] a, int lo, int hi)
    { /* see above */ }

    public static void sort(Comparable[] a)
    {
        StdRandom.shuffle(a);
        sort(a, 0, a.length - 1);
    }

    private static void sort(Comparable[] a, int lo, int hi)
    {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j-1);
        sort(a, j+1, hi);
    }
}
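The code above relies on less(), exch() and StdRandom.shuffle(), none of which are defined in these notes. A minimal self-contained sketch, assuming simple stand-ins for those helpers (the class name QuickSketch, the generic signatures and the Fisher-Yates shuffle are illustrative choices, not part of the notes):

```java
import java.util.Random;

class QuickSketch {
    // Stand-in for less(): true when v is strictly smaller than w.
    static <T extends Comparable<T>> boolean less(T v, T w) {
        return v.compareTo(w) < 0;
    }

    // Stand-in for exch(): swap a[i] and a[j].
    static <T> void exch(T[] a, int i, int j) {
        T t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Stand-in for StdRandom.shuffle(): a Fisher-Yates shuffle (assumption).
    static <T> void shuffle(T[] a, Random rng) {
        for (int i = a.length - 1; i > 0; i--) {
            exch(a, i, rng.nextInt(i + 1));
        }
    }

    // Same two-phase partition as in the notes; requires lo < hi.
    public static <T extends Comparable<T>> int partition(T[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo])) if (i == hi) break;
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);   // put the partitioning item into its final position
        return j;
    }

    public static <T extends Comparable<T>> void sort(T[] a) {
        shuffle(a, new Random());
        sort(a, 0, a.length - 1);
    }

    private static <T extends Comparable<T>> void sort(T[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }
}
```

Calling QuickSketch.sort on an Integer[] or String[] sorts it in place; after partition returns j, every entry left of j is no larger than a[j] and every entry right of j is no smaller.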
Quicksort: implementation details
Partitioning in-place. Using an extra array makes partitioning easier
(and stable), but is not worth the cost.
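Terminating the loop. Testing whether the pointers cross is a bit trickier than it might seem: with duplicate keys both pointers can stop on the same entry, so the test is (i >= j) rather than (i > j).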
Staying in bounds. The (j == lo) test is redundant (why?), but the (i == hi) test is not.
Preserving randomness. Shuffling is needed for the performance guarantee.
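Equal keys. When duplicates are present, it is (counter-intuitively) better to stop on keys equal to the partitioning item’s key: the extra exchanges keep the two subarrays balanced, which avoids quadratic running time when many keys are equal.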
Average case. Number of compares is ~ 1.39 N lg N.
- 39% more compares than mergesort.
- But faster than mergesort in practice because of less data movement.
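Where the 1.39 comes from: a short sketch, assuming the standard result that randomized quicksort on distinct keys uses about 2N ln N compares on average, and that mergesort uses about N lg N compares. Converting the natural logarithm to base 2:

```latex
C_N \sim 2N \ln N = (2 \ln 2)\, N \lg N \approx 1.39\, N \lg N
```

Since 2 ln 2 ≈ 1.386, this matches the "39% more compares" figure.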
Caveat emptor. Many textbook implementations go quadratic if the array:
- Is sorted or reverse sorted.
- Has many duplicates (even if randomized!)
Quicksort is not stable.
Quicksort: practical improvements
Insertion sort small subarrays.
- Cutoff to insertion sort for ≈ 10 items.
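A possible shape for the cutoff, sketched on int[] for brevity. The names QuickCutoffSketch and CUTOFF are hypothetical, the threshold of 10 is taken from the bullet above, and the shuffle is omitted to keep the sketch short:

```java
class QuickCutoffSketch {
    static final int CUTOFF = 10;   // assumed threshold, per the "≈ 10 items" guideline

    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    static void sort(int[] a, int lo, int hi) {
        // Subarrays of at most CUTOFF entries go to insertion sort instead of recursing.
        if (hi <= lo + CUTOFF - 1) {
            insertionSort(a, lo, hi);
            return;
        }
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    // Insertion sort restricted to a[lo..hi].
    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            for (int j = i; j > lo && a[j] < a[j - 1]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
        }
    }

    // Same partitioning scheme as in the notes, with the pivot cached in v.
    static int partition(int[] a, int lo, int hi) {
        int v = a[lo], i = lo, j = hi + 1;
        while (true) {
            while (a[++i] < v) if (i == hi) break;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
        a[lo] = a[j]; a[j] = v;
        return j;
    }
}
```

The best threshold is machine-dependent, so CUTOFF is worth tuning rather than treating 10 as fixed.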
Median of sample.
- Best choice of pivot item = median.
- Estimate true median by taking median of sample.
Median-of-3 (random) items.
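private static void sort(Comparable[] a, int lo, int hi)
{
    if (hi <= lo) return;
    int m = medianOf3(a, lo, lo + (hi - lo)/2, hi);   // index of the median of the three sampled entries
    exch(a, lo, m);                                    // move it to a[lo] so partition() uses it as the pivot
    int j = partition(a, lo, hi);
    sort(a, lo, j-1);
    sort(a, j+1, hi);
}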
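The medianOf3() helper is called above but not defined in these notes. One way it might look, assuming it returns the index (not the value) of the median entry, which is what the exchange with a[lo] requires:

```java
class MedianOf3Sketch {
    // Returns whichever of the indices i, j, k holds the median of a[i], a[j], a[k].
    // Uses at most 3 compares.
    public static <T extends Comparable<T>> int medianOf3(T[] a, int i, int j, int k) {
        if (a[i].compareTo(a[j]) < 0) {
            if (a[j].compareTo(a[k]) < 0) return j;       // a[i] < a[j] < a[k]
            return a[i].compareTo(a[k]) < 0 ? k : i;      // a[j] is the largest: median is the larger of a[i], a[k]
        } else {
            if (a[k].compareTo(a[j]) < 0) return j;       // a[k] < a[j] <= a[i]
            return a[k].compareTo(a[i]) < 0 ? k : i;      // a[j] is the smallest: median is the smaller of a[i], a[k]
        }
    }
}
```

For a = {9, 1, 5}, medianOf3(a, 0, 1, 2) returns 2, the index of the value 5.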
Selection
Goal. Given an array of N items, find the kth smallest item.
Applications.
- Order statistics.
- Find the “top k”.
Use theory as a guide.
- Easy N log N upper bound. How?
- Easy N upper bound for k = 1, 2, 3. How?
- Easy N lower bound. Why?
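One reading of the two upper bounds, as a hedged sketch: sorting and then indexing gives N log N, and a single scan gives N for the smallest item. The class and method names are hypothetical, and k is 0-based here to match the select() code in these notes:

```java
import java.util.Arrays;

class SelectBaselines {
    // N log N upper bound: sort a copy, then read off the entry of rank k (0-based).
    public static int selectBySorting(int[] a, int k) {
        int[] copy = Arrays.copyOf(a, a.length);
        Arrays.sort(copy);
        return copy[k];
    }

    // Linear upper bound for the smallest item: one pass over the array.
    public static int min(int[] a) {
        int best = a[0];
        for (int x : a) {
            if (x < best) best = x;
        }
        return best;
    }
}
```

The lower bound is the easy direction: any correct algorithm has to look at every item, since an unexamined item could be the answer.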
Partition array so that:
- Entry a[j] is in place.
- No larger entry to the left of j.
- No smaller entry to the right of j.
Repeat in one subarray, depending on j; finished when j equals k.
Quick-select takes linear time on average.
public static Comparable select(Comparable[] a, int k)
{
    StdRandom.shuffle(a);
    int lo = 0, hi = a.length - 1;
    while (hi > lo)
    {
        int j = partition(a, lo, hi);
        if      (j < k) lo = j + 1;
        else if (j > k) hi = j - 1;
        else            return a[k];
    }
    return a[k];
}
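For experimenting with quick-select outside the course library, here is a self-contained sketch on int[]. The class name, the range check on k, and the hand-written shuffle are assumptions added for illustration; the loop itself follows the code above:

```java
import java.util.Random;

class QuickSelectSketch {
    private static final Random RNG = new Random();

    // Returns the entry of rank k (0-based: k = 0 is the minimum). Rearranges a in place.
    public static int select(int[] a, int k) {
        if (k < 0 || k >= a.length) throw new IllegalArgumentException("k out of range");
        shuffle(a);
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int j = partition(a, lo, hi);
            if      (j < k) lo = j + 1;   // answer lies in the right subarray
            else if (j > k) hi = j - 1;   // answer lies in the left subarray
            else            return a[k];  // a[j] is in place and j == k
        }
        return a[k];
    }

    // Fisher-Yates shuffle, standing in for StdRandom.shuffle().
    private static void shuffle(int[] a) {
        for (int i = a.length - 1; i > 0; i--) swap(a, i, RNG.nextInt(i + 1));
    }

    private static int partition(int[] a, int lo, int hi) {
        int v = a[lo], i = lo, j = hi + 1;
        while (true) {
            while (a[++i] < v) if (i == hi) break;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            swap(a, i, j);
        }
        swap(a, lo, j);
        return j;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {7, 2, 9, 4, 1, 8};
        System.out.println(select(a, 3));   // prints 7, the entry at index 3 of sorted {1, 2, 4, 7, 8, 9}
    }
}
```

Unlike quicksort, only one side of each partition is revisited, which is where the linear average cost comes from.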
Duplicate keys
Algorithm goes quadratic unless partitioning stops on equal keys!
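A small experiment can make this visible. The sketch below counts compares on an array whose keys are all equal, once with the strict (<) left scan used in these notes and once with a hypothetical variant whose left scan skips over equal keys (<=). The class name and the counting scheme are illustrative assumptions:

```java
class EqualKeysDemo {
    static long compares;   // compares made during the current run

    static boolean less(int x, int y)        { compares++; return x < y; }
    static boolean lessOrEqual(int x, int y) { compares++; return x <= y; }

    static int partition(int[] a, int lo, int hi, boolean stopOnEqual) {
        int v = a[lo], i = lo, j = hi + 1;
        while (true) {
            // Left scan: strict < stops on keys equal to v; <= skips over them.
            while (stopOnEqual ? less(a[++i], v) : lessOrEqual(a[++i], v))
                if (i == hi) break;
            while (less(v, a[--j]))
                if (j == lo) break;
            if (i >= j) break;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
        a[lo] = a[j]; a[j] = v;
        return j;
    }

    static void sort(int[] a, int lo, int hi, boolean stopOnEqual) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi, stopOnEqual);
        sort(a, lo, j - 1, stopOnEqual);
        sort(a, j + 1, hi, stopOnEqual);
    }

    // Sorts n equal keys and reports how many compares were used.
    public static long countCompares(int n, boolean stopOnEqual) {
        int[] a = new int[n];   // all zeros: every key is equal
        compares = 0;
        sort(a, 0, n - 1, stopOnEqual);
        return compares;
    }
}
```

With equal keys, the skipping variant always returns j == hi, so each call peels off one entry and the total grows roughly as N²/2, while the stop-on-equal version splits near the middle and stays near N lg N.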
3-way partitioning
Goal. Partition array into 3 parts so that:
- Entries between lt and gt equal to partition item v.
- No larger entries to left of lt.
- No smaller entries to right of gt.
Dutch national flag problem.
Let v be partitioning item a[lo].
Scan i from left to right.
- (a[i] < v): exchange a[lt] with a[i]; increment both lt and i
- (a[i] > v): exchange a[gt] with a[i]; decrement gt
- (a[i] == v): increment i
Dijkstra 3-way partitioning demo
private static void sort(Comparable[] a, int lo, int hi)
{
    if (hi <= lo) return;
    int lt = lo, gt = hi;
    Comparable v = a[lo];
    int i = lo;
    while (i <= gt)
    {
        int cmp = a[i].compareTo(v);
        if      (cmp < 0) exch(a, lt++, i++);
        else if (cmp > 0) exch(a, i, gt--);
        else              i++;
    }
    sort(a, lo, lt - 1);
    sort(a, gt + 1, hi);
}
Bottom line. Randomized quicksort with 3-way partitioning reduces running time from linearithmic to linear in a broad class of applications.
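To see a single 3-way pass in isolation, the sketch below splits the partitioning step out of the recursive sort and returns the final (lt, gt) pair. It works on int[] and skips the shuffle for brevity; the names Quick3WaySketch and partition3 are hypothetical:

```java
class Quick3WaySketch {
    // One Dijkstra 3-way pass over a[lo..hi] around v = a[lo]; returns {lt, gt}.
    // Afterwards a[lo..lt-1] < v, a[lt..gt] == v, and a[gt+1..hi] > v.
    public static int[] partition3(int[] a, int lo, int hi) {
        int lt = lo, gt = hi, i = lo;
        int v = a[lo];
        while (i <= gt) {
            if      (a[i] < v) swap(a, lt++, i++);   // grow the "less than" region
            else if (a[i] > v) swap(a, i, gt--);     // grow the "greater than" region; a[i] is still unexamined
            else               i++;                  // equal to v: leave it in the middle
        }
        return new int[] {lt, gt};
    }

    public static void sort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        int[] bounds = partition3(a, lo, hi);
        sort(a, lo, bounds[0] - 1);
        sort(a, bounds[1] + 1, hi);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

On {5, 3, 5, 9, 1, 5, 7}, one pass leaves {3, 1, 5, 5, 5, 7, 9} with lt = 2 and gt = 4, so the three 5s never take part in a recursive call.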
System Sorts
Basic algorithm = quicksort.
- Cutoff to insertion sort for small subarrays.
- Partitioning scheme: Bentley-McIlroy 3-way partitioning.
- Partitioning item.
  - small arrays: middle entry
  - medium arrays: median of 3
  - large arrays: Tukey’s ninther
Tukey’s ninther. Median of the medians of 3 samples, each of 3 entries.
Better partitioning than random shuffle and less costly.
- Approximates the median of 9.
- Uses at most 12 compares.
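A sketch of how the ninther might be computed. The choice of nine roughly evenly spaced sample positions is an assumption made for illustration (the notes do not say which entries are sampled), and the names NintherSketch, median3 and ninther are hypothetical:

```java
class NintherSketch {
    // Index of the median of a[i], a[j], a[k]; at most 3 compares.
    static int median3(int[] a, int i, int j, int k) {
        if (a[i] < a[j]) {
            if (a[j] < a[k]) return j;
            return a[i] < a[k] ? k : i;
        } else {
            if (a[k] < a[j]) return j;
            return a[k] < a[i] ? k : i;
        }
    }

    // Index of the ninther of a[lo..hi]; assumes the subarray has at least 9 entries.
    public static int ninther(int[] a, int lo, int hi) {
        int n = hi - lo + 1;
        int eps = n / 8;          // spacing between sampled positions (assumed scheme)
        int mid = lo + n / 2;
        int m1 = median3(a, lo, lo + eps, lo + 2 * eps);      // median of the left sample
        int m2 = median3(a, mid - eps, mid, mid + eps);       // median of the middle sample
        int m3 = median3(a, hi - 2 * eps, hi - eps, hi);      // median of the right sample
        return median3(a, m1, m2, m3);                        // median of the three medians
    }
}
```

Four calls to median3 at 3 compares each give the bound of 12 compares. For {4, 9, 2, 7, 5, 1, 8, 3, 6} the three sample medians are 4, 5 and 6, so ninther returns index 4 (the value 5).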