Sorting Introduction
Goal. Sort any type of data.
Callback = reference to executable code.
- Client passes array of objects to sort() function.
- The sort() function calls back object’s compareTo() method as needed.
Implementing callbacks.
- Java: interfaces.
- C: function pointers.
- C++: class-type functors.
public interface Comparable<Item> {
    public int compareTo(Item that);
}

public class File implements Comparable<File> {
    public int compareTo(File b) { ... }  // returns -1, 0, or +1
}

public static void sort(Comparable[] a);
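The callback mechanism above can be sketched as follows. This is a minimal illustration, not the course's implementation: it uses Java's built-in java.lang.Comparable, and the Version class and CallbackSortSketch.sort() are hypothetical names introduced only for this example.

```java
// Hypothetical client type: orders versions by major number, then minor number.
final class Version implements Comparable<Version> {
    final int major;
    final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // The callback: sort() invokes this whenever it needs to compare two objects.
    @Override
    public int compareTo(Version that) {
        if (this.major != that.major) return Integer.compare(this.major, that.major);
        return Integer.compare(this.minor, that.minor);
    }
}

final class CallbackSortSketch {
    // Sorts any Comparable type. The sort never inspects the objects itself;
    // it only calls back their compareTo() method.
    static <T extends Comparable<T>> void sort(T[] a) {
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && a[j].compareTo(a[j - 1]) < 0; j--) {
                T tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
            }
        }
    }
}
```

The same sort() works unchanged on String[] or any other type that implements Comparable, which is the point of the callback design.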
Selection Sort
In iteration i, find index min of smallest remaining entry.
Swap a[i] and a[min].
Running time insensitive to input. Quadratic time, even if input is sorted.
Data movement is minimal. Linear number of exchanges.
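A minimal sketch of selection sort as described above. The class name SelectionSortSketch is an assumption for illustration; the method returns the exchange count so the "linear number of exchanges" claim can be observed directly.

```java
final class SelectionSortSketch {
    // Sorts a in place and returns the number of exchanges performed.
    // One exchange happens per iteration, so the result is always a.length.
    static <T extends Comparable<T>> int sort(T[] a) {
        int exchanges = 0;
        for (int i = 0; i < a.length; i++) {
            // Find the index of the smallest remaining entry.
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j].compareTo(a[min]) < 0) min = j;
            }
            // Swap a[i] and a[min] (possibly a self-swap when min == i).
            T tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
            exchanges++;
        }
        return exchanges;
    }
}
```

The inner loop always scans the whole remaining suffix, which is why the running time stays quadratic even on sorted input.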
Insertion Sort
In iteration i, swap a[i] with each larger entry to its left.
An inversion is a pair of keys that are out of order.
An array is partially sorted if the number of inversions is at most cN for some constant c.
For partially-sorted arrays, insertion sort runs in linear time.
Number of exchanges equals the number of inversions.
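A minimal sketch of insertion sort that also counts exchanges, so the link to inversions can be checked. The class name InsertionSortSketch and the brute-force inversions() helper are assumptions added for illustration.

```java
final class InsertionSortSketch {
    // Sorts a in place and returns the number of exchanges performed.
    static <T extends Comparable<T>> long sort(T[] a) {
        long exchanges = 0;
        for (int i = 1; i < a.length; i++) {
            // Swap a[j] with each larger entry to its left.
            for (int j = i; j > 0 && a[j].compareTo(a[j - 1]) < 0; j--) {
                T tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                exchanges++;
            }
        }
        return exchanges;
    }

    // Counts pairs (i, j) with i < j and a[i] > a[j] by brute force (quadratic).
    static <T extends Comparable<T>> long inversions(T[] a) {
        long count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i].compareTo(a[j]) > 0) count++;
            }
        }
        return count;
    }
}
```

Each exchange removes exactly one inversion, so calling inversions() before sort() gives the same number that sort() returns.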
Shell Sort
Idea. Move entries more than one position at a time by h-sorting the array.
Insertion sort, with stride length h.
Which increment sequence to use?
3x+1 increments: 1, 4, 13, 40, ...
Sedgewick. Tough to beat in empirical studies. 1, 5, 19, 41, 109, 209, ...
Obtained by merging the sequences $9 \cdot 4^i - 9 \cdot 2^i + 1$ and $4^i - 3 \cdot 2^i + 1$.
Analysis. An accurate model has not yet been discovered.
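A minimal sketch of shell sort using the 3x+1 increments. The class name ShellSortSketch is an assumption for illustration; each pass is insertion sort with stride h.

```java
final class ShellSortSketch {
    static <T extends Comparable<T>> void sort(T[] a) {
        int n = a.length;

        // Largest 3x+1 increment to start from: 1, 4, 13, 40, ...
        int h = 1;
        while (h < n / 3) h = 3 * h + 1;

        while (h >= 1) {
            // h-sort the array: insertion sort with stride length h.
            for (int i = h; i < n; i++) {
                for (int j = i; j >= h && a[j].compareTo(a[j - h]) < 0; j -= h) {
                    T tmp = a[j];
                    a[j] = a[j - h];
                    a[j - h] = tmp;
                }
            }
            h /= 3;  // move to the next smaller increment; the last pass uses h = 1
        }
    }
}
```

The final pass with h = 1 is plain insertion sort, so correctness does not depend on the increment sequence; only the running time does.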
Why are we interested in shell sort?
Example of simple idea leading to substantial performance gains.
Useful in practice.
- Fast unless array size is huge.
- Tiny, fixed footprint for code (used in embedded systems).
- Hardware sort prototype.
Shuffling
- Generate a random real number for each array entry.
- Sort the array.
Proposition. Shuffle sort produces a uniformly random permutation of the input array, provided no two entries receive the same random value.
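A minimal sketch of shuffle sort, assuming java.util.Random as the source of random reals. The class name ShuffleSortSketch and the index-array approach are illustrative choices, not part of the original notes.

```java
import java.util.Arrays;
import java.util.Random;

final class ShuffleSortSketch {
    // Rearranges a by attaching a random real key to each entry and sorting on the keys.
    static <T> void shuffle(T[] a, Random rng) {
        int n = a.length;
        double[] keys = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            keys[i] = rng.nextDouble();  // random key for entry i
            order[i] = i;
        }

        // Sort the indices by their random keys.
        Arrays.sort(order, (x, y) -> Double.compare(keys[x], keys[y]));

        // Place the entries in key order.
        T[] copy = a.clone();
        for (int i = 0; i < n; i++) a[i] = copy[order[i]];
    }
}
```

This costs a full sort per shuffle, which is the motivation for looking at a linear-time alternative.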
Knuth shuffle
- In iteration i, pick integer r between 0 and i uniformly at random.
- Swap a[i] and a[r].
Proposition. [Fisher-Yates 1938] Knuth shuffling algorithm produces a uniformly random permutation of the input array in linear time.
public class StdRandom
{
    ...
    // Rearranges the array in uniformly random order (Knuth shuffle).
    public static void shuffle(Object[] a)
    {
        int N = a.length;
        for (int i = 0; i < N; i++)
        {
            int r = StdRandom.uniform(i + 1);  // uniform in [0, i]
            exch(a, i, r);
        }
    }
}
Online Poker
for i := 1 to 52 do begin
    r := random(51) + 1;
    swap := card[r];
    card[r] := card[i];
    card[i] := swap;
end;
- Bug 1. Random number r never 52 ⇒ 52nd card can’t end up in 52nd place.
- Bug 2. Shuffle not uniform (r should be between 1 and i).
- Bug 3. random() uses 32-bit seed ⇒ $2^{32}$ possible shuffles.
- Bug 4. Seed = milliseconds since midnight ⇒ 86.4 million shuffles.
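The non-uniformity in Bug 2 can be checked exhaustively on a tiny array. The sketch below is an illustration under stated assumptions: it uses 0-based indices, a 3-element array instead of 52 cards, and models only the "range of r does not depend on i" mistake, not the exact poker code. The class name ShuffleBiasSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class ShuffleBiasSketch {
    // Enumerates every equally likely sequence of random choices for an
    // n-element shuffle and counts how often each final permutation occurs.
    //   knuth = true : iteration i picks r from 0..i    (i + 1 choices)
    //   knuth = false: iteration i picks r from 0..n-1  (n choices, biased)
    static Map<String, Integer> outcomes(int n, boolean knuth) {
        Map<String, Integer> counts = new HashMap<>();
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        enumerate(a, 0, knuth, counts);
        return counts;
    }

    private static void enumerate(int[] a, int i, boolean knuth, Map<String, Integer> counts) {
        if (i == a.length) {
            counts.merge(Arrays.toString(a), 1, Integer::sum);
            return;
        }
        int choices = knuth ? i + 1 : a.length;
        for (int r = 0; r < choices; r++) {
            swap(a, i, r);
            enumerate(a, i + 1, knuth, counts);
            swap(a, i, r);  // undo the swap before trying the next choice
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

For n = 3 the Knuth variant has 1 · 2 · 3 = 6 equally likely paths, one per permutation. The biased variant has 3 · 3 · 3 = 27 equally likely paths, and 27 is not divisible by 6, so some permutations must occur more often than others.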
Best practices for shuffling (if your business depends on it).
- Use a hardware random-number generator that has passed both the FIPS 140-2 and the NIST statistical test suites.
- Continuously monitor statistical properties: hardware random-number generators are fragile and fail silently.
- Use an unbiased shuffling algorithm.
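As a rough software-only illustration of the last point, the sketch below pairs the standard library's Collections.shuffle with java.security.SecureRandom. This is an assumption-laden example: SecureRandom is a cryptographically strong generator provided by the platform, not a certified hardware generator, so it does not by itself satisfy the first two recommendations. The class name SecureShuffleSketch is hypothetical.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SecureShuffleSketch {
    // Returns the cards 0..51 in shuffled order.
    static List<Integer> shuffledDeck() {
        List<Integer> deck = new ArrayList<>();
        for (int card = 0; card < 52; card++) deck.add(card);

        // Collections.shuffle draws its randomness from the supplied generator.
        // SecureRandom is not seeded from a small, guessable value such as
        // the time of day.
        Collections.shuffle(deck, new SecureRandom());
        return deck;
    }
}
```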
Convex Hull
The convex hull of a set of N points is the smallest perimeter fence enclosing the points.
Graham scan
- Choose point p with smallest y-coordinate.
- Sort points by polar angle with respect to p.
- Consider points in order; discard a point unless it creates a counterclockwise turn.
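The three steps above can be sketched as follows. This is a simplified illustration, not a robust implementation: it assumes at least three distinct points with no three collinear, represents each point as a long[] {x, y}, and uses the hypothetical class name GrahamScanSketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GrahamScanSketch {
    // Twice the signed area of triangle a-b-c:
    // > 0 for a counterclockwise turn, < 0 for clockwise, 0 if collinear.
    static long ccw(long[] a, long[] b, long[] c) {
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
    }

    // Returns the hull vertices in counterclockwise order, starting at p.
    static List<long[]> hull(long[][] input) {
        long[][] pts = input.clone();  // shallow copy; the caller's order is untouched

        // Step 1: choose point p with smallest y-coordinate (ties: smallest x).
        int lowest = 0;
        for (int i = 1; i < pts.length; i++) {
            if (pts[i][1] < pts[lowest][1]
                    || (pts[i][1] == pts[lowest][1] && pts[i][0] < pts[lowest][0])) {
                lowest = i;
            }
        }
        long[] p = pts[lowest];
        pts[lowest] = pts[0];
        pts[0] = p;

        // Step 2: sort the other points by polar angle with respect to p.
        // q1 precedes q2 when p -> q1 -> q2 is a counterclockwise turn.
        Arrays.sort(pts, 1, pts.length, (q1, q2) -> Long.compare(0L, ccw(p, q1, q2)));

        // Step 3: consider points in order; discard the most recent point
        // while it fails to create a counterclockwise turn.
        List<long[]> hull = new ArrayList<>();
        for (long[] q : pts) {
            while (hull.size() >= 2
                    && ccw(hull.get(hull.size() - 2), hull.get(hull.size() - 1), q) <= 0) {
                hull.remove(hull.size() - 1);
            }
            hull.add(q);
        }
        return hull;
    }
}
```

Sorting dominates the cost; the scan itself is linear because each point is added once and removed at most once.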