Background: sorting algorithms are usually discussed in two settings.
1. Internal sorting: the data set fits in main memory, and any element can be read or written at
any time.
2. External sorting: the data set is too large to fit in main memory and is usually stored on disk
or tape, so only part of it can be accessed at a time.
In this article, we focus on internal sorting algorithms, which are important in many coding
contests. One tip:
A stable algorithm preserves the original relative order of elements whose keys are equal.
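To make stability concrete, here is a minimal C++ sketch. The Record type and the sortByKeyStable name are assumptions made up for this illustration; std::stable_sort is the standard library's stable sort.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type for illustration: (key, label).
using Record = std::pair<int, std::string>;

// Sort by key only. std::stable_sort keeps records with equal keys
// in the same relative order they had in the input.
std::vector<Record> sortByKeyStable(std::vector<Record> records) {
    std::stable_sort(records.begin(), records.end(),
                     [](const Record& a, const Record& b) {
                         return a.first < b.first;
                     });
    return records;
}
```

For the input (2,a), (1,b), (2,c), (1,d), the result is (1,b), (1,d), (2,a), (2,c): within each key, the labels keep their input order.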
1. Complexity in O(n^2)
Bubble sort
Make repeated passes through the whole list until it is sorted. In each pass, swap adjacent elements that are out of order.
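void bubble(int[] list) {
    boolean done = false;
    while (!done) {
        done = true;  // assume sorted until a swap proves otherwise
        for (int i = 1; i < list.length; i++) {
            if (list[i-1] > list[i]) {
                done = false;
                swap(list, i-1, i);
            }
        }
    }
}

// Helper used by the sorts in this section: exchange list[a] and list[b].
void swap(int[] list, int a, int b) {
    int t = list[a];
    list[a] = list[b];
    list[b] = t;
}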
Selection sort
Move through the list, find the smallest remaining element, and put it at the front of the unsorted part.
void selection(int[] list) {
    int i, j, min;
    for (i = 0; i < list.length - 1; i++) {
        min = i;
        // Search only the unsorted part, which starts right after position i.
        for (j = i + 1; j < list.length; j++) {
            if (list[min] > list[j]) {
                min = j;
            }
        }
        if (min != i) {
            swap(list, min, i);
        }
    }
}
Selection sort is not much better than bubble sort; it only reduces the number of swaps, to at most n-1.
Insertion sort
Take each new entry in turn and insert it at the correct position among the previous, already sorted entries.
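void insertion(int[] list) {
    int i, j, temp;
    for (i = 1; i < list.length; i++) {
        temp = list[i];
        // && short-circuits, so list[j-1] is never read when j == 0.
        for (j = i; j > 0 && list[j-1] > temp; j--) {
            list[j] = list[j-1];  // shift larger entries one step right
        }
        list[j] = temp;
    }
}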
This algorithm is very fast when the list is already sorted, because the inner loop then does no shifting, and when the list is short, because the constant hidden in the big theta is small.
All three algorithms run in O(n^2) time in the worst case and use constant extra memory.
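The difference between sorted and unsorted input can be seen by counting key comparisons. The following C++ sketch is the same insertion sort with a counter added; the function name and the counter are assumptions introduced only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort that also returns the number of key comparisons made.
// The counter exists only for illustration.
long long insertionSortCountingComparisons(std::vector<int>& list) {
    long long comparisons = 0;
    for (std::size_t i = 1; i < list.size(); i++) {
        int temp = list[i];
        std::size_t j = i;
        while (j > 0) {
            comparisons++;
            if (list[j - 1] <= temp) break;  // correct position found
            list[j] = list[j - 1];           // shift the larger entry right
            j--;
        }
        list[j] = temp;
    }
    return comparisons;
}
```

On the sorted list 1 2 3 4 5 this makes 4 comparisons (n-1), while on the reversed list 5 4 3 2 1 it makes 10 (n(n-1)/2).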
2. Complexity in O(nlogn)
Merge sort
Merge sort is a divide-and-conquer algorithm.
Base case: if n <= 1, return.
Otherwise:
- merge sort the left half of the list;
- merge sort the right half of the list;
- combine the two sorted halves.
The combining step takes linear time but, for arrays, needs Theta(n) extra memory.
Merge sort is the basis of many external sorting algorithms, where intermediate results or many files need to be combined.
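For arrays, the recursion above can be sketched in C++ as follows. This is a minimal sketch rather than a tuned implementation; the function names and the single shared scratch buffer are assumptions chosen for this example.

```cpp
#include <cstddef>
#include <vector>

// Merge the sorted halves [lo, mid) and [mid, hi) of a, using buf as scratch.
void mergeHalves(std::vector<int>& a, std::vector<int>& buf,
                 std::size_t lo, std::size_t mid, std::size_t hi) {
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        // On ties take from the left half, which keeps the sort stable.
        buf[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    }
    while (i < mid) buf[k++] = a[i++];
    while (j < hi) buf[k++] = a[j++];
    for (k = lo; k < hi; k++) a[k] = buf[k];  // copy the merged run back
}

// Sort the half-open range [lo, hi) of a.
void mergeSortRange(std::vector<int>& a, std::vector<int>& buf,
                    std::size_t lo, std::size_t hi) {
    if (hi - lo <= 1) return;  // base case: zero or one element
    std::size_t mid = lo + (hi - lo) / 2;
    mergeSortRange(a, buf, lo, mid);
    mergeSortRange(a, buf, mid, hi);
    mergeHalves(a, buf, lo, mid, hi);
}

void mergeSortArray(std::vector<int>& a) {
    std::vector<int> buf(a.size());  // the Theta(n) extra memory
    mergeSortRange(a, buf, 0, a.size());
}
```

Allocating the buffer once in mergeSortArray, instead of inside every merge, keeps the extra memory at a single block of n elements.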
Merge sort on a singly linked list without a dummy head node:
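// Assumes Node has the fields data and pNext and a constructor Node(int).
Node* mergesortList(Node* head) {
    int n = 0, count = 1;
    Node *p, *q, *temp, *temphead, *pprev;
    // Count the nodes.
    for (p = head; p != NULL; p = p->pNext) {
        n++;
    }
    // An empty list or a single node is already sorted.
    if (n <= 1) {
        return head;
    }
    // Walk q to the last node of the left half.
    for (p = head, q = head; count < n/2; ) {
        q = q->pNext;
        count++;
    }
    // Cut the list in two and sort each half.
    temp = q->pNext;
    q->pNext = NULL;
    q = temp;
    p = mergesortList(p);
    q = mergesortList(q);
    // Merge by splicing nodes of q into p behind a temporary head node.
    temphead = new Node(0);
    temphead->pNext = p;
    pprev = temphead;
    while (p != NULL && q != NULL) {
        if (p->data > q->data) {
            // Insert the current q node in front of p.
            temp = q->pNext;
            q->pNext = p;
            pprev->pNext = q;
            pprev = q;
            q = temp;
        }
        else {
            pprev = p;
            p = p->pNext;
        }
    }
    // Left half exhausted: append whatever remains of the right half.
    if (p == NULL) {
        pprev->pNext = q;
    }
    // Free the temporary head node before returning the merged list.
    Node* sorted = temphead->pNext;
    delete temphead;
    return sorted;
}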
Quick sort
Quick sort is also a divide-and-conquer algorithm. Its average running time is O(nlogn), but the worst case is O(n^2). To make the worst case unlikely, we need to pick the pivot carefully. Choosing the pivot at random or computing the true median of the list is a relatively expensive operation. Our recommendation is to pick the median of the first, middle and last elements, which is practical.
However, for a short list, as we said before, insertion sort is more effective. Our solution is to use quick sort for large lists but switch to insertion sort when the size falls below a cutoff value.
The cutoff value is usually 10 to 20. Using a cutoff also removes the problem of taking the median of three on a sublist with fewer than three elements.
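The cutoff idea can be sketched in C++ as follows. This is one possible arrangement under stated assumptions, not a definitive implementation: the names are made up for this example, CUTOFF = 16 is simply a value picked from the 10 to 20 range, and the partition loop stops on keys equal to the pivot.

```cpp
#include <utility>
#include <vector>

const int CUTOFF = 16;  // assumed value, inside the 10 to 20 range

// Insertion sort on the closed range [begin, end].
void insertionSortRange(std::vector<int>& nums, int begin, int end) {
    for (int i = begin + 1; i <= end; i++) {
        int temp = nums[i];
        int j = i;
        for (; j > begin && nums[j - 1] > temp; j--) nums[j] = nums[j - 1];
        nums[j] = temp;
    }
}

// Quick sort on [begin, end] that hands short ranges to insertion sort.
void hybridQuickSort(std::vector<int>& nums, int begin, int end) {
    if (end - begin + 1 < CUTOFF) {
        insertionSortRange(nums, begin, end);
        return;
    }
    // Median of three: the range has at least CUTOFF elements here,
    // so first, middle and last are three distinct positions.
    int mid = begin + (end - begin) / 2;
    if (nums[mid] < nums[begin]) std::swap(nums[mid], nums[begin]);
    if (nums[end] < nums[begin]) std::swap(nums[end], nums[begin]);
    if (nums[mid] < nums[end]) std::swap(nums[mid], nums[end]);
    int pivot = nums[end];  // the median now sits in the last position

    int i = begin, j = end - 1;
    while (true) {
        while (i < end && nums[i] < pivot) i++;
        while (j > begin && nums[j] > pivot) j--;
        if (i >= j) break;
        std::swap(nums[i++], nums[j--]);
    }
    std::swap(nums[i], nums[end]);  // pivot goes to its final position
    hybridQuickSort(nums, begin, i - 1);
    hybridQuickSort(nums, i + 1, end);
}
```

A call such as hybridQuickSort(v, 0, (int)v.size() - 1) sorts the whole vector v; an empty vector is handled by the cutoff branch, which then does nothing.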
Basic algorithm:
1. Select the pivot p and swap it into the last position of the list. Set i = 0, j = n-2.
2. Scan from the left with index i until an element q with q > p is found.
3. Scan from the right with index j until an element r with r <= p is found.
4. If i >= j, swap p into position i and divide the original list into two sublists: one from 0 to j, the other from i+1 to n-1.
5. If i < j, swap the values at i and j, and repeat steps 2 and 3 until the condition in step 4 is satisfied.
Quick sort on an array:
#include <utility>
#include <vector>
using namespace std;

// Sorts nums[begin..end] (both inclusive) in place and returns the same vector.
vector<int>& quicksort(vector<int>& nums, int begin, int end) {
    int n = end - begin + 1;
    if (n <= 1) return nums;
    if (n == 2) {
        if (nums[begin] > nums[end]) swap(nums[begin], nums[end]);
        return nums;
    }
    // Median of three: move the median of first, middle and last to nums[end].
    // The middle index is relative to begin, not to the start of the vector.
    int mid = begin + (end - begin) / 2;
    if (nums[mid] < nums[begin]) swap(nums[mid], nums[begin]);
    if (nums[end] < nums[begin]) swap(nums[end], nums[begin]);
    if (nums[mid] < nums[end]) swap(nums[mid], nums[end]);
    int pivot = nums[end];
    int i = begin, j = end - 1;
    while (true) {
        // Scan from the left for an element greater than the pivot.
        while (i < end && nums[i] <= pivot) i++;
        // Scan from the right for an element not greater than the pivot.
        while (j >= begin && nums[j] > pivot) j--;
        if (i >= j) break;
        swap(nums[i], nums[j]);
    }
    // Put the pivot into its final position i; at this point j == i - 1.
    swap(nums[i], nums[end]);
    quicksort(nums, begin, j);
    quicksort(nums, i + 1, end);
    return nums;
}
In this quick sort code sample, I pick the median of the first, middle and last elements of the array as the pivot, in order to avoid a time limit exceeded verdict. I also recurse on the same array, passing the begin and end positions as parameters, in order to save memory. Both points are important; without them the solution is likely to fail on large inputs.