Introduction to Algorithms -- Parts 1 and 2
- Part 1: Foundations
- Part 2: Sorting and Order Statistics
Part 1: Foundations
Loop invariants (循环不变式): a tool for proving that an algorithm is correct.
Chapter 2: Getting Started
2.1 Insertion sort
2.2 Analyzing algorithms
- worst-case running time: Θ(n^2)
2.3 Designing algorithms
- Insertion sort is an example of the incremental method (增量方法); complexity: Θ(n^2)
- Merge sort is an example of divide-and-conquer (分治法); complexity: Θ(n lg n)
A trick when sorting cards this way: place a sentinel card (哨兵卡) at the bottom of each pile, so the merge need not check on every comparison whether a pile is empty.
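The MERGE procedure of merge sort uses exactly this sentinel idea; a C++ sketch (my own illustration, with 0-based inclusive indices and INT_MAX standing in for the ∞ sentinel card):

```cpp
#include <vector>
#include <climits>
using namespace std;

// Merge the sorted ranges a[p..q] and a[q+1..r] (0-based, inclusive).
// INT_MAX plays the role of the sentinel card: once a pile is exhausted,
// its sentinel loses every comparison, so we never test for emptiness.
void mergeWithSentinels(vector<int>& a, int p, int q, int r)
{
    vector<int> left(a.begin() + p, a.begin() + q + 1);
    vector<int> right(a.begin() + q + 1, a.begin() + r + 1);
    left.push_back(INT_MAX);   // sentinel
    right.push_back(INT_MAX);  // sentinel
    int i = 0, j = 0;
    for (int k = p; k <= r; ++k)
        a[k] = (left[i] <= right[j]) ? left[i++] : right[j++];
}
```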
Chapter 3: Growth of Functions
3.1 Asymptotic notation (渐近记号)
- Θ: asymptotically tight bound (渐近紧确界), pronounced "Theta"
- O: asymptotic upper bound (渐近上界), pronounced "big-O"
- Ω: asymptotic lower bound (渐近下界), pronounced "Omega"
Chapter 4: Divide-and-Conquer
4.1 The maximum-subarray problem
- using divide-and-conquer: Θ(n lg n)
- the problem can in fact be solved in only Θ(n) time (the book leaves this as an exercise)
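The Θ(n) approach keeps the best sum of a subarray ending at the current index; a sketch (the standard linear scan, assuming a nonempty array and that at least one element must be taken):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Linear-time maximum subarray sum. endingHere is the best sum of a
// subarray that ends exactly at index i; the global best is their maximum.
int maxSubarraySum(const vector<int>& a)
{
    int best = a[0], endingHere = a[0];
    for (size_t i = 1; i < a.size(); ++i)
    {
        // either extend the previous subarray or start fresh at a[i]
        endingHere = max(a[i], endingHere + a[i]);
        best = max(best, endingHere);
    }
    return best;
}
```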
4.2 Strassen's algorithm for matrix multiplication
- In practice, sparse matrices (稀疏矩阵) have their own specialized algorithms.
- In practice, fast multiplication routines for dense matrices use Strassen's algorithm only while the matrix size exceeds a "crossover" value; once the subproblem size falls below that value, they switch to the simpler direct method.
4.3 Solving recurrences (求解递归式)
- Substitution method (代入法): guess a bound, then prove it by mathematical induction (数学归纳法)
- Recursion-tree method (递归树法): convert the recurrence into a recursion tree, then sum the costs
- Master theorem (主定理): provides a cookbook solution for recurrences of the form
T(n) = aT(n/b) + f(n)
where a ≥ 1 and b > 1 are constants, f(n) is an asymptotically positive function, and T(n) is defined on the nonnegative integers. Then:
1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
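Quick sanity checks of the three cases, on standard textbook examples:

```latex
\text{Case 2 (merge sort): } T(n) = 2T(n/2) + \Theta(n),\ a = b = 2,\ n^{\log_b a} = n
\;\Rightarrow\; T(n) = \Theta(n \lg n).

\text{Case 1: } T(n) = 8T(n/2) + n^2,\ n^{\log_2 8} = n^3,\ n^2 = O(n^{3-\epsilon})
\text{ with } \epsilon = 1 \;\Rightarrow\; T(n) = \Theta(n^3).

\text{Case 3: } T(n) = 2T(n/2) + n^2,\ n^2 = \Omega(n^{1+\epsilon}) \text{ with } \epsilon = 1,
\text{ and the regularity condition } 2(n/2)^2 = \tfrac{1}{2} n^2 \le c\,n^2 \text{ holds for } c = \tfrac{1}{2}
\;\Rightarrow\; T(n) = \Theta(n^2).
```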
Chapter 5: Probabilistic Analysis and Randomized Algorithms
5.2 Indicator Random Variable(指示器随机变量)
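The key fact (Lemma 5.1 in the book) is that the expectation of an indicator random variable equals the probability of its event; the expected number of heads in n fair coin flips then follows immediately by linearity of expectation:

```latex
X_A = I\{A\} =
\begin{cases}
  1 & \text{if } A \text{ occurs},\\
  0 & \text{otherwise},
\end{cases}
\qquad
\mathrm{E}[X_A] = \Pr\{A\}.

\text{Heads in } n \text{ flips: } X = \sum_{i=1}^{n} X_i,\quad
\mathrm{E}[X] = \sum_{i=1}^{n} \mathrm{E}[X_i] = \sum_{i=1}^{n} \tfrac{1}{2} = \tfrac{n}{2}.
```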
Part 2: Sorting and Order Statistics
Chapter 6: Heapsort(堆排序)
6.1 Heap
As Figure 6-1 shows, a (binary) heap is an array that can be viewed as a nearly complete binary tree. The root of the tree is A[1], so given an element's index i, we can compute the indices of its parent, left child, and right child:
PARENT(i)
1 return floor(i / 2)
LEFT(i)
1 return 2i
RIGHT(i)
1 return 2i + 1
A binary heap comes in two forms: max-heap (最大堆) and min-heap (最小堆).
- max-heap: every node i other than the root satisfies A[PARENT(i)] ≥ A[i]
- min-heap: every node i other than the root satisfies A[PARENT(i)] ≤ A[i]
The common heap operations are:
- MAX-HEAPIFY(A, i): O(lg n); the key to maintaining the max-heap property.
- BUILD-MAX-HEAP(A): O(n); builds a max-heap from an unordered array A.
- HEAPSORT(A): O(n lg n); sorts array A in place.
- MAX-HEAP-INSERT / HEAP-EXTRACT-MAX / HEAP-INCREASE-KEY: O(lg n) each, and HEAP-MAXIMUM: O(1); together they let a heap implement a max-priority queue.
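A C++ sketch of the first two operations (0-based indexing, so the children of node i sit at 2i+1 and 2i+2 rather than the 1-based 2i and 2i+1 above):

```cpp
#include <vector>
#include <utility>
using namespace std;

// Restore the max-heap property at index i, assuming the subtrees rooted at
// its children are already max-heaps. O(lg n): the violation floats down.
void maxHeapify(vector<int>& a, size_t heapSize, size_t i)
{
    size_t l = 2*i + 1, r = 2*i + 2, largest = i;
    if (l < heapSize && a[l] > a[largest]) largest = l;
    if (r < heapSize && a[r] > a[largest]) largest = r;
    if (largest != i)
    {
        swap(a[i], a[largest]);
        maxHeapify(a, heapSize, largest);
    }
}

// BUILD-MAX-HEAP: heapify every non-leaf node bottom-up. O(n) overall.
void buildMaxHeap(vector<int>& a)
{
    for (size_t i = a.size() / 2; i-- > 0; )
        maxHeapify(a, a.size(), i);
}
```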
The C++ Standard Library provides the make_heap, push_heap and pop_heap algorithms for heaps.
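For example (these algorithms maintain a max-heap by default, comparing with operator<; the helper function here is just my own illustration):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Build a max-heap over v, insert newValue, then extract and return the
// maximum. Afterwards v is still a valid max-heap.
int insertThenExtractMax(vector<int>& v, int newValue)
{
    make_heap(v.begin(), v.end());   // BUILD-MAX-HEAP, O(n)
    v.push_back(newValue);           // MAX-HEAP-INSERT, O(lg n) in total
    push_heap(v.begin(), v.end());
    pop_heap(v.begin(), v.end());    // HEAP-EXTRACT-MAX: max moves to v.back()
    int maxVal = v.back();
    v.pop_back();
    return maxVal;
}
```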
Chapter 7: Quicksort
Quicksort is usually the best choice in practice, for the following reasons:
- good average-case performance: expected running time Θ(n lg n), with small hidden constant factors
- it sorts in place
- it works well even in virtual-memory environments
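A C++ sketch of the randomized version (random pivot plus a Lomuto-style partition; a minimal illustration, not the book's exact pseudocode):

```cpp
#include <vector>
#include <cstdlib>
#include <utility>
using namespace std;

// Lomuto partition of a[p..r] around a[r]; returns the pivot's final index.
int lomutoPartition(vector<int>& a, int p, int r)
{
    int x = a[r], i = p - 1;
    for (int j = p; j < r; ++j)
        if (a[j] <= x)
            swap(a[++i], a[j]);
    swap(a[i+1], a[r]);
    return i + 1;
}

// Choosing the pivot uniformly at random gives expected Θ(n lg n) time on
// every input, regardless of the input ordering.
void randomizedQuicksort(vector<int>& a, int p, int r)
{
    if (p < r)
    {
        swap(a[p + rand() % (r - p + 1)], a[r]); // random pivot to the end
        int q = lomutoPartition(a, p, r);
        randomizedQuicksort(a, p, q - 1);
        randomizedQuicksort(a, q + 1, r);
    }
}
```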
Chapter 8: Sorting in Linear Time
8.2 Counting Sort (计数排序)
Array A[1..n] is the input; B[1..n] stores the sorted output, and C[0..k] is the auxiliary count array. Complexity: Θ(n + k), which is Θ(n) when k = O(n).
COUNTING-SORT(A, B, k)
1 let C[0..k] = 0 be a new array
2 for j = 1 to A.length
3 C[A[j]] = C[A[j]] + 1
4 // C[i] now contains the number of elements equal to i.
5 for i = 1 to k
6 C[i] = C[i] + C[i-1]
7 // C[i] now contains the number of elements less than or equal to i.
8 for j = A.length downto 1 // note: this is for the stability of counting sort
9 B[C[A[j]]] = A[j]
10 C[A[j]] = C[A[j]] - 1
The figure illustrates, step by step, how the above pseudocode executes.
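A direct C++ translation of the pseudocode (0-based vectors, so lines 9-10 combine into `b[--count[a[j]]] = a[j]`; assumes every key lies in 0..k):

```cpp
#include <vector>
using namespace std;

// Stable counting sort of a, whose keys all lie in 0..k. Θ(n + k) time.
vector<int> countingSort(const vector<int>& a, int k)
{
    vector<int> count(k + 1, 0), b(a.size());
    for (int key : a)
        ++count[key];                       // count[i] = #elements equal to i
    for (int i = 1; i <= k; ++i)
        count[i] += count[i-1];             // count[i] = #elements <= i
    for (int j = (int)a.size() - 1; j >= 0; --j) // backwards, for stability
        b[--count[a[j]]] = a[j];            // rank-1 is the 0-based position
    return b;
}
```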
8.3 Radix Sort (基数排序)
8.4 Bucket Sort (桶排序)
Chapter 9: Medians and Order Statistics
The i-th order statistic (顺序统计量) of a set is its i-th smallest element.
9.1 Minimum and Maximum
To find the minimum and the maximum simultaneously, there is a trick that needs at most 3⌊n/2⌋ comparisons: handle the input elements in pairs.
- compare the two elements of a pair with each other, yielding a smaller and a larger value
- compare the smaller value with the current minimum
- compare the larger value with the current maximum
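The pairing scheme above can be sketched as follows (assuming a nonempty vector):

```cpp
#include <vector>
#include <algorithm>
#include <utility>
using namespace std;

// Find min and max of a nonempty array with at most 3*floor(n/2) comparisons
// by processing the elements in pairs.
pair<int,int> minMax(const vector<int>& a)
{
    size_t n = a.size(), i;
    int mn, mx;
    if (n % 2)                       // odd n: both start at a[0], no comparison
    {
        mn = mx = a[0];
        i = 1;
    }
    else                             // even n: 1 comparison on the first pair
    {
        mn = min(a[0], a[1]);
        mx = max(a[0], a[1]);
        i = 2;
    }
    for (; i + 1 < n; i += 2)
    {
        int lo = a[i], hi = a[i+1];
        if (hi < lo) swap(lo, hi);   // comparison 1: the pair against each other
        if (lo < mn) mn = lo;        // comparison 2: smaller vs current minimum
        if (hi > mx) mx = hi;        // comparison 3: larger vs current maximum
    }
    return {mn, mx};
}
```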
9.2 Selection in expected linear time
The following function returns the i-th smallest element of array A[p..r]. Expected running time: Θ(n).
RANDOMIZED-SELECT(A, p, r, i)
1 if p == r
2 return A[p]
3 q = RANDOMIZED-PARTITION(A, p, r) // the same subroutine as in randomized quicksort
4 k = q - p + 1
5 if i == k // the pivot value is the answer
6 return A[q]
7 else if i < k
8 return RANDOMIZED-SELECT(A, p, q-1, i)
9 else
10 return RANDOMIZED-SELECT(A, q+1, r, i-k)
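A C++ sketch of the pseudocode above (0-based indices, 1-based rank i; RANDOMIZED-PARTITION is a Lomuto partition with a random pivot, as in randomized quicksort):

```cpp
#include <vector>
#include <cstdlib>
#include <utility>
using namespace std;

// Lomuto partition of a[p..r] around a randomly chosen pivot;
// returns the pivot's final index.
int randomizedPartition(vector<int>& a, int p, int r)
{
    swap(a[p + rand() % (r - p + 1)], a[r]); // random pivot to the end
    int x = a[r], i = p - 1;
    for (int j = p; j < r; ++j)
        if (a[j] <= x)
            swap(a[++i], a[j]);
    swap(a[i+1], a[r]);
    return i + 1;
}

// Return the i-th smallest (1-based) element of a[p..r]. Expected Θ(n).
int randomizedSelect(vector<int>& a, int p, int r, int i)
{
    if (p == r)
        return a[p];
    int q = randomizedPartition(a, p, r);
    int k = q - p + 1;               // rank of the pivot within a[p..r]
    if (i == k)                      // the pivot value is the answer
        return a[q];
    else if (i < k)
        return randomizedSelect(a, p, q - 1, i);
    else
        return randomizedSelect(a, q + 1, r, i - k);
}
```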
9.3 Selection in worst-case linear time
The SELECT algorithm determines the i-th smallest element through the following steps:
- Divide the n elements of the input array into ⌈n/5⌉ groups: ⌊n/5⌋ groups of 5 elements each, plus possibly one final group containing the remaining n mod 5 elements.
- Find the median of each of the ⌈n/5⌉ groups: first insertion-sort each group, then pick its middle element.
- Recursively call SELECT on the ⌈n/5⌉ medians found in step 2 to find their median x (if their number is even, by convention x is the lower median).
- Partition the input array around x using a modified PARTITION. Let k be one more than the number of elements on the low side of the partition, so that x is the k-th smallest element and there are n − k elements on the high side.
- If i = k, return x. If i < k, recursively call SELECT to find the i-th smallest element on the low side. If i > k, recursively find the (i − k)-th smallest element on the high side.
Note: the modified PARTITION is derived from quicksort's deterministic PARTITION; the modification is that it takes the pivot element as an input parameter.
Below is my own implementation. It has only been lightly tested, so it may still contain problems.
#include <vector>
#include <cstdio>
using namespace std;

// Three-way partition of a[p..r] around the value x (x must occur in a[p..r]):
// afterwards the elements < x come first, then those == x, then those > x.
// Returns the 1-based rank of x within a[p..r], i.e. (number of elements < x) + 1.
int PARTITION(vector<int>& a, int p, int r, int x)
{
    // invariant: a[p..j] < x, a[j+1..k] == x, a[k+1..i-1] > x
    int j = p - 1, k = p - 1;
    for (int i = p; i <= r; ++i)
    {
        if (a[i] < x)
        {
            ++j;
            ++k;
            int tmp = a[j];
            a[j] = a[i];
            a[i] = a[k];
            a[k] = tmp;
        }
        else if (a[i] == x)
        {
            ++k;
            int tmp = a[i];
            a[i] = a[k];
            a[k] = tmp;
        }
    }
    return j + 2 - p;
}

// O(n^2) insertion sort; here it is only applied to groups of at most 5 elements.
void insertSort(vector<int>& a)
{
    for (size_t i = 1; i < a.size(); ++i)
    {
        for (int j = (int)i - 1; j >= 0; --j)
        {
            if (a[j+1] < a[j])
            {
                int tmp = a[j+1];
                a[j+1] = a[j];
                a[j] = tmp;
            }
            else
                break;
        }
    }
}

// Returns the i-th smallest (1-based) element of a[p..r] in worst-case linear time.
int SELECT(vector<int>& a, int p, int r, int i)
{
    if (p == r) return a[p];
    if (p > r) return 0; // defensive; not reached for valid arguments
    // steps 1 and 2: split a[p..r] into groups of 5 and collect each group's median
    vector<int> medians;
    vector<int> group;
    for (int idx = p; idx <= r; ++idx)
    {
        if (group.size() == 5)
        {
            insertSort(group);
            medians.push_back(group[2]);
            group.clear();
        }
        group.push_back(a[idx]);
    }
    // median of the last (possibly short) group
    if (!group.empty())
    {
        insertSort(group);
        medians.push_back(group[(group.size() - 1) / 2]);
    }
    // step 3: recursively find the median of the medians; ranks here are 1-based,
    // so the lower median of m elements is the ((m - 1) / 2 + 1)-th smallest
    int m = medians.size();
    int x = SELECT(medians, 0, m - 1, (m - 1) / 2 + 1);
    // step 4: partition around x; k is the rank of x within a[p..r]
    int k = PARTITION(a, p, r, x);
    // step 5: the answer is x itself, or it lies on the low or the high side
    if (i == k)
        return x;
    else if (i < k)
        return SELECT(a, p, p + k - 2, i);
    else
        return SELECT(a, p + k, r, i - k);
}

int main()
{
    vector<int> a = {1, 7, 3, 1, 0, 9};
    int outValue = SELECT(a, 0, (int)a.size() - 1, 4);
    printf("%d\n", outValue); // the 4th smallest of {0, 1, 1, 3, 7, 9} is 3
    return 0;
}