Medians and Order Statistics - Introduction to Algorithms - Summary of Chapter 9

Chapter Introduction

The ith order statistic of a set of n elements is the ith smallest element.

A median, informally, is the "halfway point" of the set.
When n is odd, the median is unique, the (n+1)/2 smallest element. When n is even, there are two medians: the lower median is the ⌊(n+1)/2⌋ smallest element, and the upper median is the ⌈(n+1)/2⌉ smallest element.


Minimum and maximum

Minimum (A)
    min = A[1]
    for i=2 to A.length
        if min > A[i]
            min = A[i]
    return min

Simultaneous minimum and maximum
We compare pairs of elements from the input first with each other, and then we compare the smaller with the current minimum and the larger with the current maximum, at a cost of 3 comparisons for every 2 elements. This finds both the minimum and the maximum using at most 3⌊n/2⌋ comparisons.
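As a concrete illustration, here is a minimal C sketch of this pairwise scan; the function name and signature are mine, not the book's, and n ≥ 1 is assumed.

/* Find both the minimum and maximum of A[0..n-1] using about 3*floor(n/2)
   comparisons: elements are consumed in pairs, 3 comparisons per pair. */
void min_max(const int A[], int n, int *min, int *max) {
    int i;
    if (n % 2) {                      /* odd n: the first element seeds both */
        *min = *max = A[0];
        i = 1;
    } else {                          /* even n: the first pair seeds them */
        if (A[0] < A[1]) { *min = A[0]; *max = A[1]; }
        else             { *min = A[1]; *max = A[0]; }
        i = 2;
    }
    for (; i + 1 < n; i += 2) {       /* 3 comparisons per remaining pair */
        int lo = A[i], hi = A[i + 1];
        if (lo > hi) { int t = lo; lo = hi; hi = t; }
        if (lo < *min) *min = lo;
        if (hi > *max) *max = hi;
    }
}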


Selection in expected linear time

Randomized-Select(A, p, r, i)
    if p==r
        return A[p]
    q = Randomized-Partition (A, p, r)
    k = q-p+1
    if i==k
        return A[q]
    elseif i<k
        return Randomized-Select (A, p, q-1, i)
    else return Randomized-Select (A, q+1, r, i-k)

The expected running time for Randomized-Select is Θ(n)

The worst-case running time of Randomized-Select is Θ(n²), even to find the minimum, because we could be extremely unlucky and always partition around the largest remaining element, and partitioning takes Θ(n) time.


Selection in worst-case linear time

The SELECT algorithm determines the ith smallest of an input array of n>1 distinct elements by executing the following steps. (If n=1 , then SELECT merely returns its only input value as the ith smallest.)

  1. Divide the n elements of the input array into ⌊n/5⌋ groups of 5 elements each and at most one group made up of the remaining n mod 5 elements.
  2. Find the median of each of the ⌈n/5⌉ groups by first insertion-sorting the elements of each group (of which there are at most 5) and then picking the median from the sorted list of group elements.
  3. Use SELECT recursively to find the median x of the ⌈n/5⌉ medians found in step 2. (If there is an even number of medians, then by our convention, x is the lower median.)
  4. Partition the input array around the median-of-medians x using the modified version of PARTITION. Let k be one more than the number of elements on the low side of the partition, so that x is the kth smallest element and there are n - k elements on the high side of the partition.
  5. If i = k, then return x. Otherwise, use SELECT recursively to find the ith smallest element on the low side if i < k, or the (i - k)th smallest element on the high side if i > k.
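To make the five steps concrete, here is a compact C sketch of SELECT. It is a minimal illustration under my own conventions rather than the book's code: ranks are 1-based, the helper names (isort, partition_around, select_mom) are mine, the array is rearranged in place, and error handling is omitted.

#include <stdlib.h>

/* Partition A[0..n-1] around the value x (assumed to occur in A); returns
   the 1-based rank of x, i.e., one more than the size of the low side. */
static int partition_around(int A[], int n, int x) {
    int i = 0, t;
    for (int j = 0; j < n; j++)               /* move x to the last slot */
        if (A[j] == x) { t = A[j]; A[j] = A[n-1]; A[n-1] = t; break; }
    for (int j = 0; j < n - 1; j++)           /* Lomuto partition around x */
        if (A[j] < x) { t = A[i]; A[i] = A[j]; A[j] = t; i++; }
    t = A[i]; A[i] = A[n-1]; A[n-1] = t;
    return i + 1;
}

static void isort(int A[], int n) {           /* insertion sort; n <= 5 here */
    for (int j = 1; j < n; j++) {
        int key = A[j], i = j - 1;
        while (i >= 0 && A[i] > key) { A[i+1] = A[i]; i--; }
        A[i+1] = key;
    }
}

/* Return the ith (1-based) smallest of the n distinct values in A[0..n-1]. */
int select_mom(int A[], int n, int i) {
    if (n == 1)
        return A[0];
    int g = (n + 4) / 5;                      /* number of groups (step 1) */
    int *med = malloc(g * sizeof *med);
    for (int j = 0; j < g; j++) {             /* group medians (step 2) */
        int len = (j == g - 1 && n % 5) ? n % 5 : 5;
        isort(A + 5*j, len);
        med[j] = A[5*j + (len - 1) / 2];      /* lower median of the group */
    }
    int x = select_mom(med, g, (g + 1) / 2);  /* median of medians (step 3) */
    free(med);
    int k = partition_around(A, n, x);        /* step 4 */
    if (i == k) return x;                     /* step 5 */
    if (i < k)  return select_mom(A, k - 1, i);
    return select_mom(A + k, n - k, i - k);
}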

(Figure: a step-by-step illustration of a SELECT execution, omitted here.)


Exercises

9.1.1
Show that the second smallest of n elements can be found with n + ⌈lg n⌉ - 2 comparisons in the worst case. (Hint: Also find the smallest element.)

We can compare the elements in a tournament fashion: we split them into pairs, compare each pair, and then proceed to compare the winners in the same fashion. We need to keep track of each "match" the potential winners have participated in.

We select a winner in n - 1 matches. At this point, we know that the second smallest element is one of the at most ⌈lg n⌉ elements that lost directly to the smallest: each of them is smaller than every other element it was compared with prior to losing. In another ⌈lg n⌉ - 1 comparisons we can find the smallest of those, and that is the answer we are looking for.
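A hypothetical C implementation of this tournament (the names and layout are mine) may make the counting clearer: it spends n - 1 comparisons crowning the winner, then at most ⌈lg n⌉ - 1 more among the winner's direct opponents.

#include <stdlib.h>

/* Second smallest of A[0..n-1], n >= 2, in n + ceil(lg n) - 2 comparisons.
   Each bracket winner records who it beat; the champion beats at most
   ceil(lg n) opponents, so a fixed list of 64 slots is plenty. */
int second_smallest(const int A[], int n) {
    int *alive = malloc(n * sizeof *alive);     /* indices still competing */
    int (*beat)[64] = malloc(n * sizeof *beat); /* beat[i]: indices i beat */
    int *cnt = calloc(n, sizeof *cnt);
    for (int i = 0; i < n; i++) alive[i] = i;

    for (int m = n; m > 1; ) {                  /* play round after round */
        int w = 0;
        for (int i = 0; i + 1 < m; i += 2) {    /* n - 1 matches in total */
            int a = alive[i], b = alive[i + 1];
            int win = (A[a] < A[b]) ? a : b;
            beat[win][cnt[win]++] = (win == a) ? b : a;
            alive[w++] = win;
        }
        if (m % 2) alive[w++] = alive[m - 1];   /* odd one out gets a bye */
        m = w;
    }

    int champ = alive[0], second = beat[champ][0];
    for (int i = 1; i < cnt[champ]; i++)        /* <= ceil(lg n) - 1 more */
        if (A[beat[champ][i]] < A[second])
            second = beat[champ][i];

    int result = A[second];
    free(alive); free(beat); free(cnt);
    return result;
}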


9.1.2
Prove the lower bound of ⌈3n/2⌉ - 2 comparisons in the worst case to find both the maximum and minimum of n numbers. (Hint: Consider how many numbers are potentially either the maximum or minimum, and investigate how a comparison affects these counts.)

We can optimize by splitting the input into pairs and comparing each pair. After n/2 comparisons, we have reduced the candidates to n/2 potential minima and n/2 potential maxima. The smaller and larger element of the first pair initialize the running minimum and maximum. Each of the remaining n/2 - 1 pairs then costs two more comparisons: its larger element is compared with the current maximum and its smaller element with the current minimum.
The total number of comparisons is:
n/2 + 2(n/2 - 1) = n/2 + n - 2 = 3n/2 - 2
This assumes that n is even. If n is odd we need one additional comparison in order to determine whether the last element is a potential minimum or maximum. Hence the ceiling.

9.2.1
Show that RANDOMIZED-SELECT never makes a recursive call to a 0-length array.

There are two cases where it appears that RANDOMIZED-SELECT could make a call on a 0-length array:

  1. Line 8 with k = 1. For this to happen, i would have to be 0, and that cannot happen: the initial call is supposed to pass a positive i, and the recursive calls either pass i unmodified or pass i - k where i > k.
  2. Line 9 with q = r. For this to happen, i must be greater than k, that is, i > q - p + 1 = r - p + 1; in other words, i would have to exceed the number of elements in the subarray. Initially that is not true, and both recursive calls maintain the invariant that i is at most the number of elements in A[p..r].

9.2.2
Argue that the indicator random variable X_k and the value T(max(k - 1, n - k)) are independent.

Picking the pivot in one partitioning does not affect the probabilities in the subproblem: the call to RANDOM in RANDOMIZED-PARTITION produces a result that is independent of the pivot choices in the subsequent recursive calls.

9.2.3
Write an iterative version of RANDOMIZED-SELECT.

#include <stdlib.h>

static int tmp;
#define EXCHANGE(a, b) { tmp = a; a = b; b = tmp; }

int randomized_partition(int *A, int p, int r);

/* Iterative RANDOMIZED-SELECT over the half-open range A[p..r).
   i is the 0-based rank sought (i = 0 selects the minimum); call as
   randomized_select(A, 0, n, i). Remember to seed with srand() once. */
int randomized_select(int *A, int p, int r, int i) {
    while (p < r - 1) {                      /* more than one element left */
        int q = randomized_partition(A, p, r);
        int k = q - p;                       /* rank of the pivot in [p, r) */

        if (i == k) {
            return A[q];
        } else if (i < k) {
            r = q;                           /* continue in the low side */
        } else {
            p = q + 1;                       /* continue in the high side */
            i = i - k - 1;
        }
    }

    return A[p];
}

/* Lomuto partition of A[p..r) around the last element; returns the
   pivot's final index. */
int partition(int *A, int p, int r) {
    int x, i, j;

    x = A[r - 1];
    i = p;

    for (j = p; j < r - 1; j++) {
        if (A[j] < x) {
            EXCHANGE(A[i], A[j]);
            i++;
        }
    }

    EXCHANGE(A[i], A[r - 1]);

    return i;
}

/* Swap a uniformly chosen element into the pivot slot, then partition. */
int randomized_partition(int *A, int p, int r) {
    int pivot = rand() % (r - p) + p;
    EXCHANGE(A[pivot], A[r - 1]);
    return partition(A, p, r);
}

9.2.4
Suppose we use RANDOMIZED-SELECT to select the minimum element of the array A = ⟨3, 2, 9, 0, 7, 5, 4, 8, 6, 1⟩. Describe a sequence of partitions that results in a worst-case performance of RANDOMIZED-SELECT.

This happens if the pivots get picked in decreasing order: the first pivot chosen is 9, the second is 8, the third is 7, and so on. Each partitioning pass then eliminates only the pivot itself, so all n - 1 passes of linear work are performed.


9.3.1
In the algorithm SELECT, the input elements are divided into groups of 5. Will the algorithm work in linear time if they are divided into groups of 7? Argue that SELECT does not run in linear time if groups of 3 are used.

Groups of 7
The algorithm will work if the elements are divided into groups of 7. On each partitioning, the minimum number of elements that are less than (or greater than) x will be:

4(⌈(1/2)⌈n/7⌉⌉ - 2) ≥ 2n/7 - 8

The partitioning will reduce the subproblem to size at most 5n/7 + 8. This yields the following recurrence:

T(n) ≤ { O(1)                            if n < n0
       { T(⌈n/7⌉) + T(5n/7 + 8) + O(n)   if n ≥ n0

We guess T(n) ≤ cn and bound the non-recursive term by an:

T(n) ≤ c⌈n/7⌉ + c(5n/7 + 8) + an
     ≤ cn/7 + c + 5cn/7 + 8c + an
     = 6cn/7 + 9c + an
     = cn + (-cn/7 + 9c + an)
     ≤ cn = O(n)

The last step holds when -cn/7 + 9c + an ≤ 0. That is:

-cn/7 + 9c + an ≤ 0
c(n/7 - 9) ≥ an
c(n - 63) ≥ 7an
c ≥ 7a · n/(n - 63)

By picking n0 = 126 and requiring n ≥ n0, we get n/(n - 63) ≤ 2. Then we just need c ≥ 14a.

Groups of 3
The algorithm will not work for groups of three. The number of elements that are less than (or greater than) the median-of-medians is:

2(⌈(1/2)⌈n/3⌉⌉ - 2) ≥ n/3 - 4

The recurrence is thus:

T(n) = T(⌈n/3⌉) + T(2n/3 + 4) + O(n)

We're going to prove that T(n) = ω(n) using the substitution method. We guess T(n) > cn and bound the non-recursive term from below by an.

T(n) > cn/3 + c(2n/3 + 4) + an = cn + 4c + an > cn

The calculation above holds for any c>0 .

9.3.2
Analyze SELECT to show that if n ≥ 140, then at least ⌈n/4⌉ elements are greater than the median-of-medians x and at least ⌈n/4⌉ elements are less than x.

SELECT guarantees that at least 3n/10 - 6 elements lie on each side of x, and ⌈n/4⌉ ≤ n/4 + 1, so it suffices that:

3n/10 - 6 ≥ n/4 + 1
6n - 120 ≥ 5n + 20
n ≥ 140

9.3.3
Show how quicksort can be made to run in O(n lg n) time in the worst case, assuming that all elements are distinct.

If we rewrite PARTITION to use the same approach as SELECT, it will perform in O(n) time, but the smallest partition will be at least one-fourth of the input (for large enough n , as illustrated in exercise 9.3.2). This will yield a worst-case recurrence of:

T(n)=T(n/4)+T(3n/4)+O(n)

As exercise 4.4.9 shows, this is Θ(n lg n).

And that’s how we can prevent quicksort from getting quadratic in the worst case, although this approach probably has a constant that is too large for practical purposes.

Another approach would be to find the median in linear time (with SELECT) and partition around it. That will always give an even split.
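A minimal sketch of that second approach, reusing the select_mom and partition_around helpers from the SELECT sketch earlier in this summary: partitioning around the exact median yields the recurrence T(n) = 2T(n/2) + O(n) = O(n lg n).

/* Quicksort with an exact-median pivot: both sides have at most
   ceil(n/2) elements, so the recursion depth is O(lg n). */
void quicksort_balanced(int A[], int n) {
    if (n <= 1)
        return;
    int m = select_mom(A, n, (n + 1) / 2);  /* exact lower median, O(n) */
    int k = partition_around(A, n, m);      /* 1-based rank of the median */
    quicksort_balanced(A, k - 1);           /* low side */
    quicksort_balanced(A + k, n - k);       /* high side */
}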

9.3.4
Suppose that an algorithm uses only comparisons to find the ith smallest element in a set of n elements. Show that it can also find the i−1 smaller elements and the n−i larger elements without performing any additional comparisons.

A strict proof might require a more advanced proof apparatus than I command (like graphs and adversary algorithms, for example?), so I will just sketch it briefly.

In order to determine the ith order statistic, any algorithm needs to establish in some way that there are i - 1 elements smaller than the result and n - i elements larger than it. We can visualize the algorithm's knowledge as a directed graph whose vertices are the elements; each comparison introduces an edge from the smaller to the larger element. To produce a result, there must be i - 1 elements that (transitively) point to the ith order statistic and n - i elements that the ith order statistic (transitively) points to. There cannot be more (a property of order statistics), and if there are fewer, then there are elements whose position with regard to the ith order statistic is undetermined.

In order to find the result, the algorithm needs to build the knowledge represented by such a graph, and it can then use that knowledge to return the sets of smaller and larger elements.

As an example, both algorithms presented in the chapter leave the array partitioned around the ith order statistic.

9.3.5
Suppose that you have a “black-box” worst-case linear time median subroutine. Give a simple, linear-time algorithm that solves the selection problem for an arbitrary order statistic.

We find the median in linear time and partition the array around it (again, in linear time). If the rank of the median (always ⌈n/2⌉) equals i, we return the median. Otherwise, we recurse into either the lower or the upper part of the array, adjusting i accordingly.

This yields the following recurrence:

T(n)=T(n/2)+O(n)

Applying the master method, we get an upper bound of O(n) .

int select(int A[], int p, int r, int i)  /* ith (1-based) smallest of A[p..r] */
{
    if (p >= r)
        return A[p];
    /* median_partition is the black-box routine: it finds the median of
       A[p..r], partitions around it, and returns the median's index. */
    int q = median_partition(A, p, r);
    int k = q - p + 1;                /* rank of the median within A[p..r] */
    if (i == k)
        return A[q];
    else if (i < k)
        return select(A, p, q - 1, i);
    else
        return select(A, q + 1, r, i - k);
}

9.3.6
The kth quantiles of an n-element set are the k - 1 order statistics that divide the sorted set into k equal-sized sets (to within 1). Give an O(n lg k)-time algorithm to list the kth quantiles of a set.

If k = 1, we return an empty list.
If k is even, we find the median, partition around it, solve two similar subproblems of size n/2, and return their solutions plus the median.
If k is odd, we find the ⌊k/2⌋th and ⌈k/2⌉th boundaries and reduce the problem to two subproblems, each of size less than n/2. The worst-case recurrence is:

T(n, k) = 2T(n/2, k/2) + O(n)

This gives the desired bound of O(n lg k).

This works easily when the number of elements is ak + k - 1 for a positive integer a. When it is not, some care with rounding needs to be taken in order to avoid creating two segments that differ by more than 1.
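Here is a C sketch of the recursion, reusing select_mom and partition_around from the earlier SELECT sketch. It glosses over the rounding care just mentioned: the jth boundary is simply taken at rank jn/k, and n ≥ 2k is assumed so the subproblems stay well formed. The caller initializes *nout to 0, and out[] receives the k - 1 boundary values in increasing order.

void quantiles(int A[], int n, int k, int out[], int *nout) {
    if (k <= 1)
        return;                                /* no boundaries to report */
    int j = k / 2;                             /* middle boundary index */
    int r = (int)((long long)j * n / k);       /* its (1-based) rank */
    int x = select_mom(A, n, r);               /* find it in O(n) */
    partition_around(A, n, x);                 /* split the set around it */
    quantiles(A, r - 1, j, out, nout);         /* j - 1 boundaries on the left */
    out[(*nout)++] = x;
    quantiles(A + r, n - r, k - j, out, nout); /* k - j - 1 on the right */
}

Each level of the recursion does O(n) total work and the number of boundaries halves, so there are O(lg k) levels and O(n lg k) work overall.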

9.3.7
Describe an O(n)-time algorithm that, given a set S of n distinct numbers and a positive integer k ≤ n, determines the k numbers in S that are closest to the median of S.

  1. We find the median of the array in linear time.
  2. We compute the distance of each element to the median, in linear time.
  3. We find the kth order statistic of the distances, again in linear time.
  4. We select the elements whose distance is less than or equal to that kth order statistic (see the sketch below).
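A C sketch of these four steps, reusing select_mom from the earlier SELECT sketch; it works on a scratch copy so the input stays untouched, and a tie in distance (two elements symmetric around the median) is broken by keeping the first k elements found.

#include <stdlib.h>
#include <string.h>

void k_closest_to_median(const int S[], int n, int k, int out[]) {
    int *tmp = malloc(n * sizeof *tmp);
    memcpy(tmp, S, n * sizeof *tmp);
    int med = select_mom(tmp, n, (n + 1) / 2);  /* step 1: the median, O(n) */
    for (int i = 0; i < n; i++)
        tmp[i] = abs(S[i] - med);               /* step 2: the distances */
    int dk = select_mom(tmp, n, k);             /* step 3: kth smallest distance */
    int m = 0;
    for (int i = 0; i < n && m < k; i++)        /* step 4: collect those <= dk */
        if (abs(S[i] - med) <= dk)
            out[m++] = S[i];
    free(tmp);
}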

9.3.8
Let X[1..n] and Y[1..n] be two arrays, each containing n numbers already in sorted order. Give an O(lg n)-time algorithm to find the median of all 2n elements in arrays X and Y.

  1. If the two arrays are of length 1, we pick the lower of the two elements.
  2. Otherwise, we compare the medians of the two arrays.
  3. We keep the lower part of the array with the greater median and the upper part of the array with the lesser median, forming two new arrays of half the size: if each array has n elements, we keep the first/last ⌈n/2⌉ elements.
  4. We solve the problem for the new arrays recursively.

Let’s reason about why this works. Since we have 2n elements, we know that the length is an even number and we’re looking for a lower median. We need to observe that the median we’re looking for is between the medians of the two arrays. Let’s elaborate on that.

Assume that the combined (lower) median is at position k in array A. Then there are k - 1 elements of A less than it, so B must contain the remaining n - k elements that are less than it (and hence k elements that are greater). If k ≤ n/2, the median of A is greater than or equal to the combined median, while the median of B is less than or equal to it. It's the other way around for k > n/2. Thus the median of the two arrays is always between the medians of each.

Step 3 removes the same number of elements from each array, half of which are greater than the combined median and half of which are less than it. This reduces the problem to two smaller sorted arrays whose combined elements have the same median.
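Here is a minimal C sketch of the recursion, assuming the 2n elements are distinct; it returns the lower median of the combined arrays.

/* Lower median of the 2n values in the sorted arrays X[0..n-1], Y[0..n-1]. */
int two_array_median(const int X[], const int Y[], int n) {
    if (n == 1)
        return X[0] < Y[0] ? X[0] : Y[0];  /* step 1 */
    int half = (n + 1) / 2;                /* ceil(n/2) elements are kept */
    if (X[half - 1] < Y[half - 1])         /* compare the lower medians */
        /* keep X's upper part and Y's lower part (step 3) */
        return two_array_median(X + n / 2, Y, half);
    else
        /* keep Y's upper part and X's lower part */
        return two_array_median(X, Y + n / 2, half);
}

Each call halves n with O(1) work, so the running time is O(lg n).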

9.3.9
Professor Olay is consulting for an oil company, which is planning a large pipeline running east to west through an oil field of n wells. The company wants to connect a spur pipeline from each well directly to the main pipeline along a shortest route (either north or south), as shown in Figure 9.2. Given the x- and y-coordinates of the wells, how should the professor pick the optimal location of the main pipeline, which would be the one that minimizes the total length of the spurs? Show how to determine the optimal location in linear time.

We just find the median of the y-coordinates; the x-coordinates are irrelevant. The position of a well along the pipeline has no effect on the length of its spur, which is determined only by the vertical (north-south) distance between the well and the pipeline. What matters is balancing the number of wells on either side of the pipeline.

Let's assume that n is odd. There are ⌊n/2⌋ wells south of the median and the same number of wells north of it. Let the pipeline pass through the median; we shall reason about why this location is optimal.

Suppose we move the pipeline one meter north. This shortens the spurs north of the median by ⌊n/2⌋ meters in total, but lengthens those at or south of the median, including the median's own, by ⌈n/2⌉ meters. The more we move north, the more the total spur length increases. The same reasoning holds if we move the main pipeline south.

If n is even, then any location between the lower and upper median is optimal.
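In code this is a one-liner on top of the earlier select_mom sketch, assuming integer coordinates:

/* Optimal latitude for the main pipeline: the lower median of the wells'
   y-coordinates; the x-coordinates play no role. */
int pipeline_latitude(int ys[], int n) {
    return select_mom(ys, n, (n + 1) / 2);  /* linear time */
}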


Problems

9.1 Largest i numbers in sorted order
Given a set of n numbers, we wish to find the i largest in sorted order using a comparison-based algorithm. Find the algorithm that implements each of the following methods with the best asymptotic worst-case running time, and analyze the running times of the algorithms in terms of n and i.

a. Sort the numbers, and list the i largest

We can sort with any of the O(n lg n) worst-case algorithms, that is, merge sort or heapsort, and then just list the first i elements linearly.
This takes O(n lg n + i) time.

b. Build a max-priority queue from the numbers, and call EXTRACT-MAX i times

We can build the heap in linear time and then extract each of the i largest elements in logarithmic time.
This takes O(n + i lg n).

c. Use an order-statistic algorithm to find the ith largest number, partition around that number, and sort the i largest numbers.

Let's assume we use the SELECT algorithm from the chapter. We can find the ith largest number and partition around it in O(n) time, and then we need to sort the i largest in O(i lg i).
This takes O(n + i lg i).

9.2 Weighted median
For n distinct elements x_1, x_2, ..., x_n with positive weights w_1, w_2, ..., w_n such that Σ_{i=1}^{n} w_i = 1, the weighted (lower) median is the element x_k satisfying

Σ_{x_i < x_k} w_i < 1/2

and

Σ_{x_i > x_k} w_i ≤ 1/2

For example, if the elements are 0.1, 0.35, 0.05, 0.1, 0.15, 0.05, 0.2 and
each element equals its weight (that is, w_i = x_i for i = 1, 2, ..., 7), then the median is 0.1, but the weighted median is 0.2.

a. Argue that the median of x_1, x_2, ..., x_n is the weighted median of the x_i with weights w_i = 1/n for i = 1, 2, ..., n.

If the weights of all elements are 1/n, then the sum of the weights of the elements smaller than the median is (⌈n/2⌉ - 1)/n < 1/2, and the sum of the weights of the larger elements is ⌊n/2⌋/n ≤ 1/2. This satisfies the condition for the weighted median. Furthermore, choosing any smaller or larger element would violate one of the two conditions.

b. Show how to compute the weighted median of n elements in O(nlgn) worst-case time using sorting.

  1. Sort the array.
  2. Walk the array from left to right, accumulating the weights of the elements encountered.
  3. The first element whose accumulated weight is ≥ 1/2 is the weighted median.
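A C sketch of these three steps; the Item type, comparator, and function names are mine, and the weights are assumed to sum to 1.

#include <stdlib.h>

typedef struct { double x, w; } Item;       /* value and its weight */

static int by_value(const void *a, const void *b) {
    double d = ((const Item *)a)->x - ((const Item *)b)->x;
    return (d > 0) - (d < 0);
}

double weighted_median_sorted(Item *v, int n) {
    qsort(v, n, sizeof *v, by_value);       /* step 1: sort by value */
    double acc = 0.0;
    for (int i = 0; i < n; i++) {           /* step 2: accumulate weights */
        acc += v[i].w;
        if (acc >= 0.5)                     /* step 3: first to reach 1/2 */
            return v[i].x;
    }
    return v[n - 1].x;                      /* not reached if weights sum to 1 */
}

On the example above (each element its own weight), the scan reaches 0.45 at 0.15 and crosses 1/2 at 0.2, matching the stated weighted median.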

c. Show how to compute the weighted median in Θ(n) worst-case time using a linear-time median algorithm such as SELECT from Section 9.3.

It's very similar to SELECT. In step 4, we partition the whole array into two parts around the median-of-medians x. Then we sum the weights in the lower part and the weights in the upper part; if they satisfy the weighted-median condition, we have our answer, and otherwise we recurse into one of the two parts. One thing worth noticing is that while the recursion narrows to a range [q..r], the weight sums are always computed relative to the whole array A[1..A.length].
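A C sketch of that recursion, built on the Item type from the previous block. partition_items is a hypothetical helper: it would partition v[0..n-1] around the median by value (for example, by adapting the earlier select_mom/partition_around sketches to Item) and return the median's 1-based rank. The parameters wlo and whi carry the weight already known to lie below and above the current subarray, which is how the weight sums stay relative to the whole input; the initial call passes 0 for both.

int partition_items(Item *v, int n);  /* assumed: median split, 1-based rank */

double weighted_median_select(Item *v, int n, double wlo, double whi) {
    if (n == 1)
        return v[0].x;
    int k = partition_items(v, n);              /* median at v[k-1], O(n) */
    double below = wlo, above = whi;
    for (int i = 0; i < k - 1; i++) below += v[i].w;  /* weight < v[k-1].x */
    for (int i = k; i < n; i++)     above += v[i].w;  /* weight > v[k-1].x */
    if (below < 0.5 && above <= 0.5)
        return v[k - 1].x;                      /* weighted-median condition */
    if (below >= 0.5)                           /* answer is in the low part */
        return weighted_median_select(v, k - 1, wlo, above + v[k - 1].w);
    return weighted_median_select(v + k, n - k, below + v[k - 1].w, whi);
}

Each call halves the subarray while doing linear work, so the total is Θ(n).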

d. Argue that the weighted median is a best solution for the 1-dimensional post-office location problem, in which points are simply real numbers and the distance between points a and b is d(a, b) = |a - b|.

I’ll present an informal argument, since it is convincing enough. A more formal one can be found in the instructor’s manual.

The situation is similar to exercise 9.3.9. Let's assume that we pick the weighted median as the solution and then start moving left or right. As we move away from the weighted median (in either direction), we move toward elements with combined weight less than 1/2 and away from elements with combined weight at least 1/2, so every "step" we take increases the total distance.

e. Find the best solution for the 2-dimensional post-office location problem, in which the points are (x, y) coordinate pairs and the distance between points a = (x1, y1) and b = (x2, y2) is the Manhattan distance given by d(a, b) = |x1 - x2| + |y1 - y2|.

The solution is the point (x_m, y_m), where x_m and y_m are the weighted medians of
the x- and y-coordinates, respectively.

I'm not even going to start proving this formally, since it requires
mathematics above my current comfort level. Reasoning informally: by the definition of Manhattan distance, the x- and y-coordinates are independent; we can rearrange the x-coordinates in any way we want without affecting the y-coordinate of the solution, and vice versa.

9.3 Small order statistics
We showed that the worst-case number T(n) of comparisons used by SELECT to select the ith order statistic from n numbers satisfies T(n)=Θ(n) , but the constant hidden by the Θ -notation is rather large. When i is small relative to n , we can implement a different procedure that uses SELECT as a subroutine but makes fewer comparisons in the worst case.

a. Describe an algorithm that uses U_i(n) comparisons to find the ith smallest of n elements, where

U_i(n) = { T(n)                          if i ≥ n/2
         { ⌊n/2⌋ + U_i(⌈n/2⌉) + T(2i)    otherwise

(Hint: Begin with ⌊n/2⌋ disjoint pairwise comparisons, and recurse on the set containing the smaller element from each pair.)

This is a modified version of SELECT. Not only does it find the ith order statistic, it also partitions the array, thus finding the i - 1 smaller elements.

  1. If i ≥ n/2, we just use SELECT.
  2. Otherwise, we split the array into pairs and compare each pair.
  3. We take the smaller element of each pair, but keep track of the other one.
  4. We recursively find the first i elements among the smaller elements.
  5. The ith order statistic is among the pairs containing the i smaller elements found in the previous step, so we call SELECT on those 2i elements. That is the final answer.

Just picking the smaller element of each pair is not enough. For example, if we're looking for the 2nd order statistic and our pairs are (1, 2), (3, 4), (5, 6), (7, 8), (9, 10), the answer is in the larger half of the first pair. That's why we need to keep track of the pairs and later perform SELECT on 2i elements.

Steps 1-4 can be implemented in place by modifying the algorithm to put the larger elements of the pairs on the inactive side of the pivot and modifying PARTITION to swap the elements on the inactive side every time it swaps elements on the active side. More details can be found in the Instructor’s Manual.

b. Show that, if i < n/2, then U_i(n) = n + O(T(2i) lg(n/i)).

U_i(n) = n/2 + U_i(n/2) + T(2i)
       = n/2 + (n/2 + O(T(2i) lg(n/(2i)))) + T(2i)
       = n + O(T(2i) lg(n/i))

This is a bit more sloppy than doing it with the substitution method, but that feels like grunt work to me at this point.

c. Show that, if i is a constant less than n/2, then U_i(n) = n + O(lg n).

U_i(n) = n + O(T(2i) lg(n/i))
       = n + O(O(1)(lg n - lg i))
       = n + O(lg n)

d. Show that, if i = n/k for k ≥ 2, then U_i(n) = n + O(T(2n/k) lg k).

U_i(n) = n + O(T(2i) lg(n/i))
       = n + O(T(2n/k) lg(n/(n/k)))
       = n + O(T(2n/k) lg k)

9.4 Alternative analysis of randomized selection
In this problem, we use indicator random variables to analyze the RANDOMIZED-SELECT procedure in a manner akin to our analysis of RANDOMIZED-QUICKSORT in section 7.4.2.

As in the quicksort analysis, we assume that all the elements are distinct, and we rename the elements of the input array A as z_1, z_2, ..., z_n, where z_i is the ith smallest element. Thus, the call RANDOMIZED-SELECT(A, 1, n, k) returns z_k.

For 1 ≤ i < j ≤ n, let X_ijk = I{ z_i is compared with z_j sometime during the execution of the algorithm to find z_k }.

a. Give an exact expression for E[X_ijk]. (Hint: Your expression may have different values, depending on the values of i, j, and k.)

The situation is very similar to the quicksort analysis, except that the position of k matters: z_i and z_j are compared exactly when the first pivot chosen from the smallest interval containing z_i, z_j, and z_k is one of them. The exact expression depends on where k stands relative to the other two:

E[X_ijk] = 2/(k - i + 1)   if i < j ≤ k
           2/(j - i + 1)   if i ≤ k ≤ j
           2/(j - k + 1)   if k ≤ i < j

b. Let X_k denote the total number of comparisons between elements of array A when finding z_k. Show that

E[X_k] ≤ 2( Σ_{i=1}^{k} Σ_{j=k}^{n} 1/(j - i + 1) + Σ_{j=k+1}^{n} (j - k - 1)/(j - k + 1) + Σ_{i=1}^{k-2} (k - i - 1)/(k - i + 1) )

It’s a long derivation:

E[X_k] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} E[X_ijk]
       ≤ Σ_{i=1}^{k-2} Σ_{j=i+1}^{k-1} 2/(k - i + 1) + Σ_{i=1}^{k} Σ_{j=k}^{n} 2/(j - i + 1) + Σ_{i=k+1}^{n-1} Σ_{j=i+1}^{n} 2/(j - k + 1)

(The three sums cover the cases i < j ≤ k - 1, i ≤ k ≤ j, and k + 1 ≤ i < j; the middle sum also absorbs a harmless i = j = k term, which only increases the bound.)

The summand of the first sum does not depend on j, so its inner sum collapses to a factor of k - i - 1. For the last sum, we first exchange the order of summation (justified below) and then collapse the same way:

Σ_{i=k+1}^{n-1} Σ_{j=i+1}^{n} 2/(j - k + 1) = Σ_{j=k+2}^{n} Σ_{i=k+1}^{j-1} 2/(j - k + 1) = Σ_{j=k+2}^{n} 2(j - k - 1)/(j - k + 1)

Putting the pieces together:

E[X_k] ≤ 2( Σ_{i=1}^{k} Σ_{j=k}^{n} 1/(j - i + 1) + Σ_{j=k+2}^{n} (j - k - 1)/(j - k + 1) + Σ_{i=1}^{k-2} (k - i - 1)/(k - i + 1) )
       ≤ 2( Σ_{i=1}^{k} Σ_{j=k}^{n} 1/(j - i + 1) + Σ_{j=k+1}^{n} (j - k - 1)/(j - k + 1) + Σ_{i=1}^{k-2} (k - i - 1)/(k - i + 1) )

where the last step extends the middle sum by the nonnegative j = k + 1 term.

The exchange of summation order is valid because of the following Iversonian equation:

[k+1 ≤ i ≤ n-1][i+1 ≤ j ≤ n] = [k+1 ≤ i < j ≤ n] = [k+1 < j ≤ n][k+1 ≤ i < j]

Concrete Mathematics helped a lot!

c. Show that E[X_k] ≤ 4n.

Let’s take the expressions in parts. The last two are straightforward enough:

Σ_{j=k+1}^{n} (j - k - 1)/(j - k + 1) + Σ_{i=1}^{k-2} (k - i - 1)/(k - i + 1)
  ≤ Σ_{j=k+1}^{n} 1 + Σ_{i=1}^{k-2} 1
  = (n - k) + (k - 2)
  ≤ n

This one is a bit trickier for me:

Σ_{i=1}^{k} Σ_{j=k}^{n} 1/(j - i + 1)

It contains terms of the form 1/m where 1 ≤ m ≤ n: the term 1/1 appears at most once, 1/2 at most twice, 1/3 at most three times, and so on, with 1/m appearing at most m times. Thus the total contribution of the terms equal to 1/m is at most 1, and since there are at most n distinct values of m, the whole sum is bounded by n.

There should be a way to manipulate the sums to prove that, but I cannot find it. In any case, each of the two parts is at most n, so the sum in the parentheses is at most 2n, which means that E[X_k] ≤ 4n.

d. Conclude that, assuming all elements of array A are distinct, RANDOMIZED-SELECT runs in expected time O(n) .

Well, it's rather obvious, isn't it? The number of operations in RANDOMIZED-SELECT is linear in the number of comparisons, and the expected number of comparisons is bounded by a linear function, which means that the expected running time is linear.

Some of the above content refers to "Introduction to Algorithms" and http://clrs.skanev.com/index.html.
