Chapter 7 Sorting

7.2 Insertion Sort

void InsertionSort( ElementType A[], int N )
{
    int j,p;
    ElementType Tmp;
    for( p = 1; p < N; p++ )
    {
        Tmp = A[p];
        // we compare A[j] & A[j-1], so j>0
        for( j=p; j>0 && A[j-1] > Tmp; j--)
            A[j] = A[j-1]; 
        A[j] = Tmp;
    }
}
Time complexity: O(N^2)

7.3 A Lower Bound for Simple Sorting Algorithms

An inversion in an array of numbers is any ordered pair (i, j) having the property that i < j but A[i] > A[j].

Theorem: The average number of inversions in an array of N distinct numbers is N(N - 1)/4.

Theorem: Any algorithm that sorts by exchanging adjacent elements requires Ω(N^2) time on average.

7.4 Shellsort

Shellsort is sometimes referred to as diminishing increment sort.

Shellsort uses a sequence h1, h2, ..., ht, called the increment sequence. Any increment sequence will do as long as h1 = 1.

After a phase, using some increment hk, for every i, we have A[i] <= A[i+hk]; all elements spaced hk apart are sorted. The file is said to be hk-sorted.

An important property of Shellsort is that an hk-sorted file that is then h(k-1)-sorted remains hk-sorted.

The action of an hk-sort is to perform an insertion sort on hk independent subarrays.


A popular (but poor) choice for the increment sequence is Shell's increments: ht = [N/2], and hk = [h(k+1)/2].

void Shellsort( ElementType A[], int N )
{
    int i, j, Increment;
    ElementType Tmp;

    for( Increment = N/2; Increment > 0; Increment /= 2 )
        // insertion sort with stride Increment
        for( i = Increment; i < N; i++ )
        {
            Tmp = A[i];
            // shift larger elements Increment positions right,
            // stopping as soon as the slot for Tmp is found
            for( j = i; j >= Increment && Tmp < A[j - Increment]; j -= Increment )
                A[j] = A[j - Increment];
            A[j] = Tmp;
        }
}

7.4.1 Worst-Case Analysis of Shellsort

Theorem: The worst-case running time of Shellsort, using Shell's increments (ht = [N/2] and hk = [h(k+1)/2]), is Θ(N^2).

Hibbard's increments: 1, 3, 7, ..., 2^k - 1. Consecutive increments in this sequence have no common factors.

Theorem: The worst-case running time of Shellsort using Hibbard's increments is Θ(N^(3/2)).

7.5 Heapsort

Building a binary heap of N elements takes O(N) time. (Building a heap: http://en.wikipedia.org/wiki/Binary_heap)

We can then perform N consecutive DeleteMin operations to get a sorted array, so the total running time is O(N log N).

The main problem with this approach is the use of an extra array, which requires extra space.

A clever way to avoid using a second array makes use of the fact that after each DeleteMin, the heap shrinks by 1, so we can store the deleted element in the slot freed at the end of the array. (To get an array in increasing order, we should use DeleteMax, putting the first deleted element at A[N-1], the second at A[N-2], and so on.)

// max-heap implementation
#define LeftChild(i) (2*(i) + 1)  // array starts at index 0

// N is the current size, i is the root element
void PercDown( ElementType A[], int i, int N)
{
    int Child;
    ElementType Tmp;
    for( Tmp = A[i]; LeftChild(i) < N; i = Child )
    {
        Child = LeftChild(i);
        // max heap: move down toward the larger child
        // A[Child + 1] exists only when Child != N - 1
        if( Child != N-1 && A[ Child + 1] > A[Child])
            Child++;
        if(Tmp < A[Child])
            A[i] = A[Child];
        else
            break;
    }
    A[i] = Tmp;
}

void Heapsort(ElementType A[], int N )  //N is the # of elements
{
    int i;
    for( i = N/2; i >= 0; i-- )    // build heap: PercDown every node that has a child
        PercDown(A, i, N);
    for( i = N-1; i > 0; i-- )     // the last remaining element needs no PercDown
    {
        swap(&A[0], &A[i]);
        PercDown(A, 0, i);
    }
}
Binary Heap:

  • If the root is stored at A[0]: the left child of node i is 2i + 1, the right child is 2i + 2
  • If the root is stored at A[1]: the left child of node i is 2i, the right child is 2i + 1

7.6 Mergesort

Mergesort runs in O(N log N) worst-case time. The required extra space is O(N).

The fundamental operation in this algorithm is merging two sorted lists. The time to merge two sorted lists is clearly linear, because at most N - 1 comparisons are made.

This algorithm is a classic divide-and-conquer strategy.
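The divide-and-conquer structure gives the standard recurrence; dividing through by N and telescoping yields the O(N log N) bound (a sketch, assuming N is a power of 2):

```latex
T(1) = 1, \qquad T(N) = 2\,T(N/2) + N
\;\Longrightarrow\;
\frac{T(N)}{N} = \frac{T(N/2)}{N/2} + 1
\;\Longrightarrow\;
T(N) = N \log N + N = O(N \log N)
```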

Mergesort routine:

void MSort( ElementType A[], ElementType TmpArray[], int Left, int Right )
{
    int Center;
    if( Left < Right )
    {
        Center = (Left + Right)/2;
        MSort(A, TmpArray, Left, Center);
        MSort(A, TmpArray, Center + 1, Right);
        Merge(A, TmpArray, Left, Center +1, Right);
    }
}

void Mergesort( ElementType A[], int N )
{
    ElementType *TmpArray;
    TmpArray = malloc( N * sizeof(ElementType) );
    if( TmpArray != NULL )
    {
        MSort(A, TmpArray, 0, N - 1 );
        free( TmpArray );
    }
    else
        FatalError( "No space for tmp array!!!" );
}
Merge routine:

// Lpos = start of left half, Rpos = start of right half
void Merge( ElementType A[], ElementType TmpArray[], int Lpos, int Rpos, int RightEnd)
{
    int i, LeftEnd, NumElements, TmpPos;
    LeftEnd = Rpos - 1;
    TmpPos = Lpos;
    NumElements = RightEnd - Lpos + 1;  // total number of elements to merge

    // main loop
    while( Lpos <= LeftEnd && Rpos <= RightEnd )
        if(A[Lpos] <= A[Rpos])
            TmpArray[TmpPos++] = A[Lpos++];
        else
            TmpArray[TmpPos++] = A[Rpos++];

    // copy rest of first/second half
    while (Lpos <= LeftEnd)
        TmpArray[TmpPos++] = A[Lpos++];
    while (Rpos <= RightEnd)
        TmpArray[TmpPos++] = A[Rpos++];

    // Copy TmpArray back
    for( i = 0; i < NumElements; i++, RightEnd-- )
        A[RightEnd] = TmpArray[RightEnd];
}

7.7 Quicksort

As its name implies, quicksort is the fastest known sorting algorithm in practice.

Its average running time is O(N log N). It has O(N^2) worst-case performance, but this can be made exponentially unlikely.

Like mergesort, quicksort is a divide-and-conquer recursive algorithm.

The basic algorithm to sort an array S consists of the following four steps:

1. If the number of elements in S is 0 or 1, then return.

2. Pick any element v in S. This is called the pivot.

3. Partition S - {v} (the remaining elements in S) into two disjoint groups: S1 = {x ∈ S - {v} | x ≤ v} and S2 = {x ∈ S - {v} | x ≥ v}.

4. Return { quicksort(S1) followed by v followed by quicksort(S2) }.

The reason quicksort is faster than mergesort is that the partitioning step can actually be performed in place and very efficiently.

7.7.1 Picking the Pivot

A wrong way

One choice is to use the first element as the pivot. This is acceptable if the input is random.

- If the input is presorted or in reverse order, then the pivot provides poor partitions.

A safe maneuver

Another way is to choose the pivot randomly.

- Random number generation is generally expensive and does not reduce the average running time of the rest of the algorithm at all.

Median-of-Three Partitioning

The median of a group of N numbers is the ⌈N/2⌉th largest number, which is the best pivot. A good estimate can be obtained by picking three elements randomly and using the median of these three as the pivot.

The randomness turns out not to help much, so the common course is to use as pivot the median of the left, right, and center elements.

7.7.2 Partitioning Strategy

1. Get the pivot element out of the way by swapping it with the last element.

2. While i is to the left of j, we move i right, skipping over elements that are smaller than the pivot. We move j left, skipping over elements that are larger than the pivot. When i and j have stopped, if i is to the left of j, those elements are swapped.

3. The final part of the partitioning is to swap the pivot element with the element pointed to by i.

We should stop i and j when they see a key equal to the pivot, and we must advance i and j after each swap to avoid an infinite loop: swap(a[i++], a[j--]);

If we stop i and j on keys equal to the pivot, the total running time will be O(N log N) even when all the elements are the same. Otherwise, the running time in that case will be O(N^2).

7.7.3 Small Arrays

For very small arrays, quicksort does not perform as well as insertion sort. A good cutoff is around N = 10.

7.7.4 Actual Quicksort Routines

When selecting the pivot, we sort A[Left], A[Center], and A[Right]: the smallest goes to A[Left], the largest to A[Right], and we then swap A[Center] with A[Right - 1]:

+ We will have sentinels at A[Left] and A[Right], so we don't need to worry about running off either end of the subarray.

+ i will start at A[Left + 1], and j will start at A[Right - 2].

Driver for quicksort:

void Quicksort( ElementType A[], int N )
{
    Qsort(A, 0, N-1);
}

Code to perform median-of-three partitioning:

// return median of Left, Center, Right and hide the pivot
ElementType Median3( ElementType A[], int Left, int Right)
{
    int Center = (Left + Right)/2;
    
    if(A[Left] > A[Center])
        swap(&A[Left], &A[Center]);
    if(A[Left] > A[Right])
        swap(&A[Left], &A[Right]);
    if(A[Center] > A[Right])
        swap(&A[Center], &A[Right]);

    swap(&A[Center], &A[Right - 1]);  // hide pivot at A[Right - 1]
    return A[Right - 1];
}

Main quicksort routine:

#define Cutoff ( 3 )    // can be 3 ~ 20
void Qsort( ElementType A[], int Left, int Right)
{
    int i,j;
    ElementType Pivot;
    
    if( Left + Cutoff <= Right)
    {
        Pivot = Median3(A, Left, Right);
        i = Left; j = Right -1;
        for(;;)
        {
        // A[++i] and A[--j] advance on every iteration, even when
        // the element equals the pivot, so there is no infinite loop
            while( A[++i] < Pivot ) {}   // i starts at Left + 1
            while( A[--j] > Pivot ) {}   // j starts at Right - 2
            if( i < j )
                swap(&A[i], &A[j]);      // swap should be an inline function
            else
                break;
        }
        swap(&A[i], &A[Right-1]);    //restore pivot

        Qsort(A, Left, i - 1);
        Qsort(A, i+1, Right);
    }
    else  // do insertion sort on the subarray
        InsertionSort(A+Left, Right - Left + 1);
}

A wrong way to implement quicksort:

i = Left + 1; j = Right - 2;
for(;;)
{
    // i and j do not advance on every pass through the loop,
    // so this loops forever when A[i] = A[j] = Pivot
    while(A[i] < Pivot) i++;
    while(A[j] > Pivot) j--;
    if(i < j)
        swap(&A[i], &A[j]);
    else
        break;
}

7.7.6 A Linear-Expected-Time Algorithm for Selection

To find the kth smallest element in an array, if we use a priority queue (as in heapsort), we can find it in O(N + k log N).

If we use quickselect, which is similar to quicksort, the average running time is O(N) and the worst case is O(N^2).

Steps of quickselect:

1. If |S| = 1, then k = 1 and return the element in S as the answer. If a cutoff for small arrays is used and |S| ≤ Cutoff, then sort S and return the kth smallest element.

2. Pick a pivot element, v ∈ S.

3. Partition S - {v} into S1 and S2, as was done with quicksort.

4. If k ≤ |S1|, return quickselect(S1, k). If k = |S1| + 1, the pivot is the kth smallest element, so return v. Otherwise, return quickselect(S2, k - |S1| - 1).

In contrast to quicksort, quickselect makes only one recursive call instead of two.
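The O(N) average can be sketched with a recurrence: each level does linear partitioning work, but only one recursive call survives, so the work forms a geometric series (an informal sketch, assuming the pivot splits the array roughly in half on average):

```latex
T(N) = T(N/2) + cN
\;\Longrightarrow\;
T(N) \le cN + \frac{cN}{2} + \frac{cN}{4} + \cdots \le 2cN = O(N)
```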

// Find the kth smallest element; since the array starts at index 0,
// it ends up in position k - 1
void Qselect( ElementType A[], int k, int Left, int Right )
{
    int i,j;
    ElementType Pivot;
    if( Left + Cutoff <= Right )
    {
        Pivot = Median3(A, Left, Right);
        i = Left; j = Right - 1;    // Median3 hid the pivot at A[Right - 1]
        for(;;)
        {
            while( A[++i] < Pivot ) {}
            while( A[--j] > Pivot ) {}
            if( i<j )
                Swap(&A[i], &A[j]);
            else
                break;
        }
        Swap( &A[i], &A[Right - 1]);    // restore Pivot
        // the kth smallest element belongs at index k-1; the pivot now sits at index i
        if( k <= i )
            Qselect(A, k, Left, i-1);
        else if( k > i+1 )
            Qselect(A, k, i+1, Right);
        // otherwise k == i + 1 and the pivot itself is the answer
    }
    else   // do insertion sort on subarray
        InsertionSort( A+Left, Right - Left + 1);
}

7.9 A General Lower Bound for Sorting

Any algorithm for sorting that uses only comparisons requires Ω(N log N) comparisons in the worst case, so mergesort and heapsort are optimal to within a constant factor.

7.9.1 Decision Trees

A decision tree is an abstraction used to prove lower bounds.

In our context, a decision tree is a binary tree. Each node (state) represents a set of possible orderings, consistent with the comparisons that have been made, among the elements. The results of the comparisons are the tree edges.



Different algorithms will have different decision trees.

The number of comparisons used by the sorting algorithm in the worst case is equal to the depth of the deepest leaf. The average number of comparisons used is equal to the average depth of the leaves.

Lemma 7.1: Let T be a binary tree of depth d. Then T has at most 2^d leaves.

Lemma 7.2: A binary tree with L leaves must have depth at least ⌈log L⌉. (from Lemma 7.1)

Theorem 7.6: Any sorting algorithm that uses only comparisons between elements requires at least ⌈log(N!)⌉ comparisons in the worst case.

   proof: a decision tree to sort N elements must have N! leaves, since there are N! possible orderings of N distinct elements.

Theorem 7.7: Any sorting algorithm that uses only comparisons between elements requires Ω(N log N) comparisons. (log(N!) = Ω(N log N))
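The bound log(N!) = Ω(N log N) follows because half of the factors of N! are at least N/2:

```latex
\log(N!) = \sum_{i=2}^{N} \log i
\;\ge\; \sum_{i=N/2}^{N} \log\frac{N}{2}
\;\ge\; \frac{N}{2}\,\log\frac{N}{2}
= \Omega(N \log N)
```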













