Sorting Notes

fwu11

Category: Data Structure & Algorithm Notes

Article link: https://blog.csdn.net/weixin_42552135/article/details/83672809

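Sort Algorithms
	Name	Average Time	Worst Time	Best Time	Space Complexity	Stable	Type
1	Bubble Sort	O(n^2)	O(n^2)	O(n)	O(1)	Yes	CBA
2	Insertion Sort	O(n^2)	O(n^2)	O(n)	O(1)	Yes	CBA
3	Selection Sort	O(n^2)	O(n^2)	O(n^2)	O(1)	No	CBA
4	Quick Sort	O(nlogn)	O(n^2)	O(nlogn)	O(logn)	No	CBA
5	Merge Sort	O(nlogn)	O(nlogn)	O(nlogn)	O(n)	Yes	CBA
6	Heap Sort	O(nlogn)	O(nlogn)	O(nlogn)	O(1)	No	CBA
7	Counting Sort	O(n+k)	O(n+k)	O(n+k)	O(n+k)	Yes	Non-CBA
8	Bucket Sort	O(n+k)	O(n^2)	O(n+k)	O(n+k)	Yes	Non-CBA
9	Radix Sort	O(d(n+k))	O(d(n+k))	O(d(n+k))	O(n+k)	Yes	Non-CBA

CBA = comparison-based algorithm. n is the number of elements. k is the range of key values (counting sort), the number of buckets (bucket sort), or the radix (radix sort). d is the number of digits in the longest key. The worst case for bucket sort assumes insertion sort inside each bucket.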

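Sorting Algorithms

Internal vs. external sorting. Internal sorting handles data sets small enough to fit in memory. External sorting handles data sets so large that they must reside in external or even distributed storage; at any moment during the sort, memory holds only a small part of the data.

Offline vs. online algorithms. In the offline case, the data to be sorted is given as a whole, as a batch. In settings such as network computing, the data is typically generated in real time and keeps arriving after the sorting algorithm has started.

Comparison-based algorithms (CBA). Every possible execution of such an algorithm can be represented as a comparison tree. Most algorithms other than hashing belong to this class. The lower bound on complexity equals the minimum height of the comparison tree, which depends on the number of leaves (possible outputs). In the worst case, any comparison-based sort needs Ω(nlogn) comparisons, where n is the number of elements to sort.

Counting sort, bucket sort, and radix sort do not belong to this class.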
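As a rough sketch of where the nlogn lower bound comes from, assume the n keys are distinct, so the comparison tree must have at least n! leaves (one per possible ordering). A binary tree of height h has at most 2^h leaves, which gives:

```latex
\begin{align*}
2^{h} &\ge n! \\
h &\ge \log_2 (n!) \\
  &\ge \log_2\!\left( \left(\tfrac{n}{2}\right)^{n/2} \right)
   = \tfrac{n}{2} \log_2 \tfrac{n}{2}
   = \Omega(n \log n)
\end{align*}
```

The middle step uses the fact that the largest n/2 factors of n! are each at least n/2.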

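Stability: how the algorithm treats duplicate elements. A sort is stable if duplicate elements keep the same relative order after sorting as they had before.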
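A small illustration of stability, using std::stable_sort from the standard library as a stand-in for any stable algorithm. The Record type and the function name sortByKeyStable are made up for this sketch; only the integer key is compared, and the string label lets us see which duplicate came first.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Records are (key, label); only the key takes part in comparisons.
using Record = std::pair<int, std::string>;

// Sort by key only. std::stable_sort guarantees that records with equal
// keys keep their original relative order.
std::vector<Record> sortByKeyStable(std::vector<Record> records) {
    std::stable_sort(records.begin(), records.end(),
                     [](const Record& a, const Record& b) {
                         return a.first < b.first;
                     });
    return records;
}
```

For the input (2,a), (1,b), (2,c), (1,d), a stable sort has to produce (1,b), (1,d), (2,a), (2,c): b stays ahead of d and a stays ahead of c.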

1. Bubble Sort

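A sequence is sorted when A[i-1] <= A[i] holds for every adjacent pair, so the algorithm works from local order to global order. In each pass, a single scan-and-swap sweep is guaranteed to move the largest remaining element into its final position at the end.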

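#include <algorithm>
#include <utility>
#include <vector>
using namespace std; // the snippets in these notes rely on these headers

void bubbleSort(vector<int>& A){
    int n = A.size();
    // at most n passes
    for(int i = 0; i < n; i++){
        bool swapped = false;
        // one scan-and-swap pass; afterwards the largest remaining element is in place
        for(int j = 1; j < n-i; j++){
            if(A[j-1] > A[j]){
                swap(A[j-1],A[j]); // swap to put the pair in local order
                swapped = true;
            }
        }
        // a pass without swaps means the array is sorted; this gives the O(n) best case
        if(!swapped) break;
    }
}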

2. Insertion Sort

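It works like arranging a hand of playing cards.

For the current position r, the prefix [0, r) is sorted and the suffix [r, n) is unsorted. Each step moves the first element of the suffix into its correct place in the prefix.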

void insertionSort(vector<int>& arr){ // by reference, so the caller sees the sorted result
    int n = arr.size();
    int i, key, j; 
    for (i = 1; i < n; i++){ 
        key = arr[i]; 
        j = i-1; 
  
        /* Move elements of arr[0..i-1], that are 
          greater than key, to one position right to
          their current position */
        while (j >= 0 && arr[j] > key){ 
            arr[j+1] = arr[j]; 
            j--; 
        } 
        arr[j+1] = key; 
    } 
}

Boundary cases: insertion sort takes the most time, O(n^2), when the elements are in reverse order, and the least time, O(n), when they are already sorted.

Online: Yes

Uses: insertion sort is a good choice when the number of elements is small. It is also useful when the input is almost sorted, with only a few elements misplaced in a large array.
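To make the "online" property concrete, here is a minimal sketch of feeding values into a sorted vector one at a time as they arrive. The function name insertOnline is hypothetical, and the insertion point is found with std::upper_bound (binary search) rather than the linear scan used in insertionSort above; the shifting of elements is still O(n) per insertion.

```cpp
#include <algorithm>
#include <vector>

// Insert one newly arrived value into an already sorted vector.
// upper_bound places it after any equal elements, so arrival order
// among duplicates is preserved.
void insertOnline(std::vector<int>& sorted, int value) {
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);
}
```

Calling insertOnline for each element of a stream keeps the vector sorted at every moment, without needing the whole input up front.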

3. Selection Sort

The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) from unsorted part and putting it at the beginning. The algorithm maintains two subarrays in a given array.

1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.

In every iteration of selection sort, the minimum element (considering ascending order) from the unsorted subarray is picked and moved to the sorted subarray.

void selectionSort(vector<int>& arr){ // by reference, so the caller sees the sorted result
    int n = arr.size();
    int i, j, min_idx; 
  
    // One by one move boundary of unsorted subarray 
    for (i = 0; i < n-1; i++){ 
        // Find the minimum element in unsorted array 
        min_idx = i; 
        for (j = i+1; j < n; j++) 
          if (arr[j] < arr[min_idx]) 
            min_idx = j; 
  
        // Swap the found minimum element with the first element 
        swap(arr[min_idx], arr[i]); 
    } 
}

The good thing about selection sort is it never makes more than O(n) swaps and can be useful when memory write is a costly operation.
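The bound on swaps can be checked with an instrumented variant. This is only a sketch: the name selectionSortCountSwaps is made up, and skipping the swap when the minimum is already in place is an optional tweak that is not in the version above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort that also reports how many swaps it performed.
// There is at most one swap per outer iteration, so at most n-1 in total.
std::size_t selectionSortCountSwaps(std::vector<int>& arr) {
    std::size_t swaps = 0;
    for (std::size_t i = 0; i + 1 < arr.size(); ++i) {
        std::size_t minIdx = i;
        for (std::size_t j = i + 1; j < arr.size(); ++j)
            if (arr[j] < arr[minIdx]) minIdx = j;
        if (minIdx != i) {  // optional: skip the no-op swap
            std::swap(arr[i], arr[minIdx]);
            ++swaps;
        }
    }
    return swaps;
}
```

The number of comparisons is still about n^2/2 regardless of the input; only the number of writes is small.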

4. Quick Sort

void quickSort(vector<int> &A, int left, int right){
    if(left >= right){
        return;
    }

    // partition
    int start = left;
    int end = right;
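    int pivot = A[left + (right - left) / 2]; // middle element as pivot; avoids overflow of left+right
    
    // using start <= end (not start < end) makes the two pointers cross completely,
    // so [left, end] and [start, right] never overlap and each recursive call shrinks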
    while(start <= end){
        while(start <= end && A[start] < pivot){
            start++;
        }
        while(start <= end && A[end] > pivot){
            end--;
        }
        if(start <= end){
            swap(A[start++],A[end--]);   
        }
    }
    
    //divide and conquer
    quickSort(A, left, end); 
    quickSort(A, start, right); 
}

5. Merge Sort

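Merge sort is built by repeatedly calling the 2-way merge algorithm, which combines two sorted sequences into one sorted sequence. 2-way merge is iterative: each step compares only the first elements of the two sequences being merged, removes the smaller one, and appends it to the end of the output. Its successor in the original sequence then becomes the new first element. This continues until one sequence is empty, at which point the rest of the other sequence is appended to the output as a whole.

Merge sort itself uses divide and conquer: recursively split the unsorted sequence, then merge the sorted pieces level by level. Each 2-way merge costs O(n). By the master theorem, T(n) = 2T(n/2) + O(n), so the running time of merge sort is O(nlogn) in every case. Merge sort works on both arrays and linked lists.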

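// Merge the sorted ranges A[start..mid] and A[mid+1..end]; tmp must be at least as large as A.
// Defined before mergeSort so that the call below resolves to this function.
void merge(vector<int> &A, int start, int mid, int end, vector<int>& tmp){
    int l = start;
    int r = mid + 1;
    int index = start;
        
    while(l <= mid && r <= end){
        // <= takes from the left half on ties, which keeps the sort stable
        if(A[l] <= A[r]){
            tmp[index++] = A[l++];
        }else{
            tmp[index++] = A[r++];
        }
    }
        
    while(l <= mid) tmp[index++] = A[l++]; 
    while(r <= end) tmp[index++] = A[r++]; 
    for (index = start; index <= end; index++) {
        A[index] = tmp[index];
    }
}

// Sort A[l..r]; call as mergeSort(A, 0, (int)A.size() - 1, tmp) with tmp.size() == A.size()
void mergeSort(vector<int> &A, int l, int r, vector<int>& tmp){
    if(l >= r) return;
    int mid = l+(r-l)/2;
    mergeSort(A,l,mid,tmp);
    mergeSort(A,mid+1,r,tmp);
    merge(A,l,mid,r,tmp);
}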
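Since merging also applies to linked lists, here is a hedged sketch of the 2-way merge step on std::list. The function name mergeLists is hypothetical. It relinks nodes with splice instead of copying into a temporary array, which is why the list version needs no O(n) buffer. (The standard library already offers std::list::merge; this version just spells out the steps.)

```cpp
#include <list>

// Merge sorted list b into sorted list a by relinking nodes.
// On ties, elements of a stay ahead of elements of b, so the merge is stable.
// Afterwards b is empty.
void mergeLists(std::list<int>& a, std::list<int>& b) {
    auto it = a.begin();
    while (!b.empty()) {
        // skip every element of a that should come before b's first element
        while (it != a.end() && *it <= b.front()) ++it;
        if (it == a.end()) {
            a.splice(a.end(), b);  // append whatever is left of b
            break;
        }
        a.splice(it, b, b.begin());  // move b's first node in front of *it
    }
}
```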

6. Heap Sort

Heap sort is a comparison-based sorting technique built on the binary heap data structure. It is similar to selection sort: we first find the maximum element and place it at the end, then repeat the same process for the remaining elements.

A binary heap is a complete binary tree whose items are stored in a special order: the value in a parent node is greater than or equal to (or less than or equal to) the values in its two children. The former is called a max heap and the latter a min heap. The heap can be represented by a binary tree or an array.

Why an array-based representation for a binary heap?
Since a binary heap is a complete binary tree, it can easily be stored in an array, and the array-based representation is space efficient. If a node is stored at index i, its left child is at 2 * i + 1 and its right child is at 2 * i + 2 (assuming indexing starts at 0), and its parent is at (i - 1) / 2 using integer division.
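The index arithmetic can be written down as three small helpers. The names leftChild, rightChild and parent are chosen for this sketch, and the checks use a made-up 5-element max heap [9, 7, 8, 3, 5].

```cpp
#include <cstddef>

// 0-based index arithmetic for a binary heap stored in an array.
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 2; }
// Only meaningful for i > 0; the root has no parent.
constexpr std::size_t parent(std::size_t i)     { return (i - 1) / 2; }

// In the heap [9, 7, 8, 3, 5], index 1 (value 7) has its children at
// indices 3 and 4 (values 3 and 5), and both of them point back to index 1.
static_assert(leftChild(1) == 3 && rightChild(1) == 4, "children of node 1");
static_assert(parent(3) == 1 && parent(4) == 1 && parent(2) == 0, "parents");
```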

Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of tree.
3. Repeat above steps while size of heap is greater than 1.

// To heapify a subtree rooted with node i which is an index in A[]. 
// n is size of heap 
void heapify(vector<int>& A, int n, int i){
    // sift down iteratively until the heap property holds or a leaf is reached
    while(i < n){
        int largest = i; // Initialize largest as root 
        int l = 2*i + 1; // left = 2*i + 1 
        int r = 2*i + 2; // right = 2*i + 2 
  
        // If left child is larger than root 
        if (l < n && A[l] > A[largest]) 
            largest = l; 
  
        // If right child is larger than largest so far 
        if (r < n && A[r] > A[largest]) 
            largest = r; 
  
        // If largest is the root, this subtree is already a heap
        if (largest == i) break;

        swap(A[i], A[largest]); 
        i = largest;
    }
} 

// main function to do heap sort 
void heapSort(vector<int> &A){
    int n = A.size();

    // Build heap (rearrange array) 
    for (int i = n / 2 - 1; i >= 0; i--) 
        heapify(A, n, i); 
  
    // One by one extract an element from heap 
    for (int i = n-1; i > 0; i--){ 
        // Move current root to end 
        swap(A[0], A[i]); 
  
        // call max heapify on the reduced heap 
        heapify(A, i, 0); 
    } 
}

Time Complexity: heapify takes O(logn). Building the heap (the first loop in heapSort) takes O(n), and the overall time complexity of heap sort is O(nlogn).
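For comparison, the same two phases can be expressed with the standard library's heap functions. This is a sketch rather than a replacement for the hand-written version; the wrapper name heapSortStl is made up, while std::make_heap and std::sort_heap are standard <algorithm> functions.

```cpp
#include <algorithm>
#include <vector>

// Heap sort expressed with the standard heap algorithms.
void heapSortStl(std::vector<int>& a) {
    std::make_heap(a.begin(), a.end());  // phase 1: build a max heap in O(n)
    std::sort_heap(a.begin(), a.end());  // phase 2: repeatedly move the max to the end
}
```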

7. Counting Sort

Counting sort assumes that each of the n input elements is an integer in the range 0 to k, for some integer k. Counting sort determines, for each input element x, the number of elements less than x. It uses this information to place element x directly into its position in the output array. Counting sort is only suitable for arrays of non-negative integers whose values span a small range.

vector<int> countSort(const vector<int>& arr, int k) { 
    // every element of arr must lie in [0, k]
    int n = arr.size();
    vector<int> output(n); // must be sized up front; it is written by index below
    vector<int> count(k+1,0); 
  
    // Store count of each integer
    for (int i = 0; i < n; ++i) 
        ++count[arr[i]]; 
  
    // Change count[i] so that count[i] now contains the number of elements <= i, 
    // i.e. one past the last position of value i in the output array 
    for (int i = 1; i <= k; i++) 
        count[i] += count[i-1]; 
  
    // Build the output array from the back to achieve a stable sort
    for (int i = n - 1; i >= 0; i--){ 
        output[count[arr[i]]-1] = arr[i]; 
        count[arr[i]]--; 
    } 

    return output;
}

Time Complexity: O(n+k) where n is the number of elements in input array and k is the range of input.
Auxiliary Space: O(n+k)
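The restriction to the range 0 to k can be relaxed by shifting keys. The following is a sketch under the assumption that the values still span a small range [lo, hi], possibly including negative numbers; the function name countSortRange is made up, and here k corresponds to hi - lo.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counting sort for integers in an arbitrary small range [lo, hi].
// lo and hi are found by scanning; each key is shifted by lo to index count[].
// Assumes hi - lo is small enough to allocate and does not overflow int.
std::vector<int> countSortRange(const std::vector<int>& arr) {
    if (arr.empty()) return {};
    auto mm = std::minmax_element(arr.begin(), arr.end());
    const int lo = *mm.first;
    const int hi = *mm.second;

    std::vector<std::size_t> count(static_cast<std::size_t>(hi - lo) + 1, 0);
    for (int x : arr) ++count[x - lo];
    // prefix sums: count[v] becomes the number of elements with key <= v + lo
    for (std::size_t v = 1; v < count.size(); ++v) count[v] += count[v - 1];

    std::vector<int> output(arr.size());
    // walk the input backwards so equal keys keep their order (stable)
    for (std::size_t i = arr.size(); i-- > 0;) {
        output[--count[arr[i] - lo]] = arr[i];
    }
    return output;
}
```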

8. Bucket Sort

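Its significance is that, for the specific problem where the value type and range are restricted, it can beat the worst-case lower bound on running time for CBA-style sorting algorithms.

It relies on call-by-rank, that is, access by index.

Distribute the array into a finite number of buckets, then sort each bucket separately (possibly with a different sorting algorithm, or by applying bucket sort recursively), and finally concatenate the contents of the buckets in order.

When extra space is plentiful, use as many buckets as possible. In the limit where each bucket holds a single element, or a single distinct value, the in-bucket sorting is avoided entirely and bucket sort reaches its best-case time complexity of O(n).

However, if the data is distributed very unevenly across the buckets, with some buckets holding many elements and others very few, the time complexity degrades to O(nlogn) when the buckets are sorted with a comparison sort (or O(n^2) with insertion sort).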

Bucket sort divides the interval [0,1) into n equal-sized subintervals, or buckets, and then distributes the n input numbers into the buckets. Since the inputs are uniformly and independently distributed over [0,1), we do not expect many numbers to fall into each bucket. To produce the output, we simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.

The most common variant of bucket sort operates on a list of n numeric inputs between zero and some maximum value M and divides the value range into n buckets each of size M/n. If each bucket is sorted using insertion sort, the sort can be shown to run in expected linear time (where the average is taken over all possible inputs). However, the performance of this sort degrades with clustering; if many values occur close together, they will all fall into a single bucket and be sorted slowly. This performance degradation is avoided in the original bucket sort algorithm by assuming that the input is generated by a random process that distributes elements uniformly over the interval [0,1).

bucketSort(arr[], n)
1) Create n empty buckets (or lists).
2) Do the following for every array element arr[i]:
   a) Insert arr[i] into bucket[n*arr[i]]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.

// Function to sort arr[] using bucket sort; every element must lie in [0, 1) 
void bucketSort(vector<double>& arr) 
{ 
    int n = arr.size();

    // 1) Create n empty buckets (a vector of vectors; vector<double> b[n] is not standard C++) 
    vector<vector<double>> b(n); 
     
    // 2) Put array elements in different buckets 
    for (int i=0; i<n; i++) 
    { 
       int bi = n*arr[i]; // Index in bucket 
       b[bi].push_back(arr[i]); 
    } 
  
    // 3) Sort individual buckets (std::sort here instead of insertion sort) 
    for (int i=0; i<n; i++) 
       sort(b[i].begin(), b[i].end()); 
  
    // 4) Concatenate all buckets into arr[] 
    int index = 0; 
    for (int i = 0; i < n; i++) 
        for (size_t j = 0; j < b[i].size(); j++) 
          arr[index++] = b[i][j]; 
}

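Time Complexity: O(n) on average when the inputs are uniformly distributed over [0,1) and n buckets are used; more generally O(n+k) on average with k buckets. In the worst case most elements fall into one bucket and the cost is that of sorting that bucket: O(nlogn) with std::sort as above, or O(n^2) with insertion sort.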
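The variant described earlier for inputs between zero and a maximum value M can be sketched as follows. This assumes non-negative integers, takes M from the data itself, and uses n buckets; the function name bucketSortInts and the use of std::sort inside each bucket are choices made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bucket sort for non-negative integers in [0, maxValue].
// Uses n buckets; value v goes to bucket floor(v * n / (maxValue + 1)),
// which is always in [0, n).
void bucketSortInts(std::vector<int>& arr) {
    const std::size_t n = arr.size();
    if (n < 2) return;
    const long long maxValue = *std::max_element(arr.begin(), arr.end());

    std::vector<std::vector<int>> buckets(n);
    for (int v : arr) {
        // 64-bit arithmetic so that v * n cannot overflow int
        const long long bi =
            static_cast<long long>(v) * static_cast<long long>(n) / (maxValue + 1);
        buckets[static_cast<std::size_t>(bi)].push_back(v);
    }

    std::size_t index = 0;
    for (auto& bucket : buckets) {
        std::sort(bucket.begin(), bucket.end());
        for (int v : bucket) arr[index++] = v;
    }
}
```

If the values cluster (for example, all but one are tiny compared to the maximum), almost everything lands in the first bucket and the running time is dominated by that single in-bucket sort.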

9. Radix Sort

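The key need not be an integer. A key can be composed of several fields, with some ordering (such as lexicographic order) determining how keys compare. Suppose a key consists of t fields {k_t, k_t-1, ..., k_1} with different priorities. It suffices to run one stable pass of bucket sort on each field, in order of increasing priority, to sort by the lexicographic order of the whole key.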
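As an illustration of multi-field keys, the sketch below sorts dates by running one stable bucket pass per field, lowest priority first (day, then month, then year). The Date struct, the function names bucketPass and sortDates, and the field bounds are all assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// A key with three fields; year has the highest priority, day the lowest.
struct Date { int year, month, day; };

// One stable bucket pass: distribute by field value, then concatenate in order.
// getField must return a value in [0, numBuckets).
template <typename GetField>
void bucketPass(std::vector<Date>& dates, std::size_t numBuckets, GetField getField) {
    std::vector<std::vector<Date>> buckets(numBuckets);
    for (const Date& d : dates)
        buckets[static_cast<std::size_t>(getField(d))].push_back(d);  // input order kept inside a bucket
    std::size_t i = 0;
    for (const auto& bucket : buckets)
        for (const Date& d : bucket) dates[i++] = d;
}

// Assumes 1 <= day <= 31, 1 <= month <= 12, 0 <= year < 3000 (illustrative bounds).
void sortDates(std::vector<Date>& dates) {
    bucketPass(dates, 32,   [](const Date& d) { return d.day; });    // lowest priority first
    bucketPass(dates, 13,   [](const Date& d) { return d.month; });
    bucketPass(dates, 3000, [](const Date& d) { return d.year; });   // highest priority last
}
```

The approach only works because each pass is stable: the final pass on year leaves dates within the same year in the month-and-day order established by the earlier passes.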

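Radix sort is a non-comparison integer sorting algorithm. It splits each number into its digits and then processes each digit position in turn.
Suppose we need to sort one million mobile phone numbers. Which algorithm should we choose? The fast comparison sorts, merge sort and quick sort, take O(nlogn). Counting sort and bucket sort are faster still, but a phone number has 11 digits, so how many buckets would that take?

The idea is actually simple: no matter how large the numbers are, sort them one digit at a time, and the digits 0-9 need at most ten buckets. Sort by the least significant digit first, then by each more significant digit in turn.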

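void radix_sort(vector<int>& arr){
    // assumes every element is non-negative
    if(arr.empty()) return;
    int n = arr.size();
    // maximum value
    int maxVal = *max_element(arr.begin(),arr.end());
    // place value of the digit being sorted, starting from the rightmost digit;
    // long long so that it cannot overflow for large int values
    long long dd = 1;
    // list of 10 buckets, one for each digit 0-9
    vector<vector<int>> bucketList(10,vector<int>());

    // stop once dd exceeds the maximum value: every digit has been processed
    while(maxVal >= dd){
        // distribute the data into the buckets
        for(int i = 0; i < n; i++){
            // extract the current digit and put the number into the matching bucket
            int number = ((arr[i] / dd) % 10);
            bucketList[number].push_back(arr[i]);
        }
        // write back to the array
        int nn = 0;
        for (int i = 0; i < 10; i++){
            int size = bucketList[i].size();
            for(int j = 0; j < size; j ++){
                arr[nn++] = bucketList[i][j];
            }
            // empty the bucket
            bucketList[i].clear();
        }
        dd *= 10;
    }
}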
