A Deep Dive into Arrays.sort() (Part 1)

peterLC

Categories: Summary, Java. Tags: sorting algorithms, algorithms, data structures, source code, sort

Article link: https://blog.csdn.net/passport404/article/details/122021305

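Usage

sort(Object[] a): sorts the specified object array into ascending order according to the natural ordering of its elements (the primitive overloads, such as sort(int[] a), sort into ascending numerical order).
sort(Object[] a, int fromIndex, int toIndex): sorts the specified range of the array, from fromIndex (inclusive) to toIndex (exclusive), into ascending order.
sort(T[] a, Comparator<? super T> c): sorts the specified object array according to the order induced by the specified comparator.
sort(T[] a, int fromIndex, int toIndex, Comparator<? super T> c): sorts the specified range of the specified object array according to the order induced by the specified comparator.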
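The four overloads listed above can be tried side by side. The following is a minimal sketch; the class name SortUsageDemo and its helper methods are made up for illustration, and each helper sorts a copy so the input stays unchanged.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortUsageDemo {
    // Whole array, natural (ascending) order.
    static Integer[] natural(Integer[] src) {
        Integer[] a = src.clone();
        Arrays.sort(a);
        return a;
    }

    // Only the range [from, to), natural order.
    static Integer[] rangeNatural(Integer[] src, int from, int to) {
        Integer[] a = src.clone();
        Arrays.sort(a, from, to);
        return a;
    }

    // Whole array, order defined by a comparator (here: descending).
    static Integer[] descending(Integer[] src) {
        Integer[] a = src.clone();
        Arrays.sort(a, Comparator.reverseOrder());
        return a;
    }

    // Only the range [from, to), order defined by a comparator.
    static Integer[] rangeDescending(Integer[] src, int from, int to) {
        Integer[] a = src.clone();
        Arrays.sort(a, from, to, Comparator.reverseOrder());
        return a;
    }

    public static void main(String[] args) {
        Integer[] data = {5, 3, 9, 1, 7};
        System.out.println(Arrays.toString(natural(data)));               // [1, 3, 5, 7, 9]
        System.out.println(Arrays.toString(rangeNatural(data, 1, 4)));    // [5, 1, 3, 9, 7]
        System.out.println(Arrays.toString(descending(data)));            // [9, 7, 5, 3, 1]
        System.out.println(Arrays.toString(rangeDescending(data, 1, 4))); // [5, 9, 3, 1, 7]
    }
}
```

Note that toIndex is exclusive, so the range 1 to 4 covers indices 1, 2, and 3 only.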

I recently came across a problem that used Arrays.sort and became curious about its source code.

 public static <T> void sort(T[] a, Comparator<? super T> c) {
        if (c == null) {
            sort(a);
        } else {
            if (LegacyMergeSort.userRequested)
                legacyMergeSort(a, c);
            else
                TimSort.sort(a, 0, a.length, c, null, 0, 0);
        }
    }

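Stepping into the method shows the code above. Its Javadoc, reproduced below, says that all elements of the array must be mutually comparable and that the sort is stable: equal elements are not reordered. Note that the amount of work depends on the shape of the input. The implementation is a stable, adaptive, iterative mergesort that needs far fewer than n lg(n) comparisons when the array is already partially sorted, and performs like a traditional mergesort when the array is randomly ordered. I won't go through the rest here; read on below if you are interested.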

Sorts the specified array of objects according to the order induced by the specified comparator. All elements in the array must be mutually comparable by the specified comparator (that is, c.compare(e1, e2) must not throw a ClassCastException for any elements e1 and e2 in the array).
This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.
Implementation note: This implementation is a stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered. If the input array is nearly sorted, the implementation requires approximately n comparisons. Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays.
The implementation takes equal advantage of ascending and descending order in its input array, and can take advantage of ascending and descending order in different parts of the same input array. It is well-suited to merging two or more sorted arrays: simply concatenate the arrays and sort the resulting array.
The implementation was adapted from Tim Peters’s list sort for Python ( TimSort ). It uses techniques from Peter McIlroy’s “Optimistic Sorting and Information Theoretic Complexity”, in Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pp 467-474, January 1993.
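The stability guarantee quoted above is easy to observe. Here is a small sketch (the class and method names are invented for this example) that sorts strings by length only, so strings of equal length compare as equal and must keep their original relative order.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableSortDemo {
    // Sorts a copy of the input by string length only.
    // Strings of the same length are "equal" to the comparator,
    // so a stable sort must leave them in their original order.
    static String[] sortByLength(String[] src) {
        String[] a = src.clone();
        Arrays.sort(a, Comparator.comparingInt(String::length));
        return a;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "kiwi", "apple", "plum", "yam"};
        // "fig" stays before "yam"; "pear", "kiwi", "plum" keep their order.
        System.out.println(Arrays.toString(sortByLength(words)));
        // [fig, yam, pear, kiwi, plum, apple]
    }
}
```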

If no comparator is supplied, the following method is called:

public static void sort(Object[] a) {
        if (LegacyMergeSort.userRequested) // Checks whether the legacy merge sort was requested (since JDK 1.7 the default sort is TimSort).
            legacyMergeSort(a);
        else
            ComparableTimSort.sort(a, 0, a.length, null, 0, 0);
    }

Legacy merge sort

private static void mergeSort(Object[] src,
                                  Object[] dest,
                                  int low,
                                  int high,
                                  int off) {
        int length = high - low;

        // Insertion sort on smallest arrays
        // This is an optimization of merge sort: insertion sort is more efficient on small arrays. INSERTIONSORT_THRESHOLD is 7, so ranges shorter than 7 are sorted by insertion sort.
        if (length < INSERTIONSORT_THRESHOLD) {
            for (int i=low; i<high; i++)
                for (int j=i; j>low &&
                         ((Comparable) dest[j-1]).compareTo(dest[j])>0; j--)
                    swap(dest, j, j-1);
            return;
        }

        // Recursively sort halves of dest into src
        int destLow  = low;
        int destHigh = high;
        low  += off;
        high += off;
        int mid = (low + high) >>> 1;
        mergeSort(dest, src, low, mid, -off);
        mergeSort(dest, src, mid, high, -off);

        // If list is already sorted, just copy from src to dest.  This is an
        // optimization that results in faster sorts for nearly ordered lists.
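        // In other words, this is the second optimization: if the last element of the lower half is no greater than the first element of the upper half, the range is already sorted and is copied straight into dest.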
        if (((Comparable)src[mid-1]).compareTo(src[mid]) <= 0) {
            System.arraycopy(src, low, dest, destLow, length);
            return;
        }

        // Merge sorted halves (now in src) into dest
        for(int i = destLow, p = low, q = mid; i < destHigh; i++) {
            if (q >= high || p < mid && ((Comparable)src[p]).compareTo(src[q])<=0)
                dest[i] = src[p++];
            else
                dest[i] = src[q++];
        }
    }

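When the legacy merge sort is not requested, TimSort is used. It is a highly efficient algorithm: it splits the array into many runs (stretches that are already in order), then merges those runs until the whole array is sorted. For arrays of 32 or more elements, runs that are too short are extended to a minimum length, minRun, which lies between 16 and 32.
Optimizations:

When the array has fewer than 32 elements, binary insertion sort is used directly, with no merging.
The array is split into runs; each run's start index and length are pushed onto a stack, and runs on the stack are then merged.
When looking for a run, the algorithm first counts how many elements are already in order, then applies binary insertion sort starting at the first out-of-order element, which speeds up sorting.
Before two runs are merged, the head and tail are trimmed so that elements already in their final positions are skipped, which speeds up the merge.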
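To make the 16-to-32 range for minRun concrete, here is a standalone sketch of how the minimum run length can be computed. It follows my understanding of the private minRunLength helper in the JDK's TimSort; the class name MinRunSketch is made up, and the code should be read as an illustration rather than a copy of the JDK source.

```java
class MinRunSketch {
    // Assumed to match the JDK constant: arrays shorter than this
    // are sorted by binary insertion sort alone.
    static final int MIN_MERGE = 32;

    // Keeps the top bits of n and adds 1 if any lower bit that was
    // shifted off was set. For n >= MIN_MERGE the result is in [16, 32].
    static int minRunLength(int n) {
        int r = 0; // becomes 1 if any 1 bit is shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        System.out.println(minRunLength(31));   // 31: below MIN_MERGE, returned unchanged
        System.out.println(minRunLength(64));   // 16: exact power of two
        System.out.println(minRunLength(65));   // 17
        System.out.println(minRunLength(100));  // 25
        System.out.println(minRunLength(1000)); // 32
    }
}
```

The idea behind this choice is that the array length divided by minRun ends up equal to, or slightly less than, a power of two, which keeps the later merges balanced.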

/*
a – the array to be sorted
lo – the index of the first element, inclusive, to be sorted
hi – the index of the last element, exclusive, to be sorted
work – a workspace array (slice)
workBase – origin of usable space in work array
workLen – usable size of work array
*/

static void sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) {
        assert a != null && lo >= 0 && lo <= hi && hi <= a.length;

        int nRemaining  = hi - lo;
        // Fewer than 2 elements: nothing to sort
        if (nRemaining < 2)
            return;  // Arrays of size 0 and 1 are always sorted

        // If array is small, do a "mini-TimSort" with no merges
        if (nRemaining < MIN_MERGE) {
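            // MIN_MERGE is 32; below it, binary insertion sort is used. countRunAndMakeAscending returns the length of the sorted run starting at lo, which binarySort uses to skip that prefix.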
            int initRunLen = countRunAndMakeAscending(a, lo, hi);
            // Start the binary insertion sort at the first element after the sorted prefix, which saves work.
            binarySort(a, lo, hi, lo + initRunLen);
            return;
        }

        /**
         * March over the array once, left to right, finding natural runs,
         * extending short natural runs to minRun elements, and merging runs
         * to maintain stack invariant.
         */
        // Start the full TimSort
        ComparableTimSort ts = new ComparableTimSort(a, work, workBase, workLen);
        // Compute minRun, the minimum run length: 16 if nRemaining is an exact power of 2, otherwise a value between 16 and 32
        int minRun = minRunLength(nRemaining);
        do {
            // Identify next run
            // Count how many elements starting at lo are already in order
            int runLen = countRunAndMakeAscending(a, lo, hi);

            // If run is short, extend to min(minRun, nRemaining)
            if (runLen < minRun) {
                int force = nRemaining <= minRun ? nRemaining : minRun;
                binarySort(a, lo, lo + force, lo + runLen);
                runLen = force;
            }

          
            // Push this run onto the run stack, then merge runs as needed to restore the stack invariant
            ts.pushRun(lo, runLen);
            ts.mergeCollapse();

            // Advance to find next run
            lo += runLen;
            nRemaining -= runLen;
        } while (nRemaining != 0);

        // Merge all remaining runs to complete sort
        assert lo == hi;
        ts.mergeForceCollapse();
        assert ts.stackSize == 1;
    }
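The sort method above calls binarySort(a, lo, hi, start), whose body is not shown in this article. The sketch below illustrates the idea on an int array: the range [lo, start) is assumed to be sorted already, and each later element is placed by binary search. It is a simplified stand-in written for this explanation, not the JDK implementation, and the class name is hypothetical.

```java
import java.util.Arrays;

class BinaryInsertionSortSketch {
    // Sorts a[lo, hi) given that a[lo, start) is already sorted.
    static void binarySort(int[] a, int lo, int hi, int start) {
        if (start == lo) {
            start++; // a single element is trivially sorted
        }
        for (; start < hi; start++) {
            int pivot = a[start];

            // Binary search for the insertion point in a[lo, start).
            // Going right on ties keeps equal elements in order (stable).
            int left = lo;
            int right = start;
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (pivot < a[mid]) {
                    right = mid;
                } else {
                    left = mid + 1;
                }
            }

            // Shift a[left, start) one slot to the right and drop the pivot in.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 6, 9, 3, 2, 8}; // first 4 elements already sorted
        binarySort(a, 0, a.length, 4);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 6, 8, 9]
    }
}
```

Passing the length of the already-sorted prefix as start is what lets the caller reuse the work done by countRunAndMakeAscending.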

private static int countRunAndMakeAscending(Object[] a, int lo, int hi) {
        assert lo < hi;
        int runHi = lo + 1;
        if (runHi == hi)
            return 1;

        // Find the length of the run in a that starts at lo
        if (((Comparable) a[runHi++]).compareTo(a[lo]) < 0) { // Descending
            while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) < 0)
                runHi++;
            // Reverse the descending run in place so that it becomes ascending
            reverseRange(a, lo, runHi);
        } else {                              // Ascending
            while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) >= 0)
                runHi++;
        }

        return runHi - lo;
    }
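A few concrete inputs make the behavior of countRunAndMakeAscending easier to see. The sketch below ports the method to int arrays (the JDK version works on Comparable objects); the class name and the reverseRange helper are written out here for illustration.

```java
import java.util.Arrays;

class RunDetectionSketch {
    // Returns the length of the run starting at lo. A strictly descending
    // run is reversed in place, so the run is ascending on return.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi) {
            return 1;
        }
        if (a[runHi++] < a[lo]) { // strictly descending
            while (runHi < hi && a[runHi] < a[runHi - 1]) {
                runHi++;
            }
            reverseRange(a, lo, runHi);
        } else {                  // ascending (equal neighbors allowed)
            while (runHi < hi && a[runHi] >= a[runHi - 1]) {
                runHi++;
            }
        }
        return runHi - lo;
    }

    // Reverses a[lo, hi) in place.
    static void reverseRange(int[] a, int lo, int hi) {
        hi--;
        while (lo < hi) {
            int t = a[lo];
            a[lo++] = a[hi];
            a[hi--] = t;
        }
    }

    public static void main(String[] args) {
        int[] desc = {5, 4, 3, 2, 6, 7};
        System.out.println(countRunAndMakeAscending(desc, 0, desc.length)); // 4
        System.out.println(Arrays.toString(desc)); // [2, 3, 4, 5, 6, 7]

        int[] asc = {1, 2, 2, 5, 3};
        System.out.println(countRunAndMakeAscending(asc, 0, asc.length));   // 4
    }
}
```

A descending run must be strictly descending. If equal elements were allowed in it, reversing the run would swap their order and break stability.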

private void mergeCollapse() {
        // With fewer than 2 runs on the stack there is nothing to merge
        while (stackSize > 1) {
            int n = stackSize - 2; // index of the second-from-top run
            if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
                // If the third-from-top run is shorter than the top run, merge the third- and second-from-top runs instead
                if (runLen[n - 1] < runLen[n + 1])
                    n--;
                mergeAt(n);
            } else if (runLen[n] <= runLen[n + 1]) { // the second-from-top run is no longer than the top run: merge these two
                mergeAt(n);
            } else { 
                break; // Invariant is established
            }
        }
    }
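The effect of mergeCollapse is easiest to follow by watching only the run lengths. The sketch below is a toy simulation: it keeps the same branching as the method above but tracks lengths alone, with no array data and no real merging. The class MergeCollapseSketch, the fixed stack capacity, and the simulate helper are assumptions made for this example.

```java
import java.util.Arrays;

class MergeCollapseSketch {
    private final int[] runLen = new int[64]; // capacity chosen arbitrarily for the demo
    private int stackSize = 0;

    void pushRun(int len) {
        runLen[stackSize++] = len;
    }

    // "Merges" runs i and i+1 by adding their lengths.
    private void mergeAt(int i) {
        runLen[i] += runLen[i + 1];
        if (i == stackSize - 3) {
            runLen[i + 1] = runLen[i + 2]; // slide the top run down one slot
        }
        stackSize--;
    }

    // Same decision logic as the mergeCollapse shown in this article.
    void mergeCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            if (n > 0 && runLen[n - 1] <= runLen[n] + runLen[n + 1]) {
                if (runLen[n - 1] < runLen[n + 1]) {
                    n--;
                }
                mergeAt(n);
            } else if (runLen[n] <= runLen[n + 1]) {
                mergeAt(n);
            } else {
                break; // invariant holds
            }
        }
    }

    // Pushes each length in turn, collapsing after every push,
    // and returns the run lengths left on the stack (bottom first).
    static int[] simulate(int... lens) {
        MergeCollapseSketch s = new MergeCollapseSketch();
        for (int len : lens) {
            s.pushRun(len);
            s.mergeCollapse();
        }
        return Arrays.copyOf(s.runLen, s.stackSize);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(simulate(100, 40, 20))); // [100, 40, 20]
        System.out.println(Arrays.toString(simulate(50, 30, 20)));  // [100]
        System.out.println(Arrays.toString(simulate(40, 30, 50)));  // [70, 50]
    }
}
```

In the first case every run is longer than the sum of the two above it, so nothing is merged. In the second, 50 <= 30 + 20 triggers a merge that then cascades. In the third, the bottom run (40) is shorter than the top run (50), so the bottom two runs are merged first.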

The mergeAt(i) method merges two adjacent runs on the run stack, starting at index i: run[i] and run[i+1].

private void mergeAt(int i) {
        assert stackSize >= 2;
        assert i >= 0;
        assert i == stackSize - 2 || i == stackSize - 3;

        int base1 = runBase[i];
        int len1 = runLen[i];
        int base2 = runBase[i + 1];
        int len2 = runLen[i + 1];
        assert len1 > 0 && len2 > 0;
        assert base1 + len1 == base2;

        /*
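         Record the length of the combined run; if i is the third-from-top run, slide the top run down one slot into the second-from-top position.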
         */
        runLen[i] = len1 + len2;
        if (i == stackSize - 3) {
            runBase[i + 1] = runBase[i + 2];
            runLen[i + 1] = runLen[i + 2];
        }
        stackSize--;

        /*
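        Find where the first element of run2 would be inserted into run1. The elements of run1 before that point can be ignored (trimming the head), because they are no greater than any element of run2 and are already in place. If that point is the end of run1, the two runs are already in order and nothing needs merging.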
         */
        int k = gallopRight((Comparable<Object>) a[base2], a, base1, len1, 0);
        assert k >= 0;
        base1 += k;
        len1 -= k;
        if (len1 == 0)
            return;

        /*
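        Find where the last element of run1 would be inserted into run2. The elements of run2 from that point on can be ignored (trimming the tail), because they are no less than any element of run1 and are already in place.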
         */
        len2 = gallopLeft((Comparable<Object>) a[base1 + len1 - 1], a,
                base2, len2, len2 - 1);
        assert len2 >= 0;
        if (len2 == 0)
            return;

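        // After trimming the head and the tail, the first element of run1 is greater than the first element of run2, and the last element of run1 is greater than the last element of run2. Now merge what remains, picking the merge method by comparing the two lengths: mergeLo when run1 is no longer than run2, mergeHi otherwise.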
        if (len1 <= len2)
            mergeLo(base1, len1, base2, len2);
        else
            mergeHi(base1, len1, base2, len2);
    }
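The head and tail trimming in mergeAt can be illustrated with plain binary searches. In the sketch below, upperBound and lowerBound stand in for gallopRight and gallopLeft: they return the same positions but do not gallop. All names are made up for this example, and the code works on int arrays rather than Comparable objects.

```java
import java.util.Arrays;

class MergeTrimSketch {
    // Number of elements in a[base, base+len) that are <= key
    // (the position gallopRight would report).
    static int upperBound(int[] a, int base, int len, int key) {
        int lo = 0, hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[base + mid] <= key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Number of elements in a[base, base+len) that are < key
    // (the position gallopLeft would report).
    static int lowerBound(int[] a, int base, int len, int key) {
        int lo = 0, hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[base + mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Returns {base1, len1, base2, len2} for the part that still needs merging.
    static int[] trim(int[] a, int base1, int len1, int base2, int len2) {
        // Trim the head: skip run1 elements that are <= the first element of run2.
        int k = upperBound(a, base1, len1, a[base2]);
        base1 += k;
        len1 -= k;
        if (len1 == 0) {
            return new int[]{base1, 0, base2, len2}; // already in order
        }
        // Trim the tail: drop run2 elements that are >= the last element of run1.
        len2 = lowerBound(a, base2, len2, a[base1 + len1 - 1]);
        return new int[]{base1, len1, base2, len2};
    }

    public static void main(String[] args) {
        // run1 = [1, 2, 3, 7, 9], run2 = [5, 6, 8, 10, 11]
        int[] a = {1, 2, 3, 7, 9, 5, 6, 8, 10, 11};
        System.out.println(Arrays.toString(trim(a, 0, 5, 5, 5))); // [3, 2, 5, 3]
    }
}
```

In this example only [7, 9] and [5, 6, 8] remain to be merged: 1, 2, 3 at the head and 10, 11 at the tail are already in their final positions.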

The animated diagrams in this article can help build a deeper understanding of TimSort.
