【JDK1.8源码阅读】TimSort

4 篇文章 0 订阅
1 篇文章 0 订阅

介绍

  • TimSort是一个稳定、自适应、迭代的归并排序算法实现。
  • TimSort是归并排序的优化版,它比传统归并排序中触发的归并次数要少
  • Java版的实现是一个类,并直接用在了Arrays工具类中,用于对象排序,该类是改编自Tim Peters的Python集合排序算法实现。想详细了解,请查阅以下链接:
    Python_link
    C_link

源码阅读

package java.util;
/**
 * 这个类里的排序方法是静态的,但是权限是默认的,所以我们自己想用一般就要创建一个实例调用。
 * TimSort中有一些优化,它针对不同的数组长度,会使用不同的算法,只有输入的数组长度足够长
 * 的时候,才能完全使用TimSort算法。而如果长度不够,它会执行二分插排排序来处理。
 *
 * @author Josh Bloch
 */
class TimSort<T> {
    /**
     * 最小会触发归并的序列长度。小于该值的序列将执行binarySort(没有执行任何归并)。
     *
     * 这个常量必须是2的倍数。 Tim Peter's 在C语言中的实现使用的是64,而在此使用32性能会更好些。
     * 后面的是自定义该参数要同步修改哪些地方的提示,这里省略。
     */
    private static final int MIN_MERGE = 32;

    /**
     * 要排序的数组
     */
    private final T[] a;

    /**
     * 用于排序的集合
     */
    private final Comparator<? super T> c;

    /**
     * When we get into galloping mode, we stay there until both runs win less
     * often than MIN_GALLOP consecutive times.
     */
    private static final int  MIN_GALLOP = 7;

    /**
     * This controls when we get *into* galloping mode.  It is initialized
     * to MIN_GALLOP.  The mergeLo and mergeHi methods nudge it higher for
     * random data, and lower for highly structured data.
     */
    private int minGallop = MIN_GALLOP;

    /**
     *最大初始化用于归并的临时数组大小,数组可以按需扩大。
     *
     * 跟Tim 的C版本不一样,当排序数组比较小的时候,我们没有分配那么大的存储空间。
     * 这一更改是为了性能。
     */
    private static final int INITIAL_TMP_STORAGE_LENGTH = 256;

    /**
     * 用于归并时的临时存储数组。可选择在构造中提供该工作空间数组,使用多少就分配多大。
     */
    private T[] tmp;
    private int tmpBase; // base of tmp array slice
    private int tmpLen;  // length of tmp array slice

    /**
     * 待归并的挂起run中的堆栈大小(此处用run作为别名)。这个挂起的堆栈是一个虚拟的堆栈。
     * 记录这个run堆栈的参数有以下3个变量。
     * 1) stackSize 记录堆栈中待归并run部分的个数
     * 2) runBase 是一个数组,记录着待排序数组(a)中每一个排好序待归并部分的起始位置
     * 3) runLen 是一个数组,记录着待排序数组(a)中每一个排好序待归并部分的长度(元素个数)
     * Run i starts at
     * address base[i] and extends for len[i] elements.  It's always
     * true (so long as the indices are in bounds) that:
     *
     *     runBase[i] + runLen[i] == runBase[i + 1]
     *
     * so we could cut the storage for this, but it's a minor amount,
     * and keeping all the info explicit simplifies the code.
     */
    private int stackSize = 0;  // Number of pending runs on stack
    private final int[] runBase;
    private final int[] runLen;

    /**
     * 创建一个TimSort实例来维护正在进行的排序的状态
     *
     * @param a the array to be sorted 待排序对象数组
     * @param c the comparator to determine the order of the sort 对象的Comparetor对象
     * @param work a workspace array (slice) 工作空间
     * @param workBase origin of usable space in work array 工作空间中可用空间原点的指针位置
     * @param workLen usable size of work array 工作空间中可用空间大小
     */
    private TimSort(T[] a, Comparator<? super T> c, T[] work, int workBase, int workLen) {
        this.a = a;   
        this.c = c;

        // 分配临时存储空间,如果需要,后面增加它的大小
        int len = a.length;
        /* 分隔临时空间的基数,如果待排序数组的大小小于512,则直接指定分割基数为length/2,
         * 如果待排序数组的大小大于等于512则,直接指定分割基数为256。
         */
        int tlen = (len < 2 * INITIAL_TMP_STORAGE_LENGTH) ?
            len >>> 1 : INITIAL_TMP_STORAGE_LENGTH;
        // 基本入参校验,1 用户没有指定工作空间,2 工作空间可用空间长度小于分割基数,
        // 3 工作空间可用空间原点指针位置 + 分隔基数 > 可用空间数
        // 表示开发人员没有想自定义参数或指定的参数不合理,则使用默认的参数构造。        
        if (work == null || workLen < tlen || workBase + tlen > work.length) {
            @SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
            T[] newArray = (T[])java.lang.reflect.Array.newInstance
                (a.getClass().getComponentType(), tlen);
            tmp = newArray;
            tmpBase = 0;
            tmpLen = tlen;
        }
        else {
            tmp = work;
            tmpBase = workBase;
            tmpLen = workLen;
        }

        /*
         * 分配要合并的运行堆栈(无法扩展)。堆栈长度在listsort.txt中有描述。
         * C版本总是使用85,但是这个长度在JAVA中对于中长度数组来说代价太大(例如100个元素的数组)。
         * 因此,我们为较小的数组指定较小的堆栈大小。下列计算中的神奇数字,在MIN_MERGE被声明后就要被改变。 
         * 最大的堆栈大小49,允许Integer最大值-4的数组长度,如果数组是最坏的情况,
         * 可以根据场景增加堆栈大小。更多的说明在下面地址的第4部分:
         * http://envisage-project.eu/wp-content/uploads/2015/02/sorting.pdf
         */
        int stackLen = (len <    120  ?  5 :
                        len <   1542  ? 10 :
                        len < 119151  ? 24 : 49);
        runBase = new int[stackLen];
        runLen = new int[stackLen];
    }

    /*
     * 下一个方法(包私有和静态)构成这个类的整个API。
     */

    /**
     * 在必要时使用工作空间数组进行临时存储,来对给定的范围进行排序。
     * 这个方法被设计,用在执行Arrays的公共方法sort的必要边界校验之后。
     * invoked from public methods (in class Arrays) after performing
     * any necessary array bounds checks and expanding parameters into
     * the required forms.
     *
     * @param a the array to be sorted 待排序数组
     * @param lo 排序的第一个元素下标, 包含该元素
     * @param hi 排序的最后一个元素下标, 不包含该元素
     * @param c the comparator to use 排序策略
     * @param work a workspace array (slice) 工作空间
     * @param workBase origin of usable space in work array 工作空间可用空间原点指针位置
     * @param workLen usable size of work array 工作空间可用空间大小
     * @since 1.8
     */
    static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
                         T[] work, int workBase, int workLen) {
        // 断言,只有为true才能继续执行,否则抛出java.lang.AssertionError
        // 待排序数组、排序策略不能为空,开始下标和结束下标要在数组的长度范围内,开始下标要小于等于结束下标
        assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;
		// 排序区间的元素个数
        int nRemaining  = hi - lo;
        if (nRemaining < 2)
            return;  // 没有或1个就没有必要排序了

        // 如果数组小于MIN_MERGE(32)就不归并了,使用二分插排。
        if (nRemaining < MIN_MERGE) {
            int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
            binarySort(a, lo, hi, lo + initRunLen, c);
            return;
        }

        /**
         * 从左到右遍历数组一次, 找到所有run区间(要归并的区间),扩大小的run区间到minRun个元素,
         * 把所有run区间归并到主要的数组栈内。
         */
        TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
		// 获得最小可接受的run长度
        int minRun = minRunLength(nRemaining);
        do {
            // Identify next run,从lo开始到要开始排序位置之间的长度
            int runLen = countRunAndMakeAscending(a, lo, hi, c);

            // 如果要运行的长度较短,就在(minRun, nRemaining)之间取较小值
            if (runLen < minRun) {
                int force = nRemaining <= minRun ? nRemaining : minRun;
                binarySort(a, lo, lo + force, lo + runLen, c);
                runLen = force;
            }

            // Push run onto pending-run stack, and maybe merge
            // 将每一轮的开始位置和长度压入run中挂起的堆栈中
            ts.pushRun(lo, runLen);
            // 归并存储run的堆栈中的所有run
            ts.mergeCollapse();

            // Advance to find next run
            lo += runLen;
            nRemaining -= runLen;
        } while (nRemaining != 0);

        // Merge all remaining runs to complete sort
        assert lo == hi;
        ts.mergeForceCollapse();
        assert ts.stackSize == 1;
    }

    /**
     * 对指定数组指定部分进行二分插入排序。这个方法是对小数量元素排序的最好方法。
     * 时间复杂度为O(n*logn),但最差的复杂度为O(n^2)。
     *
     * 如果初始指定要排序的部分已经排好序,这个方法能够获得好处:方法假设从开始到结束
     * 下标之间的元素已经排好序了。
     *
     * @param a the array in which a range is to be sorted 待排序数组
     * @param lo the index of the first element in the range to be sorted 开始下标
     * @param hi the index after the last element in the range to be sorted 结束下标
     * @param start the index of the first element in the range that is 要开始排序的下标
     *        not already known to be sorted ({@code lo <= start <= hi})
     * @param c comparator to used for the sort 比较策略
     */
    @SuppressWarnings("fallthrough")
    private static <T> void binarySort(T[] a, int lo, int hi, int start,
                                       Comparator<? super T> c) {
        // 断言,start必须在开始和结束区间内,才能执行下去
        assert lo <= start && start <= hi;
        if (start == lo)
            start++;
        // 从开始的后一个开始遍历
        for ( ; start < hi; start++) {
            // 执行插入排序
            T pivot = a[start];

            // Set left (and right) to the index where a[start] (pivot) belongs
            int left = lo;
            int right = start;
            assert left <= right;
            /*
             * Invariants:
             *   pivot >= all in [lo, left).
             *   pivot <  all in [right, start).
             */
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (c.compare(pivot, a[mid]) < 0)
                    right = mid;
                else
                    left = mid + 1;
            }
            assert left == right;

            /*
             * 不变的是 pivot 大于等于所有左边的元素,小于等于右边所有元素。注意,如果有元素等于轴,
             * 那么左边的还是在左边右边的还是在右边,所以插入排序是稳定的。
             * 
             * 左右滑动元素,为轴腾出空间。
             */
            int n = start - left;  // The number of elements to move
            // Switch is just an optimization for arraycopy in default case
            switch (n) {
                case 2:  a[left + 2] = a[left + 1];
                case 1:  a[left + 1] = a[left];
                         break;
                default: System.arraycopy(a, left, a, left + 1, n);
            }
            a[left] = pivot;
        }
    }

    /**
     * 返回从指定数组的指定位置开始运行的长度,并确保方法返回的结果总是升序的
     *
     * A run is the longest ascending sequence with:
     *
     *    a[lo] <= a[lo + 1] <= a[lo + 2] <= ...
     *
     * or the longest descending sequence with:
     *
     *    a[lo] >  a[lo + 1] >  a[lo + 2] >  ...
     *
     * 用于稳定的归并排序, 降序需要被严格定义,这样才能安全地,不破坏稳定地调用方法反转一个降序序列。
     *
     * @param a the array in which a run is to be counted and possibly reversed
     * @param lo index of the first element in the run
     * @param hi index after the last element that may be contained in the run.
              It is required that {@code lo < hi}.
     * @param c the comparator to used for the sort
     * @return  the length of the run beginning at the specified position in
     *          the specified array
     */
    private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
                                                    Comparator<? super T> c) {
        // 断言,仅在开始下标小于结束下标的情况下执行,即指定区间里至少有两个元素
        assert lo < hi;
        // 从指定区间的第二个元素开始向结束下标运行
        int runHi = lo + 1;
        if (runHi == hi) // 如果第二个就是结束位置了,就返回1,说明运行长度就一个位置
            return 1;

        // 找到运行的结束位置, 并且反转范围内的降序情况
        // 从开始位置遍历,如果头两个元素是升序,就遍历到不是的位置停止,如果头两个元素就是降序,
        // 就遍历到不是倒序的位置停止,然后将从开始到倒序停止的位置之间的元素反转,保证开发人员
        // 指定的要排序数组的最前面一部分或整个都是升序的,考虑的是极端的边界情况。
        if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
            while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
                runHi++;
            reverseRange(a, lo, runHi);
        } else {                              // Ascending
            while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
                runHi++;
        }
		// 得到的结果是长度,在binarySort中,将runHi作为start参数,即要开始插排的位置
        return runHi - lo;
    }

    /**
     * 反转降序的数组
     *
     * @param a the array in which a range is to be reversed
     * @param lo the index of the first element in the range to be reversed
     * @param hi the index after the last element in the range to be reversed
     */
    private static void reverseRange(Object[] a, int lo, int hi) {
        hi--;
        while (lo < hi) {
            Object t = a[lo];
            a[lo++] = a[hi];
            a[hi--] = t;
        }
    }

    /**
     * 返回给定数组最小可接受的运行长度。binarySort运行时如果比实际要小,会自动扩大。
     *
     * 粗略的说,计算如下:
     *  - 如果n < MIN_MERGE,返回n(太小了,不值得为花哨的东西烦恼)。
     *  - 如果n是2的倍数,就返回MIN_MERGE/2
     *  - 其他返回k,MIN_MERGE/2 <= k <= MIN_MERGE
     *
     * @param n the length of the array to be sorted 数组要排序的长度
     * @return the length of the minimum run to be merged
     */
    private static int minRunLength(int n) {
        // 断言,长度必须大于等于0
        assert n >= 0;
        int r = 0;      // Becomes 1 if any 1 bits are shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    /**
     * Pushes the specified run onto the pending-run stack.
     *
     * @param runBase index of the first element in the run
     * @param runLen  the number of elements in the run
     */
    private void pushRun(int runBase, int runLen) {
        this.runBase[stackSize] = runBase;
        this.runLen[stackSize] = runLen;
        stackSize++;
    }

    /**
     * 检查等待合并的运行堆栈,并合并相邻的run,直到重新建立堆栈(归并后会生成一个新的run)
     * Examines the stack of runs waiting to be merged and merges adjacent runs
     * until the stack invariants are reestablished:
     *
     *     1. runLen[i - 3] > runLen[i - 2] + runLen[i - 1]
     *     2. runLen[i - 2] > runLen[i - 1]
     * 每次将新run推入堆栈时都会调用此方法,因此保证在进入该方法时对i < stackSize保持不变。
     * This method is called each time a new run is pushed onto the stack,
     * so the invariants are guaranteed to hold for i < stackSize upon
     * entry to the method.
     */
    private void mergeCollapse() {
        // 只有run堆栈中存在2个或以上的run才执行
        while (stackSize > 1) {
            // run堆栈大小-2,得到的n
            int n = stackSize - 2;
            // 如果堆栈中还有run、
            if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
                if (runLen[n - 1] < runLen[n + 1])
                    n--;
                mergeAt(n);
            } else if (runLen[n] <= runLen[n + 1]) {
                mergeAt(n);
            } else {
                break; // Invariant is established
            }
        }
    }

    /**
     * Merges all runs on the stack until only one remains.  This method is
     * called once, to complete the sort.
     * 归并所有run堆栈中的run,直到只剩一个。这个方法只会调用一次来完成排序。
     */
    private void mergeForceCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            if (n > 0 && runLen[n - 1] < runLen[n + 1])
                n--;
            mergeAt(n);
        }
    }

    /**
     * 归并堆栈中指定位置的run和它后面相邻的run。i必须是run堆栈中的倒数第二或倒数第三个下标。
     * 换句话说,i必须等于stackSize-2 或 stackSize-3。
     *
     * @param i stack index of the first of the two runs to merge
     */
    private void mergeAt(int i) {
        // 断言,run堆栈大小必须大于等于2,里面至少有2个run
        assert stackSize >= 2;
        // 指定的下标必须合理
        assert i >= 0;
        assert i == stackSize - 2 || i == stackSize - 3;

        int base1 = runBase[i];  // 左边run的开始位置
        int len1 = runLen[i];	 // 左边run的长度
        int base2 = runBase[i + 1];	 // 右边run的开始位置
        int len2 = runLen[i + 1];	 // 右边run的长度
        assert len1 > 0 && len2 > 0;	// 断言,两个run的长度必须大于0
        assert base1 + len1 == base2;  // 断言,两个run必须是挨着的

        /*
         * Record the length of the combined runs; if i is the 3rd-last
         * run now, also slide over the last run (which isn't involved
         * in this merge).  The current run (i+1) goes away in any case.
         */
        // 更新左边的run长度,合并左右两个run的长度
        runLen[i] = len1 + len2;
        // 如果i指向倒数第3个,现在的那倒数第二个将直接记录现在的倒数第一个run的相关参数,成为新的倒数第一
        if (i == stackSize - 3) { 
            runBase[i + 1] = runBase[i + 2];
            runLen[i + 1] = runLen[i + 2];
        }
        // run堆栈大小缩减1
        stackSize--;

        /*
         * 找到run2第一个元素,要排到run1中的位置下标。
         */
        // k:run2第一个元素,要排到run1中的某个位置,这个位置距离base1的长度
        int k = gallopRight(a[base2], a, base1, len1, 0, c);
        assert k >= 0;
        // 移动run1的开始位置指针base1,移动k个长度
        base1 += k; 
        // run1的长度可以相应减少k
        len1 -= k;
        if (len1 == 0) // 边界判断,如果k刚好就是原来run1的长度,则len1就是0了
            return;  // 这种情况表示run1和run2已经排好序了,就不用继续执行下面的代码了

        /*
         * 找到run1最后一个元素,要排到run2中的某个位置,方法与上面类似。
         */
        len2 = gallopLeft(a[base1 + len1 - 1], a, base2, len2, len2 - 1, c);
        assert len2 >= 0;
        if (len2 == 0)
            return;

        // Merge remaining runs, using tmp array with min(len1, len2) elements
        // len1和len2两个run的长度,以小的那个为基础,归并长的那个,这样长的run中多余的就不
        // 需要继续循环,而是直接拷贝过去
        if (len1 <= len2)
            mergeLo(base1, len1, base2, len2);
        else
            mergeHi(base1, len1, base2, len2);
    }

    /**
     * Locates the position at which to insert the specified key into the
     * specified sorted range; if the range contains an element equal to key,
     * returns the index of the leftmost equal element.
     *
     * @param key the key whose insertion point to search for
     * @param a the array in which to search
     * @param base the index of the first element in the range
     * @param len the length of the range; must be > 0
     * @param hint the index at which to begin the search, 0 <= hint < n.
     *     The closer hint is to the result, the faster this method will run.
     * @param c the comparator used to order the range, and to search
     * @return the int k,  0 <= k <= n such that a[b + k - 1] < key <= a[b + k],
     *    pretending that a[b - 1] is minus infinity and a[b + n] is infinity.
     *    In other words, key belongs at index b + k; or in other words,
     *    the first k elements of a should precede key, and the last n - k
     *    should follow it.
     */
    private static <T> int gallopLeft(T key, T[] a, int base, int len, int hint,
                                      Comparator<? super T> c) {
        assert len > 0 && hint >= 0 && hint < len;
        int lastOfs = 0;
        int ofs = 1;
        if (c.compare(key, a[base + hint]) > 0) {
            // Gallop right until a[base+hint+lastOfs] < key <= a[base+hint+ofs]
            int maxOfs = len - hint;
            while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) > 0) {
                lastOfs = ofs;
                ofs = (ofs << 1) + 1;
                if (ofs <= 0)   // int overflow
                    ofs = maxOfs;
            }
            if (ofs > maxOfs)
                ofs = maxOfs;

            // Make offsets relative to base
            lastOfs += hint;
            ofs += hint;
        } else { // key <= a[base + hint]
            // Gallop left until a[base+hint-ofs] < key <= a[base+hint-lastOfs]
            final int maxOfs = hint + 1;
            while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) <= 0) {
                lastOfs = ofs;
                ofs = (ofs << 1) + 1;
                if (ofs <= 0)   // int overflow
                    ofs = maxOfs;
            }
            if (ofs > maxOfs)
                ofs = maxOfs;

            // Make offsets relative to base
            int tmp = lastOfs;
            lastOfs = hint - ofs;
            ofs = hint - tmp;
        }
        assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

        /*
         * Now a[base+lastOfs] < key <= a[base+ofs], so key belongs somewhere
         * to the right of lastOfs but no farther right than ofs.  Do a binary
         * search, with invariant a[base + lastOfs - 1] < key <= a[base + ofs].
         */
        lastOfs++;
        while (lastOfs < ofs) {
            int m = lastOfs + ((ofs - lastOfs) >>> 1);

            if (c.compare(key, a[base + m]) > 0)
                lastOfs = m + 1;  // a[base + m] < key
            else
                ofs = m;          // key <= a[base + m]
        }
        assert lastOfs == ofs;    // so a[base + ofs - 1] < key <= a[base + ofs]
        return ofs;
    }

    /**
     * 类似gallopLeft方法, 如果指定范围内包含一个元素与key相等,期望gallopRight方法返回
     * 最右边相等元素之后的下标。
     *
     * @param key 要搜索其插入点的key
     * @param a 在这个数组上查找
     * @param base 范围内第一个开始下标
     * @param len 查找范围的长度,必须大于0
     * @param hint 开始查询的下标, 0 <= hint < n。开始查询的下标越接近结果,这个方法执行
     * 的就越快。
     * @param c 比较策略
     * @return the int k,  0 <= k <= n such that a[b + k - 1] <= key < a[b + k]
     */
    private static <T> int gallopRight(T key, T[] a, int base, int len,
                                       int hint, Comparator<? super T> c) {
        assert len > 0 && hint >= 0 && hint < len;

        int ofs = 1;
        int lastOfs = 0;
        if (c.compare(key, a[base + hint]) < 0) {
            // Gallop left until a[b+hint - ofs] <= key < a[b+hint - lastOfs]
            int maxOfs = hint + 1;
            while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) < 0) {
                lastOfs = ofs;
                ofs = (ofs << 1) + 1;
                if (ofs <= 0)   // int overflow
                    ofs = maxOfs;
            }
            if (ofs > maxOfs)
                ofs = maxOfs;

            // Make offsets relative to b
            int tmp = lastOfs;
            lastOfs = hint - ofs;
            ofs = hint - tmp;
        } else { // a[b + hint] <= key
            // Gallop right until a[b+hint + lastOfs] <= key < a[b+hint + ofs]
            int maxOfs = len - hint;
            while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) >= 0) {
                lastOfs = ofs;
                ofs = (ofs << 1) + 1;
                if (ofs <= 0)   // int overflow
                    ofs = maxOfs;
            }
            if (ofs > maxOfs)
                ofs = maxOfs;

            // Make offsets relative to b
            lastOfs += hint;
            ofs += hint;
        }
        assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

        /*
         * Now a[b + lastOfs] <= key < a[b + ofs], so key belongs somewhere to
         * the right of lastOfs but no farther right than ofs.  Do a binary
         * search, with invariant a[b + lastOfs - 1] <= key < a[b + ofs].
         */
        lastOfs++;
        while (lastOfs < ofs) {
            int m = lastOfs + ((ofs - lastOfs) >>> 1);

            if (c.compare(key, a[base + m]) < 0)
                ofs = m;          // key < a[b + m]
            else
                lastOfs = m + 1;  // a[b + m] <= key
        }
        assert lastOfs == ofs;    // so a[b + ofs - 1] <= key < a[b + ofs]
        return ofs;
    }

    /**
     * 在适当的位置稳定地归并两个相邻的run。第一个run的第一个元素必须必第二个run的第一个元素大,
     * 并且第一个run的最后一个元素必须必第二个run中的所有元素都要大。
     * Merges two adjacent runs in place, in a stable fashion.  The first
     * element of the first run must be greater than the first element of the
     * second run (a[base1] > a[base2]), and the last element of the first run
     * (a[base1 + len1-1]) must be greater than all elements of the second run.
     *
     * 对于性能,这个方法必须在第一个run长度小于等于第二个run长度时调用,另一个方法mergeHi则
     * 需要在第一个run长度大于等于第二个run长度的时候调用。(它们都可以在两个run长度相等时调用)
     * For performance, this method should be called only when len1 <= len2;
     * its twin, mergeHi should be called if len1 >= len2.  (Either method
     * may be called if len1 == len2.)
     *
     * @param base1 index of first element in first run to be merged
     * @param len1  length of first run to be merged (must be > 0)
     * @param base2 index of first element in second run to be merged
     *        (must be aBase + aLen)
     * @param len2  length of second run to be merged (must be > 0)
     */
    private void mergeLo(int base1, int len1, int base2, int len2) {
        // 断言,两个run的长度都要大于0,且两个run是相邻的
        assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

        // Copy first run into temp array 这部分是将第一个run复制到了一个临时数组
        T[] a = this.a; // For performance
        T[] tmp = ensureCapacity(len1);
        int cursor1 = tmpBase; // Indexes into tmp array 光标1,临时数组(run1)开始位置,0
        int cursor2 = base2;   // Indexes int a 光标2,run2开始位置
        int dest = base1;      // Indexes int a run1的开始位置
        System.arraycopy(a, base1, tmp, cursor1, len1);

        // Move first element of second run and deal with degenerate cases
        // 移动第二个run的第一个元素,并处理退化的情况
        a[dest++] = a[cursor2++];// run1的第一个元素设置为run2的第一个元素(对a操作)
        if (--len2 == 0) {// 如果run2只有1个元素,把temp中存储的覆盖到已经改变的a中
            System.arraycopy(tmp, cursor1, a, dest, len1);
            return;
        }
        if (len1 == 1) {// 如果run1只有一个元素,把run2拷贝到原run1的位置
            System.arraycopy(a, cursor2, a, dest, len2);
            // run1的第一个元素
            a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
            return;
        }

        Comparator<? super T> c = this.c;  // Use local variable for performance
        int minGallop = this.minGallop;    //  "    "       "     "      "
    outer:
        // 开始归并操作
        while (true) {
            int count1 = 0; // Number of times in a row that first run won
            int count2 = 0; // Number of times in a row that second run won

            /*
             * Do the straightforward thing until (if ever) one run starts
             * winning consistently.
             */
            do {
                assert len1 > 1 && len2 > 0;
                if (c.compare(a[cursor2], tmp[cursor1]) < 0) {
                    a[dest++] = a[cursor2++];
                    count2++;
                    count1 = 0;
                    if (--len2 == 0)
                        break outer;
                } else {
                    a[dest++] = tmp[cursor1++];
                    count1++;
                    count2 = 0;
                    if (--len1 == 1)
                        break outer;
                }
            } while ((count1 | count2) < minGallop);

            /*
             * One run is winning so consistently that galloping may be a
             * huge win. So try that, and continue galloping until (if ever)
             * neither run appears to be winning consistently anymore.
             */
            do {
                assert len1 > 1 && len2 > 0;
                count1 = gallopRight(a[cursor2], tmp, cursor1, len1, 0, c);
                if (count1 != 0) {
                    System.arraycopy(tmp, cursor1, a, dest, count1);
                    dest += count1;
                    cursor1 += count1;
                    len1 -= count1;
                    if (len1 <= 1) // len1 == 1 || len1 == 0
                        break outer;
                }
                a[dest++] = a[cursor2++];
                if (--len2 == 0)
                    break outer;

                count2 = gallopLeft(tmp[cursor1], a, cursor2, len2, 0, c);
                if (count2 != 0) {
                    System.arraycopy(a, cursor2, a, dest, count2);
                    dest += count2;
                    cursor2 += count2;
                    len2 -= count2;
                    if (len2 == 0)
                        break outer;
                }
                a[dest++] = tmp[cursor1++];
                if (--len1 == 1)
                    break outer;
                minGallop--;
            } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
            if (minGallop < 0)
                minGallop = 0;
            minGallop += 2;  // Penalize for leaving gallop mode
        }  // End of "outer" loop
        this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

        if (len1 == 1) {
            assert len2 > 0;
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; //  Last elt of run 1 to end of merge
        } else if (len1 == 0) {
            throw new IllegalArgumentException(
                "Comparison method violates its general contract!");
        } else {
            assert len2 == 0;
            assert len1 > 1;
            System.arraycopy(tmp, cursor1, a, dest, len1);
        }
    }

    /**
     * Like mergeLo, except that this method should be called only if
     * len1 >= len2; mergeLo should be called if len1 <= len2.  (Either method
     * may be called if len1 == len2.)
     *
     * @param base1 index of first element in first run to be merged
     * @param len1  length of first run to be merged (must be > 0)
     * @param base2 index of first element in second run to be merged
     *        (must be aBase + aLen)
     * @param len2  length of second run to be merged (must be > 0)
     */
    private void mergeHi(int base1, int len1, int base2, int len2) {
        assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

        // Copy second run into temp array
        T[] a = this.a; // For performance
        T[] tmp = ensureCapacity(len2);
        int tmpBase = this.tmpBase;
        System.arraycopy(a, base2, tmp, tmpBase, len2);

        int cursor1 = base1 + len1 - 1;  // Indexes into a
        int cursor2 = tmpBase + len2 - 1; // Indexes into tmp array
        int dest = base2 + len2 - 1;     // Indexes into a

        // Move last element of first run and deal with degenerate cases
        a[dest--] = a[cursor1--];
        if (--len1 == 0) {
            System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
            return;
        }
        if (len2 == 1) {
            dest -= len1;
            cursor1 -= len1;
            System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
            a[dest] = tmp[cursor2];
            return;
        }

        Comparator<? super T> c = this.c;  // Use local variable for performance
        int minGallop = this.minGallop;    //  "    "       "     "      "
    outer:
        while (true) {
            int count1 = 0; // Number of times in a row that first run won
            int count2 = 0; // Number of times in a row that second run won

            /*
             * Do the straightforward thing until (if ever) one run
             * appears to win consistently.
             */
            do {
                assert len1 > 0 && len2 > 1;
                if (c.compare(tmp[cursor2], a[cursor1]) < 0) {
                    a[dest--] = a[cursor1--];
                    count1++;
                    count2 = 0;
                    if (--len1 == 0)
                        break outer;
                } else {
                    a[dest--] = tmp[cursor2--];
                    count2++;
                    count1 = 0;
                    if (--len2 == 1)
                        break outer;
                }
            } while ((count1 | count2) < minGallop);

            /*
             * One run is winning so consistently that galloping may be a
             * huge win. So try that, and continue galloping until (if ever)
             * neither run appears to be winning consistently anymore.
             */
            do {
                assert len1 > 0 && len2 > 1;
                count1 = len1 - gallopRight(tmp[cursor2], a, base1, len1, len1 - 1, c);
                if (count1 != 0) {
                    dest -= count1;
                    cursor1 -= count1;
                    len1 -= count1;
                    System.arraycopy(a, cursor1 + 1, a, dest + 1, count1);
                    if (len1 == 0)
                        break outer;
                }
                a[dest--] = tmp[cursor2--];
                if (--len2 == 1)
                    break outer;

                count2 = len2 - gallopLeft(a[cursor1], tmp, tmpBase, len2, len2 - 1, c);
                if (count2 != 0) {
                    dest -= count2;
                    cursor2 -= count2;
                    len2 -= count2;
                    System.arraycopy(tmp, cursor2 + 1, a, dest + 1, count2);
                    if (len2 <= 1)  // len2 == 1 || len2 == 0
                        break outer;
                }
                a[dest--] = a[cursor1--];
                if (--len1 == 0)
                    break outer;
                minGallop--;
            } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
            if (minGallop < 0)
                minGallop = 0;
            minGallop += 2;  // Penalize for leaving gallop mode
        }  // End of "outer" loop
        this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

        if (len2 == 1) {
            assert len1 > 0;
            dest -= len1;
            cursor1 -= len1;
            System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
            a[dest] = tmp[cursor2];  // Move first elt of run2 to front of merge
        } else if (len2 == 0) {
            throw new IllegalArgumentException(
                "Comparison method violates its general contract!");
        } else {
            assert len1 == 0;
            assert len2 > 0;
            System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
        }
    }

    /**
     * Ensures that the external array tmp has at least the specified
     * number of elements, increasing its size if necessary.  The size
     * increases exponentially to ensure amortized linear time complexity.
     *
     * @param minCapacity the minimum required capacity of the tmp array
     * @return tmp, whether or not it grew
     */
    private T[] ensureCapacity(int minCapacity) {
        if (tmpLen < minCapacity) {
            // Compute smallest power of 2 > minCapacity
            int newSize = minCapacity;
            newSize |= newSize >> 1;
            newSize |= newSize >> 2;
            newSize |= newSize >> 4;
            newSize |= newSize >> 8;
            newSize |= newSize >> 16;
            newSize++;

            if (newSize < 0) // Not bloody likely!
                newSize = minCapacity;
            else
                newSize = Math.min(newSize, a.length >>> 1);

            @SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
            T[] newArray = (T[])java.lang.reflect.Array.newInstance
                (a.getClass().getComponentType(), newSize);
            tmp = newArray;
            tmpLen = newSize;
            tmpBase = 0;
        }
        return tmp;
    }
}

学习笔记

1 成员变量

变量名变量类型变量值变量解释
tmpT[]new T[50]临时数组
tmpBaseint0
tmpLenint50
stackSizeint0归并的次数,初始为0
runBaseint[]new int[5]记录着待排序数组(a)中每一个排好序待归并部分的起始位置
runLenint[]new int[5]记录着待排序数组(a)中每一个排好序待归并部分的长度(元素个数)

2 sort方法入参

变量名变量类型变量值变量解释
aT[]Object可排序的对象
loint0指定待排序的开始位置下标(含)
hiint100指定待排序的结束位置下标(不含)
cComparatorComparator比较策略
work
workBase
workLen

3 sort方法阅读笔记

注意:以下run将作为,每一个待排序数组中,指定要排序区间内,已经按照升序排好的,待归并的区间的称呼。

3.1 计算最小允许的run长度,以a.length=100来算

假设有一个待排序的数组,数组长度为100。

数组的其中一部分为:{ 5 , 9 , 3, 6, 2, 1, 7 …… 8}

int minRun = minRunLength(nRemaining);   // 25
3.2 开始循环(do-while)

步骤:

  1. 获得从每次迭代的初始位置开始,到下一个run之间的长度
  2. 如果上一步计算到的长度不够25(minRun),就用二分插排法(binarySort)排序lo下标开始的25个元素
  3. 把排好序的“起始下标”、“run长度”两个参数分别存入runBase和runLen成员变量中
  4. mergeCollapse
  5. 边界值计算、判断是否还有未完成工作,结束或继续下一轮循环
3.2.1 计算最小run的长度

​ 涉及较复杂的考虑、计算没研究,此处不作描述。

3.2.2 计算下一个run的长度

这里分两种情况进行讨论,第二种仅仅是作为说明。

​ 3.2.2.1 假设a数组如下(最初设定的数组):

​ { 5 , 9 , 3, 6, 2, 1, 7 …… 8}
在这里插入图片描述
​ 上图描述了获得下一个run长度的过程:第一步是第一和第二个元素进行比较,如果左<右则继续比较,​ 作为指针的runHi开始是指向第二个元素,比较后指向下一个元素(3);然后开始第二次比较,但是此时​ 左(9)> 右(3),就停止比较,得到了要排序的位置,此时runHi是在下标2的位置,那么这时得到的run​ 长度为runHi - lo = 2。

3.2.2.2 假设a数组如下(临时说明):

​ { 9 , 5 , 3, 6, 2, 1, 7 …… 8} 这个数组与原数组仅仅是第一和第二元素的位置换了一下
在这里插入图片描述
​ 上图描述了获得下一个run长度的过程:与3.2.2.1不同的是,第一个元素大于第二个元素,这种情况下,​ 作者进行了降序极端情况的优化,直接继续比较后续的每对相邻元素,直到非倒序的情况(反转前-绿),​ 如上图中的runHi;当满足这种情况时,算法中会将前3个降序排列的元素反转过来(反转后-黄),方法返回​ 的run长度计算公式一样,runHi - lo = 3。

3.2.3 判断是否需要用二分插排

​ 作者通过一套测算较为完善,且性能较为优越的方法,得到了最小的run长度,如果开发人员在3.2.2中得到的长度小于默认的长度,就会使用binarySort(二分查找插入排序)排序好最小run长度个相邻元素,然后将该元素区间作为参数记录到成员变量的运行相关变量中,等待归并操作。

3.2.4 将上面得到的run的信息存入run堆栈
/**
 * runBase:已经排好序的开始位置,runLen:这个run区间的长度
 */
private void pushRun(int runBase, int runLen) {
    this.runBase[stackSize] = runBase;
    this.runLen[stackSize] = runLen;
    stackSize++; // 每存入一对run参数,size就要+1,记录总个数
}
3.2.5 执行归并

​ 归并存储在run堆栈中的所有run。

3.2.5.1 归并条件

​ 归并前先判断run堆栈中的run是否超过1个,否则没有归并的必要。

3.2.5.2 归并

​ 该归并是传统的归并算法,但是是基于两个已经排好序的数组(run)来归并。

3.2.6 边界处理

​ 处理边界值,结束本次循环前要为下一次循环计算好参数,然后开始判断是否满足条件进行下一次循环。

3.3 结束循环,最后扫尾归并一次
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值