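At work I needed to sort a generic collection. I had assumed it would use dual-pivot quicksort, as sorting an array of primitives does, but following the source of Collections.sort() I found that it ends up in TimSort.sort(). A quick search online told me that TimSort is built on merge sort and takes full advantage of the fact that real-world data is very often already partially sorted. One of the better articles I found is a Chinese-language post titled "The World's Fastest Sorting Algorithm: Timsort". After reading the source I could "understand" what TimSort does, but not why it does it that way, and that article answered many of my questions. To help others understand TimSort better, and to keep notes for myself, I wrote this post to share what I learned from reading the TimSort source in JDK 1.8.
Following Collections.sort(), we find that it calls list.sort() on the list that was passed in; the code is below. It does three things: it calls the list's own toArray() to produce an array, calls Arrays.sort() to sort that array, and finally writes the sorted result back into the list.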
default void sort(Comparator<? super E> c) {
Object[] a = this.toArray();
Arrays.sort(a, (Comparator) c);
ListIterator<E> i = this.listIterator();
for (Object e : a) {
i.next();
i.set((E) e);
}
}
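To make the three steps concrete, here is a minimal sketch of the same copy, sort, write-back pattern outside the JDK. The class ListSortSketch and the helper sortViaArray are names I made up for illustration; only toArray(), Arrays.sort() and ListIterator.set() are real JDK calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.ListIterator;

class ListSortSketch {
    // Hypothetical helper that mirrors the three steps of List.sort():
    // copy to an array, sort the array, write the result back in order.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static <E> void sortViaArray(List<E> list, Comparator<? super E> c) {
        Object[] a = list.toArray();            // step 1: snapshot into an array
        Arrays.sort(a, (Comparator) c);         // step 2: sort the array (TimSort for objects)
        ListIterator<E> it = list.listIterator();
        for (Object e : a) {                    // step 3: overwrite each slot in order
            it.next();
            it.set((E) e);
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("pear", "apple", "fig"));
        sortViaArray(words, Comparator.<String>naturalOrder());
        System.out.println(words);
    }
}
```

Writing back through a ListIterator rather than get/set by index keeps the write-back linear even for a linked list.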
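Below is the code of Arrays.sort(). If no comparator is passed, it delegates to sort(a), which uses ComparableTimSort.sort(); if a comparator is passed, it calls TimSort.sort() directly. legacyMergeSort(a, c) is the old implementation, kept only for compatibility and marked for removal.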
public static <T> void sort(T[] a, Comparator<? super T> c) {
if (c == null) {
sort(a);
} else {
if (LegacyMergeSort.userRequested)
legacyMergeSort(a, c);
else
TimSort.sort(a, 0, a.length, c, null, 0, 0);
}
}
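The null-comparator branch is easy to check from the outside. The following is a small sketch, assuming nothing beyond the public Arrays.sort(T[], Comparator) API; the class NullComparatorSketch and the helper sortedCopy are names I introduced for the example.

```java
import java.util.Arrays;
import java.util.Comparator;

class NullComparatorSketch {
    // Hypothetical helper: sorts a copy so the input array is left untouched.
    static Integer[] sortedCopy(Integer[] src, Comparator<? super Integer> c) {
        Integer[] copy = Arrays.copyOf(src, src.length);
        Arrays.sort(copy, c);   // c == null falls back to the natural ordering
        return copy;
    }

    public static void main(String[] args) {
        Integer[] data = {3, 1, 2};
        System.out.println(Arrays.toString(sortedCopy(data, null)));
        System.out.println(Arrays.toString(sortedCopy(data, Comparator.<Integer>reverseOrder())));
    }
}
```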
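Let's step straight into TimSort.sort(). The sort method in ComparableTimSort follows the same logic as TimSort; the only difference is how elements are compared.
This is the core method that drives the whole sort; all of the sorting work happens inside it.
I explain each of the methods it calls in detail below.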
static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
T[] work, int workBase, int workLen) {
assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;
int nRemaining = hi - lo;
if (nRemaining < 2)
return; // Arrays of size 0 and 1 are always sorted
// If array is small, do a "mini-TimSort" with no merges
// private static final int MIN_MERGE = 32;
// If the range to sort is shorter than this threshold, use binary insertion sort
if (nRemaining < MIN_MERGE) {
// First measure the already-sorted run at the start of the range
int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
// Then binary-insertion-sort the remaining elements into it
binarySort(a, lo, hi, lo + initRunLen, c);
return;
}
/**
* March over the array once, left to right, finding natural runs,
* extending short natural runs to minRun elements, and merging runs
* to maintain stack invariant.
*/
// Otherwise construct a TimSort instance
TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
// Compute the minimum length of a run. A run can be seen as a sorted segment of the original array; every run is ascending internally
int minRun = minRunLength(nRemaining);
do {
// Identify next run
// First measure the already-sorted run at the start of the remaining range
int runLen = countRunAndMakeAscending(a, lo, hi, c);
// If run is short, extend to min(minRun, nRemaining)
// If the natural run is shorter than the minimum run length
if (runLen < minRun) {
// Take the smaller of the remaining length and minRun
int force = nRemaining <= minRun ? nRemaining : minRun;
// Binary-insertion-sort the first force elements
binarySort(a, lo, lo + force, lo + runLen, c);
runLen = force;
}
// Push run onto pending-run stack, and maybe merge
// At this point every element of the run is in ascending order; push the run onto the stack
ts.pushRun(lo, runLen);
// Merge runs where needed to maintain the stack invariant
ts.mergeCollapse();
// Advance to find next run
// Advance lo to look for the next run
lo += runLen;
// Update the number of elements still to be sorted
nRemaining -= runLen;
} while (nRemaining != 0);
// Merge all remaining runs to complete sort
assert lo == hi;
// Force-merge all remaining runs
ts.mergeForceCollapse();
assert ts.stackSize == 1;
}
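The countRunAndMakeAscending method. This method shows how strongly TimSort believes that an array contains many segments that are already ordered, whether descending or ascending.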
private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
Comparator<? super T> c) {
// Assert that the arguments are valid
assert lo < hi;
int runHi = lo + 1;
if (runHi == hi)
return 1;
// Find end of run, and reverse range if descending
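// If the run starts out strictly descending, find the index just past its last element and reverse the run (strictness means equal elements are never swapped, so stability is preserved)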
if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
runHi++;
// Reverse the descending part that was found
reverseRange(a, lo, runHi);
} else { // Ascending
// Find the index just past the last element of the ascending run
while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
runHi++;
}
return runHi - lo;
}
/**
* A simple two-pointer method that reverses array elements in place
*/
private static void reverseRange(Object[] a, int lo, int hi) {
hi--;
while (lo < hi) {
Object t = a[lo];
a[lo++] = a[hi];
a[hi--] = t;
}
}
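To see the run detection in action, here is a sketch of the same logic on a plain int[] with the comparator replaced by < and >=. The class name RunCountSketch is mine; the method bodies follow the listing above.

```java
import java.util.Arrays;

class RunCountSketch {
    // int[] version of countRunAndMakeAscending: returns the length of the
    // run starting at lo, reversing it in place if it was strictly descending.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi)
            return 1;
        if (a[runHi++] < a[lo]) {                      // strictly descending
            while (runHi < hi && a[runHi] < a[runHi - 1])
                runHi++;
            reverseRange(a, lo, runHi);
        } else {                                       // ascending (ties allowed)
            while (runHi < hi && a[runHi] >= a[runHi - 1])
                runHi++;
        }
        return runHi - lo;
    }

    static void reverseRange(int[] a, int lo, int hi) {
        hi--;
        while (lo < hi) {
            int t = a[lo];
            a[lo++] = a[hi];
            a[hi--] = t;
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 4, 3, 1, 2};
        int len = countRunAndMakeAscending(a, 0, a.length);
        System.out.println(len + " " + Arrays.toString(a));
    }
}
```

For the input {5, 4, 3, 1, 2} the descending prefix 5, 4, 3, 1 is a run of length 4, and after the call the array holds {1, 3, 4, 5, 2}.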
The binarySort method
private static <T> void binarySort(T[] a, int lo, int hi, int start,
Comparator<? super T> c) {
assert lo <= start && start <= hi;
if (start == lo)
start++;
for ( ; start < hi; start++) {
// Inside the loop, first take the first element of the unsorted range
T pivot = a[start];
// Set left (and right) to the index where a[start] (pivot) belongs
int left = lo;
int right = start;
assert left <= right;
/*
* Invariants:
* pivot >= all in [lo, left).
* pivot < all in [right, start).
*/
// Then binary-search the already-sorted range for the index where this element can be inserted; this is why the method is called binary insertion sort
while (left < right) {
int mid = (left + right) >>> 1;
if (c.compare(pivot, a[mid]) < 0)
right = mid;
else
left = mid + 1;
}
assert left == right;
/*
* The invariants still hold: pivot >= all in [lo, left) and
* pivot < all in [left, start), so pivot belongs at left. Note
* that if there are elements equal to pivot, left points to the
* first slot after them -- that's why this sort is stable.
* Slide elements over to make room for pivot.
*/
int n = start - left; // The number of elements to move
// Switch is just an optimization for arraycopy in default case
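// Optimization for the cases where only a few elements need to move
// Note a small detail here: if n == 2, the switch below executes two statements before it reaches break;
// I looked at the bytecode the switch compiles to. Matching and execution are separate: the statements of the switch are laid out sequentially in the bytecode, and once a value matches, control jumps to the corresponding statement and keeps running downward until it hits break; (break becomes a goto instruction in the bytecode)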
switch (n) {
case 2: a[left + 2] = a[left + 1];
case 1: a[left + 1] = a[left];
break;
// Use arraycopy to shift all elements greater than pivot one slot to the right
default: System.arraycopy(a, left, a, left + 1, n);
}
a[left] = pivot;
}
}
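Stripped of generics and the small-n switch, binary insertion sort comes down to a few lines. The sketch below works on an int[] and always uses System.arraycopy for the shift; the class name BinaryInsertionSketch is mine.

```java
import java.util.Arrays;

class BinaryInsertionSketch {
    // Sorts a[lo, hi), assuming a[lo, start) is already sorted.
    static void binarySort(int[] a, int lo, int hi, int start) {
        if (start == lo)
            start++;
        for (; start < hi; start++) {
            int pivot = a[start];
            int left = lo;
            int right = start;
            // Binary search for the first slot whose element is > pivot.
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (pivot < a[mid])
                    right = mid;
                else
                    left = mid + 1;   // equal elements stay to the left: stable
            }
            // Shift a[left, start) one slot right, then drop pivot in place.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 3, 5, 2};   // the first three elements form a sorted run
        binarySort(a, 0, a.length, 3);
        System.out.println(Arrays.toString(a));
    }
}
```

The binary search cuts the number of comparisons per element to about log2 of the sorted prefix, while the number of moves is the same as in ordinary insertion sort. That trade-off suits object sorting, where a comparator call usually costs more than an array move.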
The TimSort constructor
private TimSort(T[] a, Comparator<? super T> c, T[] work, int workBase, int workLen) {
this.a = a;
this.c = c;
// Allocate temp storage (which may be increased later if necessary)
int len = a.length;
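// Allocate the temporary array used by merging. Its initial length is at most 256; if the array has fewer than 512 elements, half of len is used instead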
// private static final int INITIAL_TMP_STORAGE_LENGTH = 256;
int tlen = (len < 2 * INITIAL_TMP_STORAGE_LENGTH) ?
len >>> 1 : INITIAL_TMP_STORAGE_LENGTH;
if (work == null || workLen < tlen || workBase + tlen > work.length) {
@SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
T[] newArray = (T[])java.lang.reflect.Array.newInstance
(a.getClass().getComponentType(), tlen);
tmp = newArray;
tmpBase = 0;
tmpLen = tlen;
}
else {
tmp = work;
tmpBase = workBase;
tmpLen = workLen;
}
/*
* Allocate runs-to-be-merged stack (which cannot be expanded). The
* stack length requirements are described in listsort.txt. The C
* version always uses the same stack length (85), but this was
* measured to be too expensive when sorting "mid-sized" arrays (e.g.,
* 100 elements) in Java. Therefore, we use smaller (but sufficiently
* large) stack lengths for smaller arrays. The "magic numbers" in the
* computation below must be changed if MIN_MERGE is decreased. See
* the MIN_MERGE declaration above for more information.
* The maximum value of 49 allows for an array up to length
* Integer.MAX_VALUE-4, if array is filled by the worst case stack size
* increasing scenario. More explanations are given in section 4 of:
* http://envisage-project.eu/wp-content/uploads/2015/02/sorting.pdf
*/
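// Allocate the stack used by merging. Because run lengths decrease from the bottom of the stack to the top and adjacent runs get merged, run lengths grow at least as fast as the Fibonacci sequence, so the stack depth is bounded by roughly log base 1.618 of N. (If this is unclear, see the article I recommended at the start, or simply treat this as the stack used while sorting.)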
int stackLen = (len < 120 ? 5 :
len < 1542 ? 10 :
len < 119151 ? 24 : 49);
runBase = new int[stackLen];
runLen = new int[stackLen];
}
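The minRunLength method
// The minimum run length computed here adapts to the size of n. I do not yet fully understand why this particular algorithm is used; according to the method's Javadoc, the goal is for n / minRun to be close to, but strictly less than, an exact power of 2, and the rationale is given in listsort.txt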
private static int minRunLength(int n) {
assert n >= 0;
int r = 0; // Becomes 1 if any 1 bits are shifted off
// While n >= 32, keep halving n; if any 1 bit was shifted off along the way, return the result plus one
// private static final int MIN_MERGE = 32;
while (n >= MIN_MERGE) {
r |= (n & 1);
n >>= 1;
}
return n + r;
}
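A few concrete inputs make the bit shifting easier to follow. This sketch copies the method into a standalone class (MinRunSketch is my own name) and prints the result for some sample lengths.

```java
class MinRunSketch {
    static final int MIN_MERGE = 32;

    // Same logic as TimSort.minRunLength: keep the top bits of n and add 1
    // if any of the bits shifted off was set.
    static int minRunLength(int n) {
        int r = 0;
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        int[] samples = {31, 64, 65, 100, 1000};
        for (int n : samples)
            System.out.println(n + " -> " + minRunLength(n));
    }
}
```

For n = 64 the result is 16, so the array splits into exactly 4 runs. For n = 1000 the result is 32, which gives about 31.25 runs, just under 32. In both cases the number of runs is at or just below a power of 2, which keeps the later merges balanced.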
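The ts.pushRun method, which pushes a run onto the stack
// runBase is the index of the run's first element, and runLen is the run's length
// Although a run is defined as a slice of the original array, in practice the stack only records each run's start index and length
private void pushRun(int runBase, int runLen) {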
this.runBase[stackSize] = runBase;
this.runLen[stackSize] = runLen;
stackSize++;
}
The ts.mergeCollapse method, which merges runs
private void mergeCollapse() {
while (stackSize > 1) {
int n = stackSize - 2;
// If the third run from the top is no longer than the sum of the top two runs
if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
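// Merge the middle run with the shorter of its two neighbors; this keeps run lengths decreasing from the bottom of the stack to the top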
if (runLen[n - 1] < runLen[n + 1])
n--;
mergeAt(n);
} else if (runLen[n] <= runLen[n + 1]) {
// If the second run from the top is no longer than the top run (the one just pushed), merge the two directly
mergeAt(n);
} else {
break; // Invariant is established
}
}
}
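The stack invariant is easier to follow when only the run lengths are tracked. The sketch below is a simulation, not the real merge: mergeAt just adds two lengths together. RunStackSketch and lengths() are names I invented; the control flow is copied from the JDK 1.8 listing above.

```java
import java.util.Arrays;

class RunStackSketch {
    private final int[] runLen = new int[49];
    private int stackSize = 0;

    // Push a run length, then restore the invariant like TimSort.sort() does.
    void pushRun(int len) {
        runLen[stackSize++] = len;
        mergeCollapse();
    }

    private void mergeCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            if (n > 0 && runLen[n - 1] <= runLen[n] + runLen[n + 1]) {
                if (runLen[n - 1] < runLen[n + 1])
                    n--;
                mergeAt(n);
            } else if (runLen[n] <= runLen[n + 1]) {
                mergeAt(n);
            } else {
                break; // invariant holds
            }
        }
    }

    // Simulated merge: only the bookkeeping part of the real mergeAt.
    private void mergeAt(int i) {
        runLen[i] += runLen[i + 1];
        if (i == stackSize - 3)
            runLen[i + 1] = runLen[i + 2];
        stackSize--;
    }

    int[] lengths() {
        return Arrays.copyOf(runLen, stackSize);
    }

    public static void main(String[] args) {
        RunStackSketch s = new RunStackSketch();
        for (int len : new int[]{100, 40, 20, 25}) {
            s.pushRun(len);
            System.out.println(Arrays.toString(s.lengths()));
        }
    }
}
```

Pushing 100, 40, 20 leaves the stack untouched because each run is longer than the sum of the two above it. Pushing 25 breaks that (40 <= 20 + 25), so 20 and 25 merge into 45, then 40 and 45 merge into 85, leaving [100, 85].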
The mergeAt(i) method, which merges runs i and i+1 on the stack
private void mergeAt(int i) {
assert stackSize >= 2;
assert i >= 0;
assert i == stackSize - 2 || i == stackSize - 3;
// Fetch the two runs to merge from the stack
int base1 = runBase[i];
int len1 = runLen[i];
int base2 = runBase[i + 1];
int len2 = runLen[i + 1];
assert len1 > 0 && len2 > 0;
assert base1 + len1 == base2;
/*
* Record the length of the combined runs; if i is the 3rd-last
* run now, also slide over the last run (which isn't involved
* in this merge). The current run (i+1) goes away in any case.
*/
// Record the length of the merged run
runLen[i] = len1 + len2;
// If the runs being merged are the third and second from the top, slide the top run down one slot; the stack size is decremented either way
if (i == stackSize - 3) {
runBase[i + 1] = runBase[i + 2];
runLen[i + 1] = runLen[i + 2];
}
stackSize--;
/*
* Find where the first element of run2 goes in run1. Prior elements
* in run1 can be ignored (because they're already in place).
*/
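// Find the rightmost position in run1 where the first element of run2 could be inserted
// This prepares for narrowing down the range that actually needs merging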
int k = gallopRight(a[base2], a, base1, len1, 0, c);
assert k >= 0;
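// The first k elements of run1 need no merging: a[base2] is the smallest element of run2, and those k elements are all less than or equal to it, so they are already in their final positions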
base1 += k;
len1 -= k;
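// len1 == 0 means every element of run1 is less than or equal to the smallest element of run2; since each run is sorted, the two runs are already in order, so return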
if (len1 == 0)
return;
/*
* Find where the last element of run1 goes in run2. Subsequent elements
* in run2 can be ignored (because they're already in place).
*/
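// Likewise, find where the largest element of run1 would be inserted in run2
// The offset found becomes the new length of run2. As above, run2 is ascending, so the position of run1's largest element in run2 determines that the front part of run2 needs merging and the rest does not.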
len2 = gallopLeft(a[base1 + len1 - 1], a, base2, len2, len2 - 1, c);
assert len2 >= 0;
if (len2 == 0)
return;
// Merge remaining runs, using tmp array with min(len1, len2) elements
// Copy the shorter of the two runs into the temp array, so that as little extra space as possible is used
if (len1 <= len2)
// Merge left to right, using temp space equal to the length of run1
mergeLo(base1, len1, base2, len2);
else
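// Merge using temp space equal to the length of run2. I will not go through this method in detail here: it copies run2 into the temp array and merges right to left, starting from the last element of run2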
mergeHi(base1, len1, base2, len2);
}
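The trimming step in mergeAt is worth a worked example. In the sketch below I use two plain binary searches, upperBound and lowerBound, as stand-ins for gallopRight and gallopLeft; they return the same positions but without the galloping. All names in MergeTrimSketch are my own.

```java
import java.util.Arrays;

class MergeTrimSketch {
    // Stand-in for gallopRight: index of the first element > key in a[from, to).
    static int upperBound(int[] a, int from, int to, int key) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (key < a[mid]) hi = mid; else lo = mid + 1;
        }
        return lo;
    }

    // Stand-in for gallopLeft: index of the first element >= key in a[from, to).
    static int lowerBound(int[] a, int from, int to, int key) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (key > a[mid]) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Returns {base1, len1, len2} after trimming the parts already in place.
    static int[] trim(int[] a, int base1, int len1, int base2, int len2) {
        int k = upperBound(a, base1, base1 + len1, a[base2]) - base1;
        base1 += k;                       // skip the prefix of run1 that is <= run2[0]
        len1 -= k;
        if (len1 == 0)
            return new int[]{base1, 0, 0}; // runs already in order: nothing to merge
        // Keep only the part of run2 that is < the last element of run1.
        len2 = lowerBound(a, base2, base2 + len2, a[base1 + len1 - 1]) - base2;
        return new int[]{base1, len1, len2};
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 5, 8, 4, 6, 7, 9, 10};   // run1 = a[0..3], run2 = a[4..8]
        System.out.println(Arrays.toString(trim(a, 0, 4, 4, 5)));
    }
}
```

With run1 = {1, 2, 5, 8} and run2 = {4, 6, 7, 9, 10}, the elements 1 and 2 are already in place at the front and 9 and 10 at the back, so only {5, 8} and {4, 6, 7} go through the merge.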
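The gallopRight method
// Finds the rightmost position at which key can be inserted in the range of array a that starts at index base and has length len, starting the search from offset hint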
private static <T> int gallopRight(T key, T[] a, int base, int len,
int hint, Comparator<? super T> c) {
assert len > 0 && hint >= 0 && hint < len;
int ofs = 1;
int lastOfs = 0;
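// If key is less than the element at hint, search to the left of hint; otherwise search to the right
// hint suggests which region to search, which shortens the time needed to locate the insertion position of key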
if (c.compare(key, a[base + hint]) < 0) {
// Gallop left until a[b+hint - ofs] <= key < a[b+hint - lastOfs]
int maxOfs = hint + 1;
// Grow the offset ofs in a loop, moving left until a position with key >= a[base + hint - ofs] is found
while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) < 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
// Guard against int overflow
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;
// Make offsets relative to b
int tmp = lastOfs;
lastOfs = hint - ofs;
ofs = hint - tmp;
} else { // a[b + hint] <= key
// Gallop right until a[b+hint + lastOfs] <= key < a[b+hint + ofs]
// Same as above, except the search moves right
int maxOfs = len - hint;
while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) >= 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;
// Make offsets relative to b
lastOfs += hint;
ofs += hint;
}
assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;
/*
* Now a[b + lastOfs] <= key < a[b + ofs], so key belongs somewhere to
* the right of lastOfs but no farther right than ofs. Do a binary
* search, with invariant a[b + lastOfs - 1] <= key < a[b + ofs].
*/
// As the comment above says, the galloping phase guarantees a[b + lastOfs] <= key < a[b + ofs]; a binary search then pins down the exact index
lastOfs++;
while (lastOfs < ofs) {
int m = lastOfs + ((ofs - lastOfs) >>> 1);
if (c.compare(key, a[base + m]) < 0)
ofs = m; // key < a[b + m]
else
lastOfs = m + 1; // a[b + m] <= key
}
// So a binary search over this range yields b + ofs as the index where key can be inserted; the offset ofs is returned
assert lastOfs == ofs; // so a[b + ofs - 1] <= key < a[b + ofs]
return ofs;
}
The gallopLeft method
// The search logic is essentially the same as gallopRight, except that it finds the leftmost insertion position; this keeps the sort stable
private static <T> int gallopLeft(T key, T[] a, int base, int len, int hint,
Comparator<? super T> c) {
assert len > 0 && hint >= 0 && hint < len;
int lastOfs = 0;
int ofs = 1;
// If the element at hint is less than key, gallop right until an element greater than or equal to key is found
if (c.compare(key, a[base + hint]) > 0) {
// Gallop right until a[base+hint+lastOfs] < key <= a[base+hint+ofs]
int maxOfs = len - hint;
while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) > 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;
// Make offsets relative to base
lastOfs += hint;
ofs += hint;
} else { // key <= a[base + hint]
// Gallop left until a[base+hint-ofs] < key <= a[base+hint-lastOfs]
// Gallop left
final int maxOfs = hint + 1;
while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) <= 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;
// Make offsets relative to base
int tmp = lastOfs;
lastOfs = hint - ofs;
ofs = hint - tmp;
}
assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;
/*
* Now a[base+lastOfs] < key <= a[base+ofs], so key belongs somewhere
* to the right of lastOfs but no farther right than ofs. Do a binary
* search, with invariant a[base + lastOfs - 1] < key <= a[base + ofs].
*/
// Likewise, binary-search for the leftmost suitable insertion position, which keeps the sort stable
lastOfs++;
while (lastOfs < ofs) {
int m = lastOfs + ((ofs - lastOfs) >>> 1);
if (c.compare(key, a[base + m]) > 0)
lastOfs = m + 1; // a[base + m] < key
else
ofs = m; // key <= a[base + m]
}
assert lastOfs == ofs; // so a[base + ofs - 1] < key <= a[base + ofs]
return ofs;
}
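The two gallop methods differ only in how they treat elements equal to key. Here is a simplified sketch on an int[] that fixes hint at 0, so it only ever gallops to the right from the start of the range; the real methods also gallop left from an arbitrary hint. The class name GallopSketch is mine.

```java
class GallopSketch {
    // Rightmost insertion point: number of elements in a[base, base+len) that are <= key.
    static int gallopRight(int key, int[] a, int base, int len) {
        if (key < a[base])
            return 0;
        int lastOfs = 0, ofs = 1;
        // Probe offsets 1, 3, 7, 15, ... until a[base + ofs] > key.
        while (ofs < len && key >= a[base + ofs]) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0) ofs = len;       // int overflow
        }
        if (ofs > len) ofs = len;
        lastOfs++;
        while (lastOfs < ofs) {            // binary search inside (lastOfs, ofs]
            int m = lastOfs + ((ofs - lastOfs) >>> 1);
            if (key < a[base + m]) ofs = m; else lastOfs = m + 1;
        }
        return ofs;
    }

    // Leftmost insertion point: number of elements in a[base, base+len) that are < key.
    static int gallopLeft(int key, int[] a, int base, int len) {
        if (key <= a[base])
            return 0;
        int lastOfs = 0, ofs = 1;
        while (ofs < len && key > a[base + ofs]) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0) ofs = len;       // int overflow
        }
        if (ofs > len) ofs = len;
        lastOfs++;
        while (lastOfs < ofs) {
            int m = lastOfs + ((ofs - lastOfs) >>> 1);
            if (key > a[base + m]) lastOfs = m + 1; else ofs = m;
        }
        return ofs;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 3, 5, 7};
        System.out.println(gallopLeft(3, a, 0, a.length) + " " + gallopRight(3, a, 0, a.length));
    }
}
```

For key 3 in {1, 3, 3, 5, 7}, gallopLeft returns 1 (before the existing 3s) and gallopRight returns 3 (after them). Picking the right variant for each run is what lets equal elements keep their original relative order.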
The mergeLo method, where the actual merging finally happens
private void mergeLo(int base1, int len1, int base2, int len2) {
assert len1 > 0 && len2 > 0 && base1 + len1 == base2;
// Copy first run into temp array
T[] a = this.a; // For performance
// Make sure the temp array is at least as long as run1
T[] tmp = ensureCapacity(len1);
// Index of the next unmerged element in the temp array
int cursor1 = tmpBase; // Indexes into tmp array
// Index of the next unmerged element in run2
int cursor2 = base2; // Indexes int a
// End of the merged region so far, i.e. the next slot to fill; this works because run1 and run2 are contiguous in the array
int dest = base1; // Indexes int a
// Copy all of run1 into the temp array
System.arraycopy(a, base1, tmp, cursor1, len1);
// Move first element of second run and deal with degenerate cases
// The trimming done in mergeAt guarantees that the first element of run2 is smaller than every remaining element of run1
a[dest++] = a[cursor2++];
// If run2 is exhausted, copy tmp into the remaining slots and return
if (--len2 == 0) {
System.arraycopy(tmp, cursor1, a, dest, len1);
return;
}
// If run1 has only one element left: the trimming done in mergeAt guarantees that the last element of run1 is larger than every remaining element of run2
if (len1 == 1) {
System.arraycopy(a, cursor2, a, dest, len2);
a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
return;
}
Comparator<? super T> c = this.c; // Use local variable for performance
// private int minGallop = MIN_GALLOP;
int minGallop = this.minGallop; // " " " " "
// Finally, the real merge
outer:
while (true) {
int count1 = 0; // Number of times in a row that first run won
int count2 = 0; // Number of times in a row that second run won
/*
* Do the straightforward thing until (if ever) one run starts
* winning consistently.
*/
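// Merge element by element. count1 and count2 record how many times in a row each run has won, and minGallop starts at 7. If count1 reaches 7, seven consecutive elements came from tmp, so the following ones probably will too; the loop then leaves one-at-a-time mode and enters gallop mode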
do {
assert len1 > 1 && len2 > 0;
if (c.compare(a[cursor2], tmp[cursor1]) < 0) {
a[dest++] = a[cursor2++];
count2++;
count1 = 0;
if (--len2 == 0)
break outer;
} else {
a[dest++] = tmp[cursor1++];
count1++;
count2 = 0;
if (--len1 == 1)
break outer;
}
} while ((count1 | count2) < minGallop);
/*
* One run is winning so consistently that galloping may be a
* huge win. So try that, and continue galloping until (if ever)
* neither run appears to be winning consistently anymore.
*/
// Gallop mode: find stretches that are already in order and move them with array copies, which is faster
do {
assert len1 > 1 && len2 > 0;
// Find the rightmost insertion position of a[cursor2] (from run2) in tmp, starting at cursor1; gallopRight was explained above
count1 = gallopRight(a[cursor2], tmp, cursor1, len1, 0, c);
if (count1 != 0) {
// Copy in bulk, which is more efficient
System.arraycopy(tmp, cursor1, a, dest, count1);
dest += count1;
cursor1 += count1;
len1 -= count1;
if (len1 <= 1) // len1 == 1 || len1 == 0
break outer;
}
// a[cursor2] is now the smallest remaining element, so place it directly
a[dest++] = a[cursor2++];
if (--len2 == 0)
break outer;
// Next, find the leftmost insertion position of tmp[cursor1] in run2
count2 = gallopLeft(tmp[cursor1], a, cursor2, len2, 0, c);
// The count2 elements of run2 starting at cursor2 are all smaller than tmp[cursor1]
if (count2 != 0) {
// Again copy in bulk
System.arraycopy(a, cursor2, a, dest, count2);
dest += count2;
cursor2 += count2;
len2 -= count2;
if (len2 == 0)
break outer;
}
a[dest++] = tmp[cursor1++];
if (--len1 == 1)
break outer;
// Lower the threshold for entering gallop mode a little
minGallop--;
// Leave gallop mode once both counts fall below MIN_GALLOP = 7, the minimum gallop threshold
} while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
if (minGallop < 0)
minGallop = 0;
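// Leaving gallop mode is penalized by adding 2
// My understanding: if run1 and run2 contain many already-ordered stretches and gallop mode was left for some incidental reason, more ordered stretches are likely to follow, which is why every gallop iteration above does minGallop--. But the threshold must not get too small either, so 2 is added on exit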
minGallop += 2; // Penalize for leaving gallop mode
} // End of "outer" loop
this.minGallop = minGallop < 1 ? 1 : minGallop; // Write back to field
// The merge is essentially done here. As seen above, the loop exits when len1 == 1 or len2 == 0
if (len1 == 1) {
// As noted earlier, the last element of run1 is the largest, so it goes at the end
assert len2 > 0;
System.arraycopy(a, cursor2, a, dest, len2);
a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
} else if (len1 == 0) {
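// len1 == 0 can only arise in gallop mode, when every element of tmp turns out to be smaller than an element of run2. But mergeAt guaranteed that the last element of run1 is the largest. The two comparison results contradict each other, which means the comparator is broken, so an exception is thrown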
throw new IllegalArgumentException(
"Comparison method violates its general contract!");
} else {
// Here len1 > 1 and len2 == 0, so the elements left in tmp are all the largest; copy them directly
assert len2 == 0;
assert len1 > 1;
System.arraycopy(tmp, cursor1, a, dest, len1);
}
}
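With the gallop mode and the degenerate cases removed, what remains of mergeLo is a classic left-to-right merge that buffers only run1. The sketch below shows that core on an int[]; it is a simplification I wrote for illustration, not the JDK method, and the class name MergeLoSketch is mine.

```java
import java.util.Arrays;

class MergeLoSketch {
    // Merges the adjacent sorted runs a[base1, base1+len1) and
    // a[base2, base2+len2), where base1 + len1 == base2.
    static void mergeLo(int[] a, int base1, int len1, int base2, int len2) {
        int[] tmp = Arrays.copyOfRange(a, base1, base1 + len1); // buffer run1 only
        int cursor1 = 0;          // next unmerged element in tmp
        int cursor2 = base2;      // next unmerged element in run2
        int dest = base1;         // next slot to fill in a
        int end2 = base2 + len2;
        while (cursor1 < len1 && cursor2 < end2) {
            if (a[cursor2] < tmp[cursor1])
                a[dest++] = a[cursor2++];
            else
                a[dest++] = tmp[cursor1++];   // ties go to run1: stable
        }
        // Whatever is left of run1 goes at the end. If run1 ran out first,
        // the rest of run2 is already in its final position.
        System.arraycopy(tmp, cursor1, a, dest, len1 - cursor1);
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 8, 1, 6, 9};
        mergeLo(a, 0, 3, 3, 3);
        System.out.println(Arrays.toString(a));
    }
}
```

Writing into a from the left is safe because dest never overtakes cursor2: every slot that gets overwritten either belonged to run1, which lives in tmp, or held a run2 element that has already been consumed.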
The ensureCapacity method, which makes sure the temp array is long enough for the merge
private T[] ensureCapacity(int minCapacity) {
if (tmpLen < minCapacity) {
// Compute smallest power of 2 > minCapacity
// (strictly greater, not greater-or-equal: a minCapacity of 8 yields 16)
int newSize = minCapacity;
newSize |= newSize >> 1;
newSize |= newSize >> 2;
newSize |= newSize >> 4;
newSize |= newSize >> 8;
newSize |= newSize >> 16;
newSize++;
if (newSize < 0) // Not bloody likely!
newSize = minCapacity;
else
// The temp array never needs to be longer than half of the array
newSize = Math.min(newSize, a.length >>> 1);
@SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
T[] newArray = (T[])java.lang.reflect.Array.newInstance
(a.getClass().getComponentType(), newSize);
tmp = newArray;
tmpLen = newSize;
tmpBase = 0;
}
return tmp;
}
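The chain of shifts and ORs is a well-known bit trick: it copies the highest set bit into every lower position, so adding 1 yields the next power of two. A standalone sketch, with PowerOfTwoSketch and nextPowerOfTwoAbove as my own names:

```java
class PowerOfTwoSketch {
    // Smallest power of 2 strictly greater than n, for 0 < n < 2^30.
    // For larger n the increment overflows to a negative number, which is
    // the case the JDK code guards with its newSize < 0 check.
    static int nextPowerOfTwoAbove(int n) {
        int s = n;
        s |= s >> 1;    // the top 2 bits below the highest set bit are now set
        s |= s >> 2;    // top 4
        s |= s >> 4;    // top 8
        s |= s >> 8;    // top 16
        s |= s >> 16;   // every bit below the highest set bit is set
        return s + 1;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 5, 8, 1000})
            System.out.println(n + " -> " + nextPowerOfTwoAbove(n));
    }
}
```

Note that an input that is already a power of two, such as 8, is rounded up to the next one, 16.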
The mergeHi method. If you are interested, try working through this right-to-left merge yourself.
private void mergeHi(int base1, int len1, int base2, int len2) {
assert len1 > 0 && len2 > 0 && base1 + len1 == base2;
// Copy second run into temp array
T[] a = this.a; // For performance
T[] tmp = ensureCapacity(len2);
int tmpBase = this.tmpBase;
System.arraycopy(a, base2, tmp, tmpBase, len2);
int cursor1 = base1 + len1 - 1; // Indexes into a
int cursor2 = tmpBase + len2 - 1; // Indexes into tmp array
int dest = base2 + len2 - 1; // Indexes into a
// Move last element of first run and deal with degenerate cases
a[dest--] = a[cursor1--];
if (--len1 == 0) {
System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
return;
}
if (len2 == 1) {
dest -= len1;
cursor1 -= len1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
a[dest] = tmp[cursor2];
return;
}
Comparator<? super T> c = this.c; // Use local variable for performance
int minGallop = this.minGallop; // " " " " "
outer:
while (true) {
int count1 = 0; // Number of times in a row that first run won
int count2 = 0; // Number of times in a row that second run won
/*
* Do the straightforward thing until (if ever) one run
* appears to win consistently.
*/
do {
assert len1 > 0 && len2 > 1;
if (c.compare(tmp[cursor2], a[cursor1]) < 0) {
a[dest--] = a[cursor1--];
count1++;
count2 = 0;
if (--len1 == 0)
break outer;
} else {
a[dest--] = tmp[cursor2--];
count2++;
count1 = 0;
if (--len2 == 1)
break outer;
}
} while ((count1 | count2) < minGallop);
/*
* One run is winning so consistently that galloping may be a
* huge win. So try that, and continue galloping until (if ever)
* neither run appears to be winning consistently anymore.
*/
do {
assert len1 > 0 && len2 > 1;
count1 = len1 - gallopRight(tmp[cursor2], a, base1, len1, len1 - 1, c);
if (count1 != 0) {
dest -= count1;
cursor1 -= count1;
len1 -= count1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, count1);
if (len1 == 0)
break outer;
}
a[dest--] = tmp[cursor2--];
if (--len2 == 1)
break outer;
count2 = len2 - gallopLeft(a[cursor1], tmp, tmpBase, len2, len2 - 1, c);
if (count2 != 0) {
dest -= count2;
cursor2 -= count2;
len2 -= count2;
System.arraycopy(tmp, cursor2 + 1, a, dest + 1, count2);
if (len2 <= 1) // len2 == 1 || len2 == 0
break outer;
}
a[dest--] = a[cursor1--];
if (--len1 == 0)
break outer;
minGallop--;
} while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
if (minGallop < 0)
minGallop = 0;
minGallop += 2; // Penalize for leaving gallop mode
} // End of "outer" loop
this.minGallop = minGallop < 1 ? 1 : minGallop; // Write back to field
if (len2 == 1) {
assert len1 > 0;
dest -= len1;
cursor1 -= len1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
a[dest] = tmp[cursor2]; // Move first elt of run2 to front of merge
} else if (len2 == 0) {
throw new IllegalArgumentException(
"Comparison method violates its general contract!");
} else {
assert len1 == 0;
assert len2 > 0;
System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
}
}
The mergeForceCollapse method, which forces the remaining merges
private void mergeForceCollapse() {
while (stackSize > 1) {
int n = stackSize - 2;
if (n > 0 && runLen[n - 1] < runLen[n + 1])
n--;
mergeAt(n);
}
}
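That covers the entire TimSort sorting flow. For me, knowing only the theory made it feel mysterious; seeing the implementation made it feel admirable; and returning to the theory afterwards, everything clicked, which was a delight.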