Collections.sort has two overloads:
1. Natural ordering (default comparator)
public static <T extends Comparable<? super T>> void sort(List<T> list)
2. Custom comparator
public static <T> void sort(List<T> list, Comparator<? super T> c)
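As a quick illustration of the two overloads, here is a minimal sketch. The class name SortOverloadsDemo and its helper methods are made up for this example; only Collections.sort and Comparator.comparingInt come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class SortOverloadsDemo {
    // Overload 1: natural ordering (String implements Comparable<String>).
    static List<String> sortNatural(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Overload 2: explicit comparator, here ordering by string length.
    // The sort is stable, so equal-length strings keep their input order.
    static List<String> sortByLength(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana", "kiwi");
        System.out.println(sortNatural(words));  // [banana, fig, kiwi, pear]
        System.out.println(sortByLength(words)); // [fig, pear, kiwi, banana]
    }
}
```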
Both delegate to the same method, List.sort (the one-argument overload simply passes null as the comparator):
default void sort(Comparator<? super E> c) {
    Object[] a = this.toArray();
    Arrays.sort(a, (Comparator) c);
    ListIterator<E> i = this.listIterator();
    for (Object e : a) {
        i.next();
        i.set((E) e);
    }
}
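As we can see, the method first copies the list into an Object array, sorts that array with Arrays.sort(), and then writes the sorted elements back into the list through a ListIterator.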
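The copy, sort, write-back pattern can be reproduced outside the JDK. The sketch below is an illustration only: the class CopySortWriteBack and the method sortViaArray are hypothetical names, and the generics are tightened slightly compared with the raw types in the JDK source. It shows why even a LinkedList is sorted without slow random access: all comparisons happen on the array.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.ListIterator;

// Hypothetical helper that mirrors the structure of List.sort.
class CopySortWriteBack {
    static <E> void sortViaArray(List<E> list, Comparator<? super E> c) {
        // 1. Copy the elements into an array.
        Object[] a = list.toArray();
        // 2. Sort the array (a null comparator means natural ordering).
        @SuppressWarnings("unchecked")
        Comparator<Object> cmp = (Comparator<Object>) c;
        Arrays.sort(a, cmp);
        // 3. Write the sorted elements back in a single sequential pass.
        ListIterator<E> it = list.listIterator();
        for (Object e : a) {
            it.next();
            @SuppressWarnings("unchecked")
            E element = (E) e;
            it.set(element);
        }
    }
}
```

Calling CopySortWriteBack.sortViaArray(list, Comparator.naturalOrder()) on a LinkedList containing 3, 1, 2 leaves it as 1, 2, 3.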
Stepping into Arrays.sort():
public static <T> void sort(T[] a, Comparator<? super T> c) {
    if (c == null) {
        // No comparator given: use natural ordering
        sort(a);
    } else {
        // Legacy merge sort with the given comparator, only if explicitly requested
        if (LegacyMergeSort.userRequested)
            legacyMergeSort(a, c);
        else
            // TimSort with the given comparator
            TimSort.sort(a, 0, a.length, c, null, 0, 0);
    }
}
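As for this statement:
if (LegacyMergeSort.userRequested)
    legacyMergeSort(a, c);
it exists purely for backward compatibility. It selects the merge sort implementation that was used before TimSort was introduced in JDK 7, which combines insertion sort (for very small ranges) with merge sort. I won't go into the details because it is almost never needed; read the source comments if you are curious.
To enable it: System.setProperty("java.util.Arrays.useLegacyMergeSort", "true"); The flag is read once when the class is initialized, so it has to be set before the first sort call (or passed on the command line as -Djava.util.Arrays.useLegacyMergeSort=true).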
Now step into the default (no-comparator) sort method:
public static void sort(Object[] a) {
    if (LegacyMergeSort.userRequested)
        legacyMergeSort(a);
    else
        ComparableTimSort.sort(a, 0, a.length, null, 0, 0);
}
Stepping into ComparableTimSort.sort, we can see the key logic:
private static final int MIN_MERGE = 32;
static void sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) {
    assert a != null && lo >= 0 && lo <= hi && hi <= a.length;
    int nRemaining = hi - lo;
    // Fewer than 2 elements: nothing to sort
    if (nRemaining < 2)
        return; // Arrays of size 0 and 1 are always sorted
    // If array is small, do a "mini-TimSort" with no merges
    // Fewer than 32 elements: find the initial run, then binarySort the rest
    if (nRemaining < MIN_MERGE) {
        int initRunLen = countRunAndMakeAscending(a, lo, hi);
        binarySort(a, lo, hi, lo + initRunLen);
        return;
    }
    /**
     * March over the array once, left to right, finding natural runs,
     * extending short natural runs to minRun elements, and merging runs
     * to maintain stack invariant.
     */
    ComparableTimSort ts = new ComparableTimSort(a, work, workBase, workLen);
    // Choose the minimum run length (explained below)
    int minRun = minRunLength(nRemaining);
    // TimSort: find the next already-ordered run, extend it to minRun with
    // binarySort if it is too short, then push it onto the stack and merge
    do {
        // Identify next run
        int runLen = countRunAndMakeAscending(a, lo, hi);
        // If run is short, extend to min(minRun, nRemaining)
        if (runLen < minRun) {
            int force = nRemaining <= minRun ? nRemaining : minRun;
            binarySort(a, lo, lo + force, lo + runLen);
            runLen = force;
        }
        // Push run onto pending-run stack, and maybe merge
        ts.pushRun(lo, runLen);
        ts.mergeCollapse();
        // Advance to find next run
        lo += runLen;
        nRemaining -= runLen;
    } while (nRemaining != 0);
    // Merge all remaining runs to complete sort
    assert lo == hi;
    ts.mergeForceCollapse();
    assert ts.stackSize == 1;
}
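The sort method relies on countRunAndMakeAscending, which is not reproduced in this article. The following is a simplified sketch of what it does, written for int[] instead of Comparable objects; the class name RunDetectionSketch is made up, and this is an approximation for illustration rather than the JDK code itself. It returns the length of the run starting at lo, reversing the run in place if it is strictly descending. Descending runs must be strict so that the reversal never reorders equal elements, which keeps the sort stable.

```java
// Simplified illustration on int[]; the JDK version works on Object[] / Comparable.
class RunDetectionSketch {
    // Returns the length of the run beginning at lo within [lo, hi).
    // A strictly descending run is reversed so the result is always ascending.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi)
            return 1; // a single element is a run of length 1
        if (a[runHi++] < a[lo]) {
            // Strictly descending run
            while (runHi < hi && a[runHi] < a[runHi - 1])
                runHi++;
            reverseRange(a, lo, runHi);
        } else {
            // Non-descending (ascending, equal elements allowed) run
            while (runHi < hi && a[runHi] >= a[runHi - 1])
                runHi++;
        }
        return runHi - lo;
    }

    // Reverses a[lo, hi) in place.
    static void reverseRange(int[] a, int lo, int hi) {
        hi--;
        while (lo < hi) {
            int t = a[lo];
            a[lo++] = a[hi];
            a[hi--] = t;
        }
    }
}
```

For the input {5, 4, 3, 7, 1} the method returns 3 and leaves the array as {3, 4, 5, 7, 1}; the caller then passes lo + 3 as the start argument of binarySort.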
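binarySort():
Here is the idea of the algorithm first:
1. The caller has already identified a sorted prefix [lo, start) at the beginning of the range (an ascending run, or a descending run that has been reversed).
2. For each remaining element in [start, hi), use binary search to find where it belongs in the sorted prefix, then insert it there.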
private static void binarySort(Object[] a, int lo, int hi, int start) {
    assert lo <= start && start <= hi;
    if (start == lo)
        start++;
    for ( ; start < hi; start++) {
        Comparable pivot = (Comparable) a[start];
        // Set left (and right) to the index where a[start] (pivot) belongs
        int left = lo;
        int right = start;
        assert left <= right;
        /*
         * Invariants:
         *   pivot >= all in [lo, left).
         *   pivot <  all in [right, start).
         */
        while (left < right) {
            int mid = (left + right) >>> 1;
            if (pivot.compareTo(a[mid]) < 0)
                right = mid;
            else
                left = mid + 1;
        }
        assert left == right;
        /*
         * The invariants still hold: pivot >= all in [lo, left) and
         * pivot < all in [left, start), so pivot belongs at left. Note
         * that if there are elements equal to pivot, left points to the
         * first slot after them -- that's why this sort is stable.
         * Slide elements over to make room for pivot.
         */
        int n = start - left; // The number of elements to move
        // Switch is just an optimization for arraycopy in default case
        switch (n) {
            case 2: a[left + 2] = a[left + 1];
            case 1: a[left + 1] = a[left];
                break;
            default: System.arraycopy(a, left, a, left + 1, n);
        }
        a[left] = pivot;
    }
}
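To watch the two steps in action, here is a small int[] version of the same binary insertion sort. It is a sketch for experimentation: the class name BinaryInsertionSketch is made up, and the switch optimization from the JDK code is replaced by a plain System.arraycopy call.

```java
// Simplified illustration on int[]; same search and shift logic as binarySort above.
class BinaryInsertionSketch {
    // Sorts a[lo, hi), assuming a[lo, start) is already sorted ascending.
    static void binarySort(int[] a, int lo, int hi, int start) {
        if (start == lo)
            start++;
        for (; start < hi; start++) {
            int pivot = a[start];
            // Binary search for the first slot whose element is greater than pivot.
            int left = lo;
            int right = start;
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (pivot < a[mid])
                    right = mid;
                else
                    left = mid + 1;
            }
            // Shift a[left, start) one slot to the right, then drop pivot in.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }
}
```

With a = {1, 4, 9, 3, 7, 2} and the call binarySort(a, 0, 6, 3), the prefix {1, 4, 9} is taken as sorted, then 3, 7 and 2 are inserted one at a time, giving {1, 2, 3, 4, 7, 9}. The binary search saves comparisons, but each insertion still shifts elements, which is why this method is only used on short ranges.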
minRunLength():
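If n < MIN_MERGE, return n (the array is too small to bother with anything fancy). Otherwise, if n is an exact power of 2, return MIN_MERGE/2. Otherwise return an int k with MIN_MERGE/2 <= k <= MIN_MERGE such that n/k is close to, but strictly less than, an exact power of 2.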
/**
* Returns the minimum acceptable run length for an array of the specified
* length. Natural runs shorter than this will be extended with
* {@link #binarySort}.
*
* Roughly speaking, the computation is:
*
* If n < MIN_MERGE, return n (it's too small to bother with fancy stuff).
* Else if n is an exact power of 2, return MIN_MERGE/2.
* Else return an int k, MIN_MERGE/2 <= k <= MIN_MERGE, such that n/k
* is close to, but strictly less than, an exact power of 2.
*
* For the rationale, see listsort.txt.
*
* @param n the length of the array to be sorted
* @return the length of the minimum run to be merged
*/
private static int minRunLength(int n) {
    assert n >= 0;
    int r = 0; // Becomes 1 if any 1 bits are shifted off
    while (n >= MIN_MERGE) {
        r |= (n & 1);
        n >>= 1;
    }
    return n + r;
}
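A few concrete values make the rule easier to follow. The snippet below copies the loop into a standalone class (MinRunLengthDemo is a made-up name) so it can be run directly; the values in the comments were worked out by hand from the loop.

```java
// Standalone copy of the minRunLength loop for experimentation.
class MinRunLengthDemo {
    static final int MIN_MERGE = 32;

    static int minRunLength(int n) {
        int r = 0; // becomes 1 if any 1 bit is shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        int[] sizes = {31, 32, 64, 65, 100, 1000};
        for (int n : sizes) {
            System.out.println(n + " -> " + minRunLength(n));
        }
        // 31   -> 31  (below MIN_MERGE, returned unchanged)
        // 32   -> 16  (exact power of 2: MIN_MERGE/2)
        // 64   -> 16
        // 65   -> 17  (a 1 bit was shifted off, so r = 1)
        // 100  -> 25
        // 1000 -> 32  (1000 / 32 = 31.25, just under 32)
    }
}
```

In other words, for n >= MIN_MERGE the method keeps the top five bits of n and adds 1 if any of the discarded lower bits was set.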
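Summary
Inputs with fewer than 32 elements are sorted with binarySort (binary search + insertion). Larger inputs are sorted with TimSort (merging of runs, with binarySort used to extend short runs). If the compatibility flag is set, the legacy merge sort is used instead.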