Repost: Java sort source code analysis (TimSort)

Original link: https://blog.csdn.net/tomcosin/article/details/83243455

 


Author: TomCosin, 2018-10-25 15:08:35


Entry point (the default List.sort method):

    default void sort(Comparator<? super E> c) {
        Object[] a = this.toArray();
        Arrays.sort(a, (Comparator) c);
        ListIterator<E> i = this.listIterator();
        for (Object e : a) {
            i.next();
            i.set((E) e);
        }
    }

The Java sort method calls Arrays.sort with two arguments: the array of data and the Comparator object.

    public static <T> void sort(T[] a, Comparator<? super T> c) {
        if (c == null) {
            sort(a);
        } else {
            if (LegacyMergeSort.userRequested)
                legacyMergeSort(a, c);
            else
                TimSort.sort(a, 0, a.length, c, null, 0, 0);
        }
    }

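The sort method chooses between two algorithms: the legacy merge sort and TimSort.

LegacyMergeSort.userRequested is true when the user has asked for the legacy merge sort from the JDK 5 era, through the system property java.util.Arrays.useLegacyMergeSort.

TimSort is an improved, stable merge sort that exploits stretches of the input that are already ordered. Input that is already sorted, or sorted in strictly descending order, needs only about O(n) comparisons, and input that alternates between ascending and descending stretches is also handled well. The worst case remains O(n log n). (Adapted from Baidu Baike.)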
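As a small usage sketch of this entry path (the class name and the sample strings are made up for illustration), sorting a list with a Comparator ends up in Arrays.sort with that comparator. Because the sort is stable, elements that compare as equal keep their input order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ComparatorSortDemo {
    // Sorts by string length only. Strings of equal length keep their
    // original relative order because the object sort is stable.
    static List<String> sortByLength(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pear", "fig", "kiwi", "apple", "plum");
        // prints [fig, pear, kiwi, plum, apple]
        System.out.println(sortByLength(names));
    }
}
```

The three four-letter names come out in the same order they went in, which is what stability means here.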

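This article focuses on TimSort.

    TimSort.sort(a, 0, a.length, c, null, 0, 0);

    static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
                         T[] work, int workBase, int workLen)

The call takes quite a few parameters: a is the array to sort; lo is the index of the first element to sort (inclusive); hi is the index just past the last element to sort (exclusive); c is the comparator; work is an optional workspace array; workBase is the offset at which the usable space in work begins; workLen is the usable size of the workspace.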

 
    static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
                         T[] work, int workBase, int workLen) {
        // Assert the preconditions on the arguments
        assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;

        // A range of length 0 or 1 is always sorted already
        int nRemaining = hi - lo;
        if (nRemaining < 2)
            return;  // Arrays of size 0 and 1 are always sorted

        // If the range is shorter than MIN_MERGE (32), use binary insertion sort
        // If array is small, do a "mini-TimSort" with no merges
        if (nRemaining < MIN_MERGE) {
            int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
            binarySort(a, lo, hi, lo + initRunLen, c);
            return;
        }
        // ... the general case continues further down

Binary insertion sort for fewer than 32 elements

1. Starting at the beginning of the range, find the longest run that is already in order. If the run is ascending, extend it until an element breaks the order; if it is strictly descending, extend it the same way and then reverse it in place.

    private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
                                                    Comparator<? super T> c) {
        assert lo < hi;
        int runHi = lo + 1;
        if (runHi == hi)
            return 1;
        // Look for the ordered run at the start of the range
        // Find end of run, and reverse range if descending
        if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
            while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
                runHi++;
            reverseRange(a, lo, runHi);
        } else {                              // Ascending
            while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
                runHi++;
        }
        // Return the length of the ordered run
        return runHi - lo;
    }
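To make the run search concrete, here is a minimal sketch of the same logic on a plain int[] with no comparator (the class name and the sample data are made up for illustration). Only a strictly descending run is reversed; a run with equal neighbours counts as ascending, so reversing never swaps equal elements and stability is preserved.

```java
import java.util.Arrays;

class RunDemo {
    // Simplified int[] version: returns the length of the run starting at lo,
    // reversing it in place first if it is strictly descending.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi) return 1;
        if (a[runHi++] < a[lo]) {                  // strictly descending
            while (runHi < hi && a[runHi] < a[runHi - 1]) runHi++;
            for (int i = lo, j = runHi - 1; i < j; i++, j--) { // reverse the run
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        } else {                                    // ascending, equal neighbours allowed
            while (runHi < hi && a[runHi] >= a[runHi - 1]) runHi++;
        }
        return runHi - lo;
    }

    public static void main(String[] args) {
        int[] a = {9, 7, 4, 4, 8, 1};
        int len = countRunAndMakeAscending(a, 0, a.length);
        // The descending run 9, 7, 4 stops at the second 4 (not strictly smaller).
        System.out.println(len + " " + Arrays.toString(a)); // 3 [4, 7, 9, 4, 8, 1]
    }
}
```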

2. Sort with binary insertion sort: a binary search finds the insertion point for each remaining element

    private static <T> void binarySort(T[] a, int lo, int hi, int start,
                                       Comparator<? super T> c) {
        assert lo <= start && start <= hi;
        if (start == lo)
            start++;
        for ( ; start < hi; start++) {
            T pivot = a[start];

            // Set left (and right) to the index where a[start] (pivot) belongs
            int left = lo;
            int right = start;
            assert left <= right;
            // Binary search for the index at which pivot must be inserted
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (c.compare(pivot, a[mid]) < 0)
                    right = mid;
                else
                    left = mid + 1;
            }
            assert left == right;
            // Insert pivot (moving only 1 or 2 elements is special-cased)
            int n = start - left;  // The number of elements to move
            // Switch is just an optimization for arraycopy in default case
            switch (n) {
                case 2:  a[left + 2] = a[left + 1];
                case 1:  a[left + 1] = a[left];
                         break;
                default: System.arraycopy(a, left, a, left + 1, n);
            }
            a[left] = pivot;
        }
    }
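The same idea can be tried out on a plain int[]. The sketch below is a simplified stand-in for the listing above, not the JDK code itself: it drops the comparator and the switch optimization, and the class name and data are illustrative only.

```java
import java.util.Arrays;

class BinaryInsertionDemo {
    // Sorts a[lo, hi), assuming a[lo, start) is already sorted.
    static void binarySort(int[] a, int lo, int hi, int start) {
        if (start == lo) start++;
        for (; start < hi; start++) {
            int pivot = a[start];
            int left = lo, right = start;
            while (left < right) {                // binary search for the insertion point
                int mid = (left + right) >>> 1;
                if (pivot < a[mid]) right = mid;
                else left = mid + 1;              // equal keys are placed after existing ones
            }
            // Shift a[left, start) one slot to the right, then drop pivot in.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 8, 3, 1, 7};   // the first three elements form a sorted run
        binarySort(a, 0, a.length, 3);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 5, 7, 8]
    }
}
```

Passing the run length as start is what lets the sort skip the prefix that countRunAndMakeAscending has already put in order.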

This amounts to a TimSort that never splits the input into multiple runs, so no merging is needed.

TimSort for 32 or more elements

1. Compute the minimum run length

        /**
         * March over the array once, left to right, finding natural runs,
         * extending short natural runs to minRun elements, and merging runs
         * to maintain stack invariant.
         */
        TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
        int minRun = minRunLength(nRemaining);
        do {
            // Identify next run
            int runLen = countRunAndMakeAscending(a, lo, hi, c);

            // If run is short, extend to min(minRun, nRemaining)
            if (runLen < minRun) {
                int force = nRemaining <= minRun ? nRemaining : minRun;
                binarySort(a, lo, lo + force, lo + runLen, c);
                runLen = force;
            }

            // Push run onto pending-run stack, and maybe merge
            ts.pushRun(lo, runLen);
            ts.mergeCollapse();

            // Advance to find next run
            lo += runLen;
            nRemaining -= runLen;
        } while (nRemaining != 0);

        // Merge all remaining runs to complete sort
        assert lo == hi;
        ts.mergeForceCollapse();
        assert ts.stackSize == 1;
    }

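Computing minRun: while n >= 32, halve n (shift it right by one bit) until it drops below 32. If n is a power of 2 the result is 16; otherwise the result is the top five bits of n, plus 1 if any of the bits shifted off was a 1 (this is what r records).

    private static int minRunLength(int n) {
        assert n >= 0;
        int r = 0;      // Becomes 1 if any 1 bits are shifted off
        while (n >= MIN_MERGE) {
            // n & 1 is 1 when n is odd and 0 when n is even
            r |= (n & 1);
            // Shift right by one, i.e. divide by 2
            n >>= 1;
        }
        return n + r;
    }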
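A few worked values make the rule easier to check. The sketch below copies the method into a standalone class (the class name and the chosen inputs are illustrative) and prints the minimum run length for several array sizes:

```java
class MinRunDemo {
    static final int MIN_MERGE = 32;

    static int minRunLength(int n) {
        int r = 0;                  // becomes 1 if any 1 bit is shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        // 31   -> 31  (below MIN_MERGE, returned unchanged)
        // 64   -> 16  (power of 2: 1000000b -> 10000b, nothing lost)
        // 65   -> 17  (1000001b -> 10000b, a 1 bit was shifted off, so +1)
        // 100  -> 25  (1100100b -> 11001b, only 0 bits shifted off)
        // 1000 -> 32  (1111101000b -> 11111b = 31, a 1 bit was shifted off, so +1)
        for (int n : new int[] {31, 64, 65, 100, 1000}) {
            System.out.println(n + " -> " + minRunLength(n));
        }
    }
}
```

The effect is that n divided by minRun is equal to, or slightly less than, a power of 2, which keeps the later merges balanced.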

2. The do-while loop

2.1 Find the length of the next natural run (reversing it first if it is descending); this method was covered above

    // Identify next run
    int runLen = countRunAndMakeAscending(a, lo, hi, c);

2.2 If the run is shorter than the minimum run length, use binary insertion sort to extend it into a sorted run of the minimum length (or of all remaining elements, if fewer are left)

    // If run is short, extend to min(minRun, nRemaining)
    if (runLen < minRun) {
        int force = nRemaining <= minRun ? nRemaining : minRun;
        binarySort(a, lo, lo + force, lo + runLen, c);
        runLen = force;
    }

2.3 Push the run's start index and length onto the stack

    private void pushRun(int runBase, int runLen) {
        this.runBase[stackSize] = runBase;
        this.runLen[stackSize] = runLen;
        stackSize++;
    }

2.4 Merge the sorted runs already on the stack

    private void mergeCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            // The third run from the top is no longer than the top two combined
            if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
                // If it is also shorter than the top run,
                if (runLen[n - 1] < runLen[n + 1])
                    // move the merge point down by one and merge the lower pair
                    n--;
                mergeAt(n);
            } else if (runLen[n] <= runLen[n + 1]) {
                mergeAt(n);
            } else {
                break; // Invariant is established
            }
        }
    }
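The loop stops once the top of the stack satisfies two conditions: each run is longer than the run above it, and longer than the two runs above it combined. A minimal sketch that tracks only the run lengths shows how the stack evolves under the rules of the listing above. The class and method names are made up, and "merging" two runs here just adds their lengths.

```java
import java.util.ArrayList;
import java.util.List;

class MergeCollapseDemo {
    // Pushes each run length, then collapses the stack with the same tests
    // as the mergeCollapse listing. Returns the final stack, bottom first.
    static List<Integer> simulate(int[] runLengths) {
        List<Integer> stack = new ArrayList<>();
        for (int len : runLengths) {
            stack.add(len);
            while (stack.size() > 1) {
                int n = stack.size() - 2;
                if (n > 0 && stack.get(n - 1) <= stack.get(n) + stack.get(n + 1)) {
                    if (stack.get(n - 1) < stack.get(n + 1)) n--;
                    merge(stack, n);
                } else if (stack.get(n) <= stack.get(n + 1)) {
                    merge(stack, n);
                } else {
                    break; // both conditions hold at the top of the stack
                }
            }
        }
        return stack;
    }

    // Replaces runs n and n+1 with one run of their combined length.
    private static void merge(List<Integer> stack, int n) {
        stack.set(n, stack.get(n) + stack.get(n + 1));
        stack.remove(n + 1);
    }

    public static void main(String[] args) {
        // Pushing 10 gives [30, 20, 10]; 30 <= 20 + 10, so 20 and 10 merge,
        // then 30 <= 30 merges again into [60]; finally 15 is pushed.
        System.out.println(simulate(new int[] {30, 20, 10, 15})); // [60, 15]
        // Here 40 > 20 + 10 and 20 > 10, so nothing is merged yet.
        System.out.println(simulate(new int[] {40, 20, 10}));     // [40, 20, 10]
    }
}
```

Runs left on the stack at the end are merged by mergeForceCollapse, as the main loop shows.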

The merge operation first finds where each run's boundary element falls within the other run

    private void mergeAt(int i) {
        assert stackSize >= 2;
        assert i >= 0;
        assert i == stackSize - 2 || i == stackSize - 3;
        // Load the base index and length of the two runs
        int base1 = runBase[i];
        int len1 = runLen[i];
        int base2 = runBase[i + 1];
        int len2 = runLen[i + 1];
        assert len1 > 0 && len2 > 0;
        assert base1 + len1 == base2;

        /*
         * Record the length of the combined run; if i is the third run
         * from the top, also slide the top run down one slot
         */
        runLen[i] = len1 + len2;
        if (i == stackSize - 3) {
            runBase[i + 1] = runBase[i + 2];
            runLen[i + 1] = runLen[i + 2];
        }
        stackSize--;

        /*
         * Find where the first element of run2 belongs in run1; the elements
         * of run1 before that point are already in place and can be skipped
         */
        int k = gallopRight(a[base2], a, base1, len1, 0, c);
        assert k >= 0;
        base1 += k;
        len1 -= k;
        if (len1 == 0)
            return;

        /*
         * Find where the last element of run1 belongs in run2; the elements
         * of run2 from that point on are already in place and can be skipped
         */
        len2 = gallopLeft(a[base1 + len1 - 1], a, base2, len2, len2 - 1, c);
        assert len2 >= 0;
        if (len2 == 0)
            return;

        // Merge what is left
        // Merge remaining runs, using tmp array with min(len1, len2) elements
        if (len1 <= len2)
            mergeLo(base1, len1, base2, len2);
        else
            mergeHi(base1, len1, base2, len2);
    }

Once both positions are known, only the section between them needs to be merged.
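The trimming step can be sketched on a plain int[]. This is an illustration under stated assumptions rather than the JDK code: ordinary binary searches (the hypothetical helpers upperBound and lowerBound) stand in for gallopRight and gallopLeft, and the comparator is dropped.

```java
class MergeTrimDemo {
    // Number of elements in the sorted range a[base, base+len) that are <= key.
    static int upperBound(int[] a, int base, int len, int key) {
        int lo = 0, hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (key < a[base + mid]) hi = mid; else lo = mid + 1;
        }
        return lo;
    }

    // Number of elements in the sorted range a[base, base+len) that are < key.
    static int lowerBound(int[] a, int base, int len, int key) {
        int lo = 0, hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[base + mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Returns {base1, len1, len2}: what is left to merge after trimming two
    // adjacent sorted runs a[base1, base1+len1) and a[base2, base2+len2).
    static int[] trim(int[] a, int base1, int len1, int base2, int len2) {
        int k = upperBound(a, base1, len1, a[base2]); // run1 prefix already in place
        base1 += k;
        len1 -= k;
        if (len1 == 0) return new int[] {base1, 0, len2};
        // run2 elements >= the last element of run1 are already in place
        len2 = lowerBound(a, base2, len2, a[base1 + len1 - 1]);
        return new int[] {base1, len1, len2};
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 6, 8, 9, /* run2: */ 5, 7, 10, 12};
        int[] t = trim(a, 0, 5, 5, 4);
        // Only 6, 8, 9 and 5, 7 still need merging: base1=2, len1=3, len2=2
        System.out.println(t[0] + " " + t[1] + " " + t[2]); // 2 3 2
    }
}
```

In the example, 1 and 2 stay where they are because they are no larger than 5, and 10 and 12 stay because they are no smaller than 9.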

 

Code of the merge method (mergeLo; mergeHi is its mirror image for the case len1 > len2)

    private void mergeLo(int base1, int len1, int base2, int len2) {
        assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

        // Copy first run into temp array
        T[] a = this.a; // For performance
        T[] tmp = ensureCapacity(len1);
        int cursor1 = tmpBase; // Indexes into tmp array
        int cursor2 = base2;   // Indexes into a
        int dest = base1;      // Indexes into a
        System.arraycopy(a, base1, tmp, cursor1, len1);

        // Move first element of second run and deal with degenerate cases
        a[dest++] = a[cursor2++];
        if (--len2 == 0) {
            System.arraycopy(tmp, cursor1, a, dest, len1);
            return;
        }
        if (len1 == 1) {
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
            return;
        }

        Comparator<? super T> c = this.c;  // Use local variable for performance
        int minGallop = this.minGallop;    //  "    "       "     "      "
    outer:
        while (true) {
            int count1 = 0; // Number of times in a row that first run won
            int count2 = 0; // Number of times in a row that second run won

            /*
             * Do the straightforward thing until (if ever) one run starts
             * winning consistently.
             */
            do {
                assert len1 > 1 && len2 > 0;
                if (c.compare(a[cursor2], tmp[cursor1]) < 0) {
                    a[dest++] = a[cursor2++];
                    count2++;
                    count1 = 0;
                    if (--len2 == 0)
                        break outer;
                } else {
                    a[dest++] = tmp[cursor1++];
                    count1++;
                    count2 = 0;
                    if (--len1 == 1)
                        break outer;
                }
            } while ((count1 | count2) < minGallop);

            /*
             * One run is winning so consistently that galloping may be a
             * huge win. So try that, and continue galloping until (if ever)
             * neither run appears to be winning consistently anymore.
             */
            do {
                assert len1 > 1 && len2 > 0;
                count1 = gallopRight(a[cursor2], tmp, cursor1, len1, 0, c);
                if (count1 != 0) {
                    System.arraycopy(tmp, cursor1, a, dest, count1);
                    dest += count1;
                    cursor1 += count1;
                    len1 -= count1;
                    if (len1 <= 1) // len1 == 1 || len1 == 0
                        break outer;
                }
                a[dest++] = a[cursor2++];
                if (--len2 == 0)
                    break outer;

                count2 = gallopLeft(tmp[cursor1], a, cursor2, len2, 0, c);
                if (count2 != 0) {
                    System.arraycopy(a, cursor2, a, dest, count2);
                    dest += count2;
                    cursor2 += count2;
                    len2 -= count2;
                    if (len2 == 0)
                        break outer;
                }
                a[dest++] = tmp[cursor1++];
                if (--len1 == 1)
                    break outer;
                minGallop--;
            } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
            if (minGallop < 0)
                minGallop = 0;
            minGallop += 2;  // Penalize for leaving gallop mode
        }  // End of "outer" loop
        this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

        if (len1 == 1) {
            assert len2 > 0;
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; //  Last elt of run 1 to end of merge
        } else if (len1 == 0) {
            throw new IllegalArgumentException(
                "Comparison method violates its general contract!");
        } else {
            assert len2 == 0;
            assert len1 > 1;
            System.arraycopy(tmp, cursor1, a, dest, len1);
        }
    }

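The merge in this code proceeds in these steps:

2.4.1 Allocate a temporary array and copy the first run into it for merging

2.4.2 Count consecutive wins (count1, count2); once one run has won minGallop times in a row, switch to galloping mode and copy whole chunks at a time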
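Galloping is an exponential search followed by a binary search. The sketch below is a simplified stand-in for gallopRight, assuming a plain int[] and a hint fixed at 0; the real method also takes a hint position and a comparator. It returns how many leading elements of a sorted range are less than or equal to the key, which is the size of the chunk that can be copied in one go.

```java
class GallopDemo {
    // Returns the number of leading elements of the sorted range
    // a[base, base+len) that are <= key.
    static int gallopRight(int key, int[] a, int base, int len) {
        if (key < a[base]) return 0;
        // Probe offsets 1, 3, 7, 15, ... until an element greater than key
        // is found or the end of the range is passed.
        int lastOfs = 0, ofs = 1;
        while (ofs < len && a[base + ofs] <= key) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0) ofs = len;   // guard against int overflow
        }
        if (ofs > len) ofs = len;
        // Now a[base + lastOfs] <= key, and a[base + ofs] > key when ofs < len.
        // Binary search the gap between the last two probes.
        int lo = lastOfs + 1, hi = ofs;
        while (lo < hi) {
            int mid = lo + ((hi - lo) >>> 1);
            if (key < a[base + mid]) hi = mid; else lo = mid + 1;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] run = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
        // Probes index 1, 3, 7; then searches between 3 and 7.
        System.out.println(gallopRight(12, run, 0, run.length)); // 6
    }
}
```

When one run keeps winning, this finds a long chunk in about log(chunk length) comparisons instead of one comparison per element.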

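Note: when len1 == 0 the code throws the exception "Comparison method violates its general contract!". Before mergeLo is called, mergeAt has already trimmed both runs: the elements of run1 that are no larger than the first element of run2 are excluded, and run2 is cut back so that every remaining element of run2 is smaller than the last element of run1. The last element of run1 must therefore be the last element of the merged result, and run1 can never be used up while run2 still has elements left. If that happens anyway, the comparator has given inconsistent answers. So take care to cover every case when you write a compare method.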
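One common way to break the contract is to compare integers by subtraction, which overflows when the operands are far apart. Whether TimSort actually throws for a broken comparator depends on the data, so the sketch below does not try to provoke the exception; it only checks transitivity directly on one triple. The class name, the helper transitiveOn, and the sample values are made up for illustration.

```java
import java.util.Comparator;

class ComparatorContractDemo {
    // Broken: x - y overflows for operands far apart, which flips the sign.
    static final Comparator<Integer> BROKEN = (x, y) -> x - y;
    // Safe: Integer.compare never overflows.
    static final Comparator<Integer> SAFE = Integer::compare;

    // Returns false if the triple contradicts transitivity:
    // x > y and y > z must imply x > z.
    static boolean transitiveOn(Comparator<Integer> c, int x, int y, int z) {
        if (c.compare(x, y) > 0 && c.compare(y, z) > 0) {
            return c.compare(x, z) > 0;
        }
        return true;
    }

    public static void main(String[] args) {
        int x = Integer.MIN_VALUE, y = 1, z = 0;
        // BROKEN claims MIN_VALUE > 1 (overflow) and 1 > 0, yet MIN_VALUE < 0.
        System.out.println(transitiveOn(BROKEN, x, y, z)); // false
        System.out.println(transitiveOn(SAFE, x, y, z));   // true
    }
}
```

Using Integer.compare, or Comparator.comparingInt for a key, avoids this particular pitfall.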

That concludes this brief, introductory reading of TimSort in Java.
