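Selection Sort
Selection sort is arguably the simplest sorting algorithm. To sort a sequence of n elements, it repeatedly finds the maximum (or minimum) of the unsorted part, and this search is repeated n-1 times. Each value found is swapped into its proper position, so that the whole sequence ends up sorted.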
The selection is repeated n-1 times and each selection takes $O(n)$ time, so the time complexity of selection sort is $O(n^2)$.
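As a quick sanity check on the $O(n^2)$ claim, the sketch below counts the key comparisons made while sorting. It is only an illustration: `SelectionSortCount` and `sortAndCount` are hypothetical names, and it works on a plain `int[]` rather than on the `DataArray` class used in this post.

```java
import java.util.Arrays;

final class SelectionSortCount {
    /** Sorts the array in ascending order in place and returns the number of key comparisons. */
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i; // index of the smallest key seen so far in a[i..]
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            // Swap the selected element into position i.
            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] keys = { 5, 3, 6, 10, 7, 1, 9 };
        long count = sortAndCount(keys);
        System.out.println(Arrays.toString(keys) + " comparisons=" + count);
    }
}
```

For n elements the inner loop runs (n-1) + (n-2) + ... + 1 = n(n-1)/2 times regardless of the input order, which is 21 comparisons for the 7 keys above.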
Code:
    /**
     *********************
     * Selection sort. All data are valid.
     *********************
     */
    public void selectionSort() {
        DataNode tempNode;
        int tempIndexForSmallest;
        for (int i = 0; i < length - 1; i++) {
            // Initialize: assume the current element is the smallest.
            tempNode = data[i];
            tempIndexForSmallest = i;
            for (int j = i + 1; j < length; j++) {
                if (data[j].key < tempNode.key) {
                    tempNode = data[j];
                    tempIndexForSmallest = j;
                } // Of if
            } // Of for j
            // Swap the selected element with the current one.
            data[tempIndexForSmallest] = data[i];
            data[i] = tempNode;
        } // Of for i
    }// Of selectionSort
    /**
     *********************
     * Test the method.
     *********************
     */
    public static void selectionSortTest() {
        int[] tempUnsortedKeys = { 5, 3, 6, 10, 7, 1, 9 };
        String[] tempContents = { "if", "then", "else", "switch", "case", "for", "while" };
        DataArray tempDataArray = new DataArray(tempUnsortedKeys, tempContents);
        System.out.println(tempDataArray);
        tempDataArray.selectionSort();
        System.out.println("Result\r\n" + tempDataArray);
    }// Of selectionSortTest
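Output: after sorting, the array is ordered by key: (1,for) (3,then) (5,if) (6,else) (7,case) (9,while) (10,switch).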
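Heap Sort
Heap sort stores a complete binary tree in an array. With array indices we can easily locate the parent or the children of any node: for the node at index i (counting from 0), the children are at 2i+1 and 2i+2, and the parent is at (i-1)/2 rounded down.
Max-heap: the value of every parent node is greater than or equal to the values of its children;
Min-heap: the value of every parent node is less than or equal to the values of its children.
Here we discuss the max-heap.
It is easy to see that the root of a max-heap holds the maximum value of the current sequence.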
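The index arithmetic and the max-heap property can be sketched as follows. This is a minimal illustration on a plain `int[]`; the class `HeapIndexDemo` and its methods are hypothetical names, not part of the `DataArray` class in this post. The sample arrays reuse the test keys of this post, before and after heap construction.

```java
final class HeapIndexDemo {
    /** Index of the parent of node i (valid for i >= 1). */
    static int parent(int i) {
        return (i - 1) / 2;
    }

    /** Index of the left child of node i. */
    static int left(int i) {
        return 2 * i + 1;
    }

    /** Index of the right child of node i. */
    static int right(int i) {
        return 2 * i + 2;
    }

    /** Returns true if every node is no larger than its parent. */
    static boolean isMaxHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[parent(i)] < a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] unsorted = { 5, 3, 6, 10, 7, 1, 9 };
        int[] heap = { 10, 7, 9, 3, 5, 1, 6 };
        System.out.println(isMaxHeap(unsorted)); // false: 6 at index 2 exceeds its parent 5
        System.out.println(isMaxHeap(heap));     // true: the root 10 is the maximum
    }
}
```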
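In fact, heap sort and selection sort are essentially the same: both repeatedly select the extreme value among the elements not yet selected and place it at the corresponding position of the target sequence.
Building the heap: starting from the last non-leaf node and moving back toward the root, each node is sifted down until the subtree rooted at it is a max-heap.
Once the max-heap is built, take out the root element, fill its position with another element (the code below swaps the root with the last element of the heap), re-adjust the remaining elements into a max-heap, and repeat until all elements have been selected.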
Code:
    /**
     *********************
     * Adjust the heap by sifting one node down.
     *
     * @param paraStart  The index of the node to sift down.
     * @param paraLength The length of the adjusted sequence.
     *********************
     */
    public void adjustHeap(int paraStart, int paraLength) {
        DataNode tempNode = data[paraStart];
        int tempParent = paraStart;
        int tempKey = data[paraStart].key;
        for (int tempChild = paraStart * 2 + 1; tempChild < paraLength; tempChild = tempChild * 2 + 1) {
            // Pick the larger of the two children.
            if (tempChild + 1 < paraLength && data[tempChild].key < data[tempChild + 1].key) {
                tempChild++;
            } // Of if
            System.out.println("The parent position is " + tempParent + " and the child is " + tempChild);
            if (tempKey < data[tempChild].key) {
                // The child is bigger.
                data[tempParent] = data[tempChild];
                System.out.println("Move " + data[tempChild].key + " to position " + tempParent);
                tempParent = tempChild;
            } else {
                break;
            } // Of if
        } // Of for tempChild
        data[tempParent] = tempNode;
        System.out.println("Adjust " + paraStart + " to " + paraLength + ": " + this);
    }// Of adjustHeap
    /**
     *********************
     * Heap sort. Maybe the most difficult sorting algorithm.
     *********************
     */
    public void heapSort() {
        DataNode tempNode;
        // Step 1. Construct the initial heap.
        for (int i = length / 2 - 1; i >= 0; i--) {
            adjustHeap(i, length);
        } // Of for i
        System.out.println("The initial heap: " + this + "\r\n");
        // Step 2. Swap and reconstruct.
        for (int i = length - 1; i > 0; i--) {
            tempNode = data[0];
            data[0] = data[i];
            data[i] = tempNode;
            adjustHeap(0, i);
            System.out.println("Round " + (length - i) + ": " + this);
        } // Of for i
    }// Of heapSort
    /**
     *********************
     * Test the method.
     *********************
     */
    public static void heapSortTest() {
        int[] tempUnsortedKeys = { 5, 3, 6, 10, 7, 1, 9 };
        String[] tempContents = { "if", "then", "else", "switch", "case", "for", "while" };
        DataArray tempDataArray = new DataArray(tempUnsortedKeys, tempContents);
        System.out.println(tempDataArray);
        tempDataArray.heapSort();
        System.out.println("Result\r\n" + tempDataArray);
    }// Of heapSortTest
Output:
----------heapSortTest----------
I am a data array with 7 items.
(5,if) (3,then) (6,else) (10,switch) (7,case) (1,for) (9,while)
The parent position is 2 and the child is 6
Move 9 to position 2
Adjust 2 to 7: I am a data array with 7 items.
(5,if) (3,then) (9,while) (10,switch) (7,case) (1,for) (6,else)
The parent position is 1 and the child is 3
Move 10 to position 1
Adjust 1 to 7: I am a data array with 7 items.
(5,if) (10,switch) (9,while) (3,then) (7,case) (1,for) (6,else)
The parent position is 0 and the child is 1
Move 10 to position 0
The parent position is 1 and the child is 4
Move 7 to position 1
Adjust 0 to 7: I am a data array with 7 items.
(10,switch) (7,case) (9,while) (3,then) (5,if) (1,for) (6,else)
The initial heap: I am a data array with 7 items.
(10,switch) (7,case) (9,while) (3,then) (5,if) (1,for) (6,else)
The parent position is 0 and the child is 2
Move 9 to position 0
The parent position is 2 and the child is 5
Adjust 0 to 6: I am a data array with 7 items.
(9,while) (7,case) (6,else) (3,then) (5,if) (1,for) (10,switch)
Round 1: I am a data array with 7 items.
(9,while) (7,case) (6,else) (3,then) (5,if) (1,for) (10,switch)
The parent position is 0 and the child is 1
Move 7 to position 0
The parent position is 1 and the child is 4
Move 5 to position 1
Adjust 0 to 5: I am a data array with 7 items.
(7,case) (5,if) (6,else) (3,then) (1,for) (9,while) (10,switch)
Round 2: I am a data array with 7 items.
(7,case) (5,if) (6,else) (3,then) (1,for) (9,while) (10,switch)
The parent position is 0 and the child is 2
Move 6 to position 0
Adjust 0 to 4: I am a data array with 7 items.
(6,else) (5,if) (1,for) (3,then) (7,case) (9,while) (10,switch)
Round 3: I am a data array with 7 items.
(6,else) (5,if) (1,for) (3,then) (7,case) (9,while) (10,switch)
The parent position is 0 and the child is 1
Move 5 to position 0
Adjust 0 to 3: I am a data array with 7 items.
(5,if) (3,then) (1,for) (6,else) (7,case) (9,while) (10,switch)
Round 4: I am a data array with 7 items.
(5,if) (3,then) (1,for) (6,else) (7,case) (9,while) (10,switch)
The parent position is 0 and the child is 1
Move 3 to position 0
Adjust 0 to 2: I am a data array with 7 items.
(3,then) (1,for) (5,if) (6,else) (7,case) (9,while) (10,switch)
Round 5: I am a data array with 7 items.
(3,then) (1,for) (5,if) (6,else) (7,case) (9,while) (10,switch)
Adjust 0 to 1: I am a data array with 7 items.
(1,for) (3,then) (5,if) (6,else) (7,case) (9,while) (10,switch)
Round 6: I am a data array with 7 items.
(1,for) (3,then) (5,if) (6,else) (7,case) (9,while) (10,switch)
Result
I am a data array with 7 items.
(1,for) (3,then) (5,if) (6,else) (7,case) (9,while) (10,switch)
Time complexity analysis:
Bottom-up heap construction:
$$
\begin{aligned}
&\text{Let the height of the tree be } H = \lfloor \log_2 n \rfloor + 1. \text{ Then:}\\
&\text{number of nodes on level } h \text{ (at most): } 2^{h-1};\\
&\text{maximum number of comparisons for each node on level } h\text{: } (H-h) \cdot 2;\\
&\text{maximum number of comparisons for all nodes on level } h\text{: } 2^{h-1} \cdot (H-h) \cdot 2 = 2^h (H-h);\\
&\text{total number of comparisons: } \sum_{h=1}^{H} 2^h (H-h).
\end{aligned}
$$
$$
\begin{aligned}
&\sum_{h=1}^{H} 2^h (H-h) = H\sum_{h=1}^{H} 2^h - \sum_{h=1}^{H} h\,2^h\\
&\text{Let } t = \sum_{h=1}^{H} h\,2^h \quad \text{①}, \qquad 2t = \sum_{h=1}^{H} h\,2^{h+1} \quad \text{②}\\
&\text{①} - \text{②} \text{ gives:}\\
&-t = 2^1 + 2^2 + 2^3 + \cdots + 2^H - H\,2^{H+1} = \sum_{h=1}^{H} 2^h - H\,2^{H+1}\\
&\therefore \sum_{h=1}^{H} 2^h (H-h) = H\sum_{h=1}^{H} 2^h - \sum_{h=1}^{H} h\,2^h\\
&= (H+1)\sum_{h=1}^{H} 2^h - H\,2^{H+1}\\
&= (H+1)\frac{2(1-2^H)}{1-2} - H\,2^{H+1}\\
&= 2(H+1)(2^H-1) - H\,2^{H+1}\\
&= 2(2^H-1) - 2H
\end{aligned}
$$
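The summation can also be checked numerically. The sketch below compares the direct sum $\sum_{h=1}^{H} 2^h (H-h)$ with the closed form $2^{H+1} - 2H - 2$, which is the same quantity as $2(2^H-1) - 2H$. The class and method names (`BuildHeapBound`, `directSum`, `closedForm`) are hypothetical and exist only for this check.

```java
final class BuildHeapBound {
    /** Evaluates the sum of 2^h * (H - h) for h = 1..H term by term. */
    static long directSum(int height) {
        long sum = 0;
        for (int h = 1; h <= height; h++) {
            sum += (1L << h) * (height - h);
        }
        return sum;
    }

    /** Evaluates the closed form 2^(H+1) - 2H - 2. */
    static long closedForm(int height) {
        return (1L << (height + 1)) - 2L * height - 2;
    }

    public static void main(String[] args) {
        for (int height = 1; height <= 10; height++) {
            System.out.println("H=" + height + " sum=" + directSum(height)
                    + " closed=" + closedForm(height));
        }
    }
}
```

For example, H = 3 gives 2·2 + 4·1 + 8·0 = 8, and the closed form gives 16 - 6 - 2 = 8.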
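Substituting $H=\lfloor\log_2 n\rfloor+1$ into the result above and noting that $2^H \le 2n$, the total number of comparisons is less than $2^{H+1} \le 4n$, so bottom-up heap construction takes $O(n)$ time.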
In the selection phase, $n-1$ rounds are needed, and each round re-adjusts the heap at a cost of $O(\log n)$, so this phase takes $O(n\log n)$ time.
In summary, the time complexity of heap sort is $O(n\log n)$.
However, if we only need the $k$ largest values out of $n$ elements, selection sort takes $O(kn)$ time, while the heap-based approach takes $O(n+k\log n)=\max(O(n),O(k\log n))$. When used in the KNN algorithm, for very large $n$, the heap-based approach performs much better than selection sort.
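The $O(n+k\log n)$ selection can be sketched as follows: build a max-heap once in $O(n)$, then remove the root $k$ times at $O(\log n)$ each. This is a minimal illustration on a plain `int[]`; `TopKByHeap`, `siftDown`, and `topK` are hypothetical names, and the sift-down loop mirrors `adjustHeap` from this post without the debug output.

```java
import java.util.Arrays;

final class TopKByHeap {
    /** Sifts a[start] down within a[0..length) so the subtree rooted at start is a max-heap. */
    static void siftDown(int[] a, int start, int length) {
        int value = a[start];
        int parent = start;
        for (int child = 2 * parent + 1; child < length; child = 2 * child + 1) {
            // Pick the larger of the two children.
            if (child + 1 < length && a[child] < a[child + 1]) {
                child++;
            }
            if (value >= a[child]) {
                break;
            }
            a[parent] = a[child];
            parent = child;
        }
        a[parent] = value;
    }

    /** Returns the k largest keys in descending order; the input array is not modified. */
    static int[] topK(int[] keys, int k) {
        if (k < 0 || k > keys.length) {
            throw new IllegalArgumentException("k must be between 0 and " + keys.length);
        }
        int[] heap = Arrays.copyOf(keys, keys.length);
        int size = heap.length;
        // Step 1. Bottom-up heap construction, O(n).
        for (int i = size / 2 - 1; i >= 0; i--) {
            siftDown(heap, i, size);
        }
        // Step 2. Remove the root k times, O(log n) each.
        int[] result = new int[k];
        for (int r = 0; r < k; r++) {
            result[r] = heap[0];
            size--;
            heap[0] = heap[size];
            siftDown(heap, 0, size);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = { 5, 3, 6, 10, 7, 1, 9 };
        System.out.println(Arrays.toString(topK(keys, 3))); // [10, 9, 7]
    }
}
```

Stopping after $k$ removals is what separates this from a full heap sort: the remaining $n-k$ elements are never ordered.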