COMP0005-Notes(5): Heap Sort


Heap

A heap can take many forms. In its simplest form it can be treated as a linear queue in which each element is associated with a priority value. Enqueuing inserts an element at the position determined by its priority value, whereas dequeuing always pops the element with the lowest (or highest) priority value.
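For comparison with the binary heap, here is a minimal sketch of the linear version, assuming a max-oriented queue kept as a sorted Python list. The class name `LinearPriorityQueue` and its method names are made up for illustration and do not come from the course material.

```python
import bisect


class LinearPriorityQueue:
    """Max-oriented priority queue stored as a sorted list (illustrative only)."""

    def __init__(self):
        self._priorities = []  # always kept in ascending order

    def enqueue(self, priority):
        # Binary search finds the slot quickly, but inserting into the list
        # shifts up to N elements, so enqueue is O(N) overall.
        bisect.insort(self._priorities, priority)

    def dequeue(self):
        # The highest priority sits at the end of the list, so this is O(1).
        return self._priorities.pop()

    def __len__(self):
        return len(self._priorities)


queue = LinearPriorityQueue()
for p in [3, 1, 4, 1, 5]:
    queue.enqueue(p)
drained = [queue.dequeue() for _ in range(len(queue))]
```

The linear cost of `enqueue` in this sketch is what the tree-shaped organisation avoids.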

However, the potential of a heap is only fully realised when it is organised non-linearly, as a binary tree. This structure is known as the binary heap.

Binary Heap

Normal Binary Tree

If we want to store the priority value in a binary tree, it may look like this:
[Figure: binary tree example]
Such a binary tree shows both the relative order of the priorities and the order in which the elements (in the same branch) were inserted. For instance, in this binary tree it is clear that B was added after A, as it is a direct child of A.

Binary Heap

In a binary heap, however, the key (priority) values are organised differently. For simplicity we assume a max-oriented binary heap, where the element with the maximum priority value is the head (root). For now we do not care how such a structure is maintained; we only observe the result:

  1. All layers of the binary heap, except possibly the bottom-most one, are completely filled;
  2. Each parent node is greater than or equal to its child node(s);
  3. New elements are appended from left to right in the bottom-most layer.

Therefore, the values stored in the binary tree above, if now stored in a binary heap, should look like this:
[Figure: binary heap tree example]
(Note that as we only care about the priority values, the question of which element is inserted first actually does not matter to us. So maintaining a structure omitting that feature actually makes more sense in this case.)

Because all the parent layers are filled, and the bottom-most layer is filled from left to right only, we can actually flatten the binary heap into a linear array, as below:
[Figure: binary heap stored as a linear array]
We only need to keep the structure in mind as we access or insert elements, without an explicit “linking” or pointer, because simple integer arithmetic enables the access of the nodes in an array:

  • With the root stored at position 1, the parent of node i is at position i / 2 (integer division, rounded down);
  • The children of node i are at positions 2i and 2i+1.
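The index arithmetic can be sketched as below, assuming the array is 1-based with slot 0 left unused as padding. The helper names `parent`, `children` and `is_max_heap` are our own and only serve the illustration.

```python
def parent(i):
    """Position of the parent of node i (1-based positions)."""
    return i // 2


def children(i, n):
    """Positions of the children of node i that exist in a heap of n elements."""
    return [c for c in (2 * i, 2 * i + 1) if c <= n]


def is_max_heap(a):
    """Check the parent >= child rule; a[0] is padding, a[1..n] hold the keys."""
    n = len(a) - 1
    return all(a[parent(i)] >= a[i] for i in range(2, n + 1))


example = [None, 9, 7, 8, 3, 5]  # root 9 at position 1
```

In `example`, node 2 (value 7) has children at positions 4 and 5 (values 3 and 5), and no pointers are needed to find them.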

Enqueue()

The enqueuing of an element requires inserting it into the correct position while maintaining the heap structure. However, the process is not complicated:

  1. Append the new element at position N+1 (the end of the array);
  2. Swim the element up the heap by repeatedly comparing node i with its parent i/2: if the parent is greater or equal, the heap structure is maintained and we stop; otherwise swap the node with its parent and repeat the compare & swap from the parent's position, stopping at the root.

This operation requires at most log2(N) + 1 comparisons, one per layer on the path to the root (the worst case occurs when the bottom-most layer is already full, so the new element starts a new layer and then swims all the way up).
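A minimal sketch of the two enqueue steps, assuming a max-oriented heap stored in a Python list whose slot 0 is unused so that positions are 1-based. The function names `swim` and `enqueue` are chosen here for illustration.

```python
def swim(heap, i):
    """Move heap[i] up while its parent is smaller (heap[0] is unused)."""
    while i > 1 and heap[i // 2] < heap[i]:
        heap[i // 2], heap[i] = heap[i], heap[i // 2]
        i //= 2


def enqueue(heap, key):
    """Append key at position N+1, then restore the heap order."""
    heap.append(key)
    swim(heap, len(heap) - 1)


heap = [None]  # slot 0 is padding
for key in [3, 9, 5, 7, 8]:
    enqueue(heap, key)
# heap is now [None, 9, 8, 5, 3, 7]
```

Each loop iteration in `swim` performs one comparison and moves one layer up, which is where the logarithmic bound comes from.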

Dequeue()

The dequeuing of the heap is straightforward as well - the node to be dequeued is naturally the root node.

What comes after the pop is slightly more complicated: we need to restore the structure of the heap, not only by finding a proper candidate for each vacated parent role, but also by making sure that all non-bottom layers stay filled. Because of the second requirement, the usual way of removing a parent node from an ordinary tree (repeatedly promoting the larger child until there are no more children) is not applicable here, since it can leave a gap in the middle of the bottom layer.

Fortunately, the solution is not complicated either:

  1. Swap the root node with the last element in the array (positions 1 and N), then remove the old root from the end, which shrinks the heap to N-1 elements;
  2. Sink the new root down the heap by repeatedly comparing node i with the larger of its children 2i and 2i+1: if that child is not greater, or there is no child, the heap structure is maintained and we stop; otherwise swap the node with that child and repeat the process.

There are about log2(N) layers in the heap, and each iteration costs two comparisons: one between the two children, followed by one between the larger child and the parent. This gives a total of about 2log2(N) comparisons in the worst case (when the node sinks all the way down to the bottom layer again).
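The dequeue steps can be sketched as follows, again assuming a max-oriented heap in a Python list with slot 0 unused. The names `sink` and `dequeue` and the choice to raise `IndexError` on an empty heap are assumptions made for this example.

```python
def sink(heap, i, n):
    """Move heap[i] down within heap[1..n] until no child is larger."""
    while 2 * i <= n:
        child = 2 * i
        # First comparison: pick the larger of the two children, if both exist.
        if child < n and heap[child] < heap[child + 1]:
            child += 1
        # Second comparison: stop once the parent is not smaller than that child.
        if heap[i] >= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child


def dequeue(heap):
    """Remove and return the maximum; heap[0] is padding."""
    n = len(heap) - 1
    if n < 1:
        raise IndexError("dequeue from an empty heap")
    heap[1], heap[n] = heap[n], heap[1]  # root <-> last element
    top = heap.pop()                     # heap shrinks to n - 1 elements
    sink(heap, 1, n - 1)
    return top


heap = [None, 9, 8, 5, 3, 7]
largest = dequeue(heap)
# largest is 9 and heap is now [None, 8, 7, 5, 3]
```

Because the last element fills the root slot, the bottom layer only ever loses its right-most node, so the upper layers stay filled.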

Heap Sort

Main Idea

The idea behind heap sort is very simple: first transform the unsorted array into a heap, then repeatedly dequeue the root element to build a sorted array. It can be written as the following sequence of steps:

  1. (Transform into heap) For each node that has a child, starting from the last such node at position N/2 and working back towards the root, sink it down the array.
  2. (Decompose the heap) Dequeue the root element. Instead of allocating a new array, we place it in the slot freed at the end of the “shrunk” heap, so the sorted array is built in the same storage.
  3. Repeat step 2 until the heap is empty (or has one element left).
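The three steps above can be sketched in a few lines. This is one possible in-place version under our own assumptions, not necessarily the one used in the course: positions are 1-based as in the notes, and position k is mapped onto the ordinary Python list slot `a[k - 1]`. The name `heap_sort` is hypothetical.

```python
def heap_sort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)

    def sink(k, n):
        # Positions k, j are 1-based; position p lives at a[p - 1].
        while 2 * k <= n:
            j = 2 * k
            if j < n and a[j - 1] < a[j]:  # right child (position j+1) is larger
                j += 1
            if a[k - 1] >= a[j - 1]:
                break
            a[k - 1], a[j - 1] = a[j - 1], a[k - 1]
            k = j

    # Step 1: build a max-heap, sinking every node that has a child.
    for k in range(n // 2, 0, -1):
        sink(k, n)

    # Steps 2 and 3: move the root into the slot freed by the shrinking heap.
    while n > 1:
        a[0], a[n - 1] = a[n - 1], a[0]
        n -= 1
        sink(1, n)
    return a


data = [5, 2, 9, 1, 5, 6, 0, 3]
heap_sort(data)
# data is now [0, 1, 2, 3, 5, 5, 6, 9]
```

Using a max-oriented heap is what makes the result ascending: the largest remaining key is always moved to the end of the unsorted region.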

Analysis

Complexity

For heap transformation, each sink operation takes O(h') time, where h' is the height of the sub-heap rooted at the node being sunk. Most nodes sit near the bottom and have small sub-heaps, so the costs add up to an overall upper bound of O(N) running time.
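A short sketch of why the sum is linear, using the standard counting argument: a heap of N nodes has at most ⌈N / 2^(h+1)⌉ nodes of height h, and sinking such a node costs time proportional to h. The symbol T_build for the total cost is introduced here only for the derivation.

```latex
T_{\text{build}}(N)
  = \sum_{h=0}^{\lfloor \log_2 N \rfloor}
      \left\lceil \frac{N}{2^{h+1}} \right\rceil \cdot O(h)
  = O\!\left( N \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
  = O(N),
\qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
```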

For heap decomposition, the number of dequeue() calls is proportional to the size of the heap, N, and each call costs O(log(N)), so the upper bound is O(Nlog(N)).

Overall, the heap sort algorithm is bounded by O(Nlog(N)). It keeps this linearithmic performance even in the worst-case scenario, which is comparable to merge sort.

Benefits

  • In-place;
  • O(Nlog(N)) even in the worst case, which is asymptotically optimal for a comparison-based sort.

Drawbacks

  • Not stable, because elements are swapped across long distances in the array, so equal keys can end up in a different relative order;
  • Often one of the slower O(Nlog(N)) sorting algorithms in practice, because array accesses jump in doubling or halving steps (all the 2i and i/2), which makes poor use of the CPU cache. When the array holds millions or billions of elements, the resulting cache misses and memory page swaps can become nearly as frequent as the comparisons and add up to a significant cost.
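The lack of stability can be seen on a tiny input. The sketch below assumes records of the form (key, label) that are compared by key only. The function `heap_sort_by_key` is a hypothetical variant written for this demonstration, and Python's built-in `sorted` is used as the stable reference.

```python
def heap_sort_by_key(a, key):
    """In-place heap sort that compares records only by key(record)."""
    n = len(a)

    def sink(k, n):
        # 1-based positions; position p lives at a[p - 1].
        while 2 * k <= n:
            j = 2 * k
            if j < n and key(a[j - 1]) < key(a[j]):
                j += 1
            if key(a[k - 1]) >= key(a[j - 1]):
                break
            a[k - 1], a[j - 1] = a[j - 1], a[k - 1]
            k = j

    for k in range(n // 2, 0, -1):
        sink(k, n)
    while n > 1:
        a[0], a[n - 1] = a[n - 1], a[0]
        n -= 1
        sink(1, n)
    return a


records = [(1, "a"), (1, "b")]  # two records with equal keys
stable = sorted(records, key=lambda r: r[0])
unstable = heap_sort_by_key(list(records), key=lambda r: r[0])
# stable   is [(1, "a"), (1, "b")]: the original order of equal keys is kept
# unstable is [(1, "b"), (1, "a")]: the root-to-end swap reversed them
```

Even with only two elements, the swap of the root with the last position moves "a" behind "b", although their keys are equal.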