堆排序 golang

Heap-sort is a beautiful sorting algorithm. It uses a max-heap to sort a sequence of numbers or other elements with a defined order relation. In this article we’ll deep-dive into the Go standard library heap-sort implementation.

堆排序是一种漂亮的排序算法。 它使用最大堆对具有定义的顺序关系的数字序列或其他元素进行排序。 在本文中,我们将深入研究Go标准库堆排序实现。

最大堆 Max-heap

First a short recap on binary max-heaps. A max-heap is a container that provides its maximum element in O(1) time, adds an element in O(log n), and removes the maximum element in O(log n).

首先简要回顾一下二进制max-heaps 。 max-heap是一个容器,可在O(1)时间内提供其最大元素,在O(log n)中添加一个元素,并在O(log n)中移除最大元素

Max-heaps are almost-full binary trees, where every node is greater or equal to its children. I’ll refer to the latter as the heap property throughout the article.

最大堆是几乎满的二叉树,其中每个节点都大于或等于其子节点 。 在整篇文章中,我将后者称为“堆”属性

Together, these two properties define a max-heap:

这两个属性共同定义了一个最大堆:

i2*i+12*i+2
i2*i+12*i+2

建立一个堆 (Building a heap)

An array can be converted into a max-heap in O(n) time. Amazing, isn’t it? This is the algorithm:

一个数组可以在O(n)时间内转换为最大堆。 太神奇了,不是吗? 这是算法:

  1. Treat the input array as a heap. It doesn’t satisfy the heap property yet. 将输入数组视为堆。 它还不满足堆属性。
  2. Iterate the nodes of the heap starting from the second-to-last level of the heap — that’s one level above the leaves — going back to the root. 从堆的倒数第二个级别(即叶子上方的一个级别)开始,迭代堆的节点,再返回到根。
  3. For each node encountered, propagate it down in the heap, until it is greater than both its children. When propagating down, always swap with the greater child. 对于遇到的每个节点,将其向下传播到堆中,直到大于其两个子节点为止。 向下传播时,请始终与较大的孩子交换。

That’s it! You’re done!

而已! 你完成了!

Why does it work? I’ll try to convince you with this hand-wavy proof (Feel free to skip though):

为什么行得通? 我将尝试用以下手工证明让您信服(不过请随时跳过):

xxxxxxxxx

This is true for all nodes in the heap, including the root.

对于堆中的所有节点(包括根)都是如此。

堆排序算法 Heap-sort algorithm

Now for the main course — heap-sort.

现在是主要课程-堆排序。

Heap-sort works in 2 steps:

堆排序分为两个步骤:

  1. Builds a max-heap from the input array using the algorithm I showed above. This takes O(n) time 使用上面显示的算法,从输入数组构建最大堆。 这需要O(n)时间
  2. Pops elements from the heap into the output array, filling it from the back to the front. Every removal of the maximum element from the heap takes O(log n) time, which adds up to O(n * log n) for the entire container. 将堆中的元素弹出到输出数组中,从后到前填充。 每次从堆中删除最大元素的时间为O(log n),整个容器的总和为O(n * log n)。

A cool property of the Go implementation, is that it uses the input array to store the output, thus saving the need to allocate O(n) memory for the output.

Go实现的一个很酷的特性是,它使用输入数组存储输出,从而节省了为输出分配O(n)内存的需要。

堆排序实现 Heap-sort implementation

The Go sort library supports any collection that is indexed by integers, has a defined order relation on its elements, and supports swapping elements between two indices:

Go排序库支持任何由整数索引的集合,在其元素上具有定义的顺序关系 ,并支持在两个索引之间交换元素:

Naturally, any contiguous container of numbers can satisfy this interface.

自然,任何连续的数字容器都可以满足此接口。

heapSort()
heapSort()

The function’s signature is a bit cryptic, but looking at the first 3 lines clear things up:

该函数的签名有点晦涩难懂,但是看一下前三行,就会发现:

abdataheapSort(data, a, b)data[a, b)abdataheapSort(data, a, b)[a, b)datafirstafirstalohighalohilohighalohi

Next, the code builds the max-heap:

接下来,代码构建最大堆:

// Build heap with greatest element at top.
for i := (hi - 1) / 2; i >= 0; i-- {
  siftDown(data, i, hi, first)
}
siftDown()siftDown()
siftDown()siftDown()
data
data

Next, we pop all the elements to create the sorted array:

接下来,我们弹出所有元素以创建排序后的数组:

// Pop elements, largest first, into end of data.
for i := hi - 1; i >= 0; i-- {
  data.Swap(first, first+i)
  siftDown(data, lo, i, first)
}
i
i
firstfirstfirstfirstii

In other words, we are filling the array from the back to the front, starting from the largest element, to the 2nd element in size all the way to the smallest element. The result is the sorted input.

换句话说,我们从后到前填充数组,从最大的元素到第二个元素,一直到最小的元素。 结果是排序的输入。

维护堆属性 (Maintaining the heap property)

siftDown()
siftDown()
root
root

The first few lines calculate the index of the first child and check that it exists:

前几行计算第一个孩子的索引并检查它是否存在:

child := 2*root + 1
if child >= hi {
  break
}
child >= hiroot
child >= hiroot

Next, we pick the greater of the two children:

接下来,我们选择两个孩子中较大的一个:

if child+1 < hi && data.Less(first+child, first+child+1) {
  child++
}
child++
child++

Next, we check if the parent is indeed smaller than the child:

接下来,我们检查父级是否确实小于子级:

if !data.Less(first+root, first+child) {
  return
}

If the parent is greater than it biggest child, we are done so we return.

如果父母大于最大子女,我们就完成了,所以我们返回。

root
root
data.Swap(first+root, first+child)
root = child
结论 Conclusion

This is the third article where I read an unfamiliar piece of code and try explain it. I like this kind of experience because it teaches me how to read code and how to communicate about it. Please leave your comments and feedback below!

这是第三篇文章,我阅读了一段陌生的代码并尝试对其进行解释。 我喜欢这种经历,因为它教会了我如何阅读代码以及如何进行代码交流。 请在下面留下您的评论和反馈!

堆排序 golang