Overview
The heapsort algorithm can be divided into two parts.

In the first step, a heap is built out of the data. The heap is placed in an array with the layout of a complete binary tree: for a zero-based array, the root is stored at index 0, and if i is the index of the current node, then

    iParent(i)     = floor((i-1) / 2)
    iLeftChild(i)  = 2*i + 1
    iRightChild(i) = 2*i + 2

where the floor function maps a real number to the largest integer not exceeding it.

In the second step, a sorted array is created by repeatedly removing the largest element from the heap (the root of the heap) and inserting it into the array. The heap is updated after each removal to maintain the heap property. Once all objects have been removed from the heap, the result is a sorted array.

Heapsort can be performed in place: the array can be split into two parts, the sorted array and the heap. The heap's invariant is preserved after each extraction, so the only cost is that of extraction.
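For concreteness, the index formulas above can be written directly in a real language. The following is a minimal C sketch (the names mirror the pseudocode in this article; it assumes a zero-based array of indices that fit in size_t):

    #include <stddef.h>

    /* Index arithmetic for a binary heap stored in a zero-based array. */
    static inline size_t iParent(size_t i)     { return (i - 1) / 2; }  /* valid for i > 0 */
    static inline size_t iLeftChild(size_t i)  { return 2 * i + 1; }
    static inline size_t iRightChild(size_t i) { return 2 * i + 2; }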
Algorithm
The heapsort algorithm involves preparing the list by first turning it into a max heap, then repeatedly swapping the root of the heap with the last element of the heap, shrinking the heap by one, and sifting the new root down to restore the heap property, until the heap is empty.
Pseudocode
The following is a simple way to implement the algorithm in pseudocode. Arrays are zero-based, and swap is used to exchange two elements of the array. Movement 'down' means from the root towards the leaves, or from lower indices to higher. Note that during the sort, the largest element is at the root of the heap at a[0], while at the end of the sort, the largest element is in a[end].
procedure heapsort(a, count) is
    input: an unordered array ''a'' of length ''count''

    ''(Build the heap in array a so that largest value is at the root)''
    heapify(a, count)

    ''(The following loop maintains the invariants that a[0:end] is a heap and every element''
    ''beyond end is greater than everything before it (so a[end:count] is in sorted order))''
    end ← count - 1
    while end > 0 do
        ''(a[0] is the root and largest value. The swap moves it in front of the sorted elements.)''
        swap(a[end], a[0])
        ''(the heap size is reduced by one)''
        end ← end - 1
        ''(the swap ruined the heap property, so restore it)''
        siftDown(a, 0, end)
The sorting routine uses two subroutines, heapify and siftDown. The former is the common in-place heap construction routine, while the latter is a common subroutine for implementing heapify.
''(Put elements of 'a' in heap order, in-place)''
procedure heapify(a, count) is
    ''(start is assigned the index in 'a' of the last parent node)''
    ''(the last element in a 0-based array is at index count-1; find the parent of that element)''
    start ← iParent(count-1)

    while start ≥ 0 do
        ''(sift down the node at index 'start' to the proper place such that all nodes below''
        '' the start index are in heap order)''
        siftDown(a, start, count - 1)
        ''(go to the next parent node)''
        start ← start - 1
    ''(after sifting down the root all nodes/elements are in heap order)''
''(Repair the heap whose root element is at index 'start', assuming the heaps rooted at its children are valid)''
procedure siftDown(a, start, end) is
    root ← start

    while iLeftChild(root) ≤ end do    ''(While the root has at least one child)''
        child ← iLeftChild(root)       ''(Left child of root)''
        swap ← root                    ''(Keeps track of child to swap with)''

        if a[swap] < a[child] then
            swap ← child
        ''(If there is a right child and that child is greater)''
        if child+1 ≤ end and a[swap] < a[child+1] then
            swap ← child + 1
        if swap = root then
            ''(The root holds the largest element. Since we assume the heaps rooted at the''
            '' children are valid, this means that we are done.)''
            return
        else
            swap(a[root], a[swap])
            root ← swap    ''(repeat to continue sifting down the child now)''
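Translated almost line for line into C, the three routines above might look as follows. This is a sketch, not a canonical implementation: it assumes an array of int keys, and the local variable sw stands in for the pseudocode's swap variable to avoid colliding with the swap function.

    #include <stddef.h>

    static size_t iParent(size_t i)    { return (i - 1) / 2; }
    static size_t iLeftChild(size_t i) { return 2 * i + 1; }

    static void swap(int a[], size_t i, size_t j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    /* Repair the heap whose root element is at index 'start',
       assuming the heaps rooted at its children are valid. */
    static void siftDown(int a[], size_t start, size_t end) {
        size_t root = start;
        while (iLeftChild(root) <= end) {          /* while the root has at least one child */
            size_t child = iLeftChild(root);
            size_t sw = root;                      /* keeps track of the child to swap with */
            if (a[sw] < a[child])
                sw = child;
            if (child + 1 <= end && a[sw] < a[child + 1])
                sw = child + 1;                    /* right child exists and is greater */
            if (sw == root)
                return;                            /* root already holds the largest element */
            swap(a, root, sw);
            root = sw;                             /* continue sifting down the child */
        }
    }

    /* Put the elements of 'a' in heap order, in place (the siftDown version).
       Expects count >= 2; the caller below guarantees this. */
    static void heapify(int a[], size_t count) {
        /* loop from the last parent node down to the root; the +1 / -1
           dance avoids unsigned underflow when start would go below zero */
        for (size_t start = iParent(count - 1) + 1; start > 0; start--)
            siftDown(a, start - 1, count - 1);
    }

    void heapsort(int a[], size_t count) {
        if (count < 2) return;                     /* nothing to sort */
        heapify(a, count);
        for (size_t end = count - 1; end > 0; ) {
            swap(a, end, 0);                       /* move the max behind the heap */
            end--;                                 /* shrink the heap by one */
            siftDown(a, 0, end);                   /* restore the heap property */
        }
    }

Calling heapsort(data, n) sorts data[0 … n−1] ascending in place.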
The heapify procedure can be thought of as building a heap from the bottom up by successively sifting downward to establish the heap property. An alternative version (shown below) that builds the heap top-down and sifts upward may be simpler to understand. This siftUp version can be visualized as starting with an empty heap and successively inserting elements, whereas the siftDown version given above treats the entire input array as a full but "broken" heap and "repairs" it starting from the last non-trivial sub-heap (that is, the last parent node).

Also, the siftDown version of heapify has O(n) time complexity, while the siftUp version given below has O(n log n) time complexity due to its equivalence with inserting each element, one at a time, into an empty heap.
This may seem counter-intuitive since, at a glance, it is apparent that the former only makes half as many calls to its logarithmic-time sifting function as the latter; i.e., they seem to differ only by a constant factor, which never affects asymptotic analysis.
To grasp the intuition behind this difference in complexity, note that the number of swaps that may occur during any one siftUp call ''increases'' with the depth of the node on which the call is made. The crux is that there are many (exponentially many) more "deep" nodes than there are "shallow" nodes in a heap, so that siftUp may have its full logarithmic running-time on the approximately linear number of calls made on the nodes at or near the "bottom" of the heap. On the other hand, the number of swaps that may occur during any one siftDown call ''decreases'' as the depth of the node on which the call is made increases. Thus, when the siftDown heapify begins and is calling siftDown on the bottom and most numerous node-layers, each sifting call will incur, at most, a number of swaps equal to the "height" (from the bottom of the heap) of the node on which the sifting call is made. In other words, about half the calls to siftDown will have at most only one swap, then about a quarter of the calls will have at most two swaps, etc.

The heapsort algorithm itself has O(n log n) time complexity using either version of heapify.
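The standard counting argument makes the O(n) bound precise: a heap of n elements has at most ⌈n/2^(h+1)⌉ nodes of height h, and one siftDown call on a node of height h performs O(h) work, so the siftDown version of heapify costs on the order of

    Σ (h = 0 … ⌊log₂ n⌋) of ⌈n/2^(h+1)⌉ · O(h)  =  O(n · Σ h/2^h)  =  O(n),

since the series Σ h/2^h converges (its value is 2). The siftUp version enjoys no such telescoping: the roughly n/2 elements in the bottom layer may each be sifted up through Θ(log n) levels, giving Θ(n log n).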
procedure heapify(a, count) is
    ''(end is assigned the index of the first (left) child of the root)''
    end := 1

    while end < count
        ''(sift up the node at index end to the proper place such that all nodes above''
        '' the end index are in heap order)''
        siftUp(a, 0, end)
        end := end + 1
    ''(after sifting up the last node all nodes are in heap order)''
procedure siftUp(a, start, end) is
    input:  ''start represents the limit of how far up the heap to sift.''
            ''end is the node to sift up.''

    child := end
    while child > start
        parent := iParent(child)
        if a[parent] < a[child] then    ''(out of max-heap order)''
            swap(a[parent], a[child])
            child := parent    ''(repeat to continue sifting up the parent now)''
        else
            return
Note that, unlike the siftDown approach, where after each swap only the siftDown subroutine needs to be called to fix the broken heap, the siftUp subroutine alone cannot fix the broken heap. The heap needs to be rebuilt after every swap by calling the heapify procedure, since siftUp assumes that the element getting swapped ends up in its final place, whereas siftDown allows for continued adjustment of items lower in the heap until the invariant is satisfied. The adjusted pseudocode for using the siftUp approach is given below.
procedure heapsort(a, count) is
    input: an unordered array ''a'' of length ''count''

    ''(Build the heap in array a so that largest value is at the root)''
    heapify(a, count)

    ''(The following loop maintains the invariants that a[0:end] is a heap and every element''
    ''beyond end is greater than everything before it (so a[end:count] is in sorted order))''
    end ← count - 1
    while end > 0 do
        ''(a[0] is the root and largest value. The swap moves it in front of the sorted elements.)''
        swap(a[end], a[0])
        ''(rebuild the heap using siftUp after the swap ruins the heap property)''
        heapify(a, end)
        ''(reduce the heap size by one)''
        end ← end - 1
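In C, the siftUp variant might be sketched as follows. The name heapifyTopDown is chosen here only to distinguish it from the O(n) heapify given earlier; the other assumptions match the previous sketch (int keys, zero-based indexing):

    #include <stddef.h>

    static size_t iParent(size_t i) { return (i - 1) / 2; }

    static void swap(int a[], size_t i, size_t j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    /* Sift the node at index 'end' up toward 'start' until its parent is not smaller. */
    static void siftUp(int a[], size_t start, size_t end) {
        size_t child = end;
        while (child > start) {
            size_t parent = iParent(child);
            if (a[parent] < a[child]) {   /* out of max-heap order */
                swap(a, parent, child);
                child = parent;           /* continue sifting up the parent */
            } else {
                return;
            }
        }
    }

    /* Build a max-heap top-down by successive insertion: O(n log n). */
    static void heapifyTopDown(int a[], size_t count) {
        for (size_t end = 1; end < count; end++)
            siftUp(a, 0, end);
    }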
Variations
Floyd's heap construction
The most important variation to the basic algorithm, which is included in all practical implementations, is a heap-construction algorithm by Floyd which runs in O(n) time and uses siftDown rather than siftUp, avoiding the need to implement siftUp at all.
Rather than starting with a trivial heap and repeatedly adding leaves, Floyd's algorithm starts with the leaves, observing that they are trivial but valid heaps by themselves, and then adds parents. Starting with the last parent node, at index iParent(n−1), and working backwards, each internal node is made the root of a valid heap by sifting down. The last step is sifting down the first element, after which the entire array obeys the heap property.
The worst-case number of comparisons during the Floyd's heap-construction phase of heapsort is known to be equal to 2n − 2s₂(n) − e₂(n), where s₂(n) is the number of 1 bits in the binary representation of n and e₂(n) is the number of trailing 0 bits.
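As a worked check of this formula (taking it as stated above): for n = 12, which is 1100 in binary, s₂(12) = 2 and e₂(12) = 2, so the worst case is 2·12 − 2·2 − 2 = 18 comparisons.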
The standard implementation of Floyd's heap-construction algorithm causes a large number of cache misses once the size of the data exceeds that of the CPU cache. Much better performance on large data sets can be obtained by merging in depth-first order, combining subheaps as soon as possible, rather than combining all subheaps on one level before proceeding to the one above.
Bottom-up heapsort
Bottom-up heapsort is a variant which reduces the number of comparisons required by a significant factor. While ordinary heapsort requires 2n log₂ n + O(n) comparisons worst-case and on average, the bottom-up variant requires n log₂ n + O(1) comparisons on average, and 1.5n log₂ n + O(n) in the worst case.
If comparisons are cheap (e.g. integer keys) then the difference is unimportant, as top-down heapsort compares values that have already been loaded from memory. If, however, comparisons require a function call or other complex logic, then bottom-up heapsort is advantageous.
This is accomplished by improving the siftDown procedure. The change improves the linear-time heap-building phase somewhat, but is more significant in the second phase. Like ordinary heapsort, each iteration of the second phase extracts the top of the heap, a[0], and fills the gap it leaves with a[end], then sifts this latter element down the heap. But this element comes from the lowest level of the heap, meaning it is one of the smallest elements in the heap, so the sift-down will likely take many steps to move it back down. In ordinary heapsort, each step of the sift-down requires two comparisons, to find the minimum of three elements: the new node and its two children.

Bottom-up heapsort instead finds the path of largest children to the leaf level of the tree (as if it were inserting −∞) using only one comparison per level. Put another way, it finds a leaf which has the property that it and all of its ancestors are greater than or equal to their siblings. (In the absence of equal keys, this leaf is unique.) Then, from this leaf, it searches ''upward'' (using one comparison per level) for the correct position in that path to insert a[end]. This is the same location as ordinary heapsort finds, and requires the same number of exchanges to perform the insert, but fewer comparisons are required to find that location.
Because it goes all the way to the bottom and then comes back up, it is called heapsort with bounce by some authors.
function leafSearch(a, i, end) is
    j ← i
    while iRightChild(j) ≤ end do
        ''(Determine which of j's two children is the greater)''
        if a[iRightChild(j)] > a[iLeftChild(j)] then
            j ← iRightChild(j)
        else
            j ← iLeftChild(j)
    ''(At the last level, there might be only one child)''
    if iLeftChild(j) ≤ end then
        j ← iLeftChild(j)
    return j
The return value of leafSearch is used in the modified siftDown routine:
procedure siftDown(a, i, end) is
    j ← leafSearch(a, i, end)
    while a[i] > a[j] do
        j ← iParent(j)
    x ← a[j]
    a[j] ← a[i]
    while j > i do
        swap x, a[iParent(j)]
        j ← iParent(j)
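In C, the leaf search and the modified sift-down might be sketched as follows (again assuming int keys; siftDownBottomUp is a name chosen here for clarity, and the final loop realizes the pseudocode's "swap x, a[iParent(j)]" by rotating the displaced values up the path):

    #include <stddef.h>

    static size_t iParent(size_t i)     { return (i - 1) / 2; }
    static size_t iLeftChild(size_t i)  { return 2 * i + 1; }
    static size_t iRightChild(size_t i) { return 2 * i + 2; }

    /* Follow the path of greater children from node i down to a leaf. */
    static size_t leafSearch(const int a[], size_t i, size_t end) {
        size_t j = i;
        while (iRightChild(j) <= end)    /* both children exist */
            j = (a[iRightChild(j)] > a[iLeftChild(j)]) ? iRightChild(j)
                                                       : iLeftChild(j);
        if (iLeftChild(j) <= end)        /* at the last level there may be only one child */
            j = iLeftChild(j);
        return j;
    }

    /* Bottom-up sift-down: descend to the leaf first, then climb back up
       to the insertion point and rotate the values along the path. */
    static void siftDownBottomUp(int a[], size_t i, size_t end) {
        size_t j = leafSearch(a, i, end);
        while (a[i] > a[j])              /* one comparison per level, climbing up */
            j = iParent(j);
        int x = a[j];                    /* value displaced by a[i] */
        a[j] = a[i];
        while (j > i) {                  /* push each ancestor down one step */
            size_t p = iParent(j);
            int t = a[p];
            a[p] = x;
            x = t;
            j = p;
        }
    }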
Bottom-up heapsort was announced as beating quicksort (with median-of-three pivot selection) on arrays of size ≥16000.
A 2008 re-evaluation of this algorithm showed it to be no faster than ordinary heapsort for integer keys, presumably because modern branch prediction nullifies the cost of the predictable comparisons which bottom-up heapsort manages to avoid.
A further refinement does a binary search in the path to the selected leaf, and sorts in a worst case of n log₂ n + O(n log log n) comparisons, approaching the information-theoretic lower bound of n log₂ n − 1.44n comparisons.
A variant which uses two extra bits per internal node (''n''−1 bits total for an ''n''-element heap) to cache information about which child is greater (two bits are required to store three cases: left, right, and unknown) uses less than n log₂ n + 1.1n compares.
Other variations
*Ternary heapsort uses a ternary heap instead of a binary heap; that is, each element in the heap has three children. It is more complicated to program, but does a constant number of times fewer swap and comparison operations. This is because each sift-down step in a ternary heap requires three comparisons and one swap, whereas in a binary heap two comparisons and one swap are required. Two levels in a ternary heap cover 3² = 9 elements, doing more work with the same number of comparisons as three levels in the binary heap, which only cover 2³ = 8. This is primarily of academic interest, or as a student exercise, as the additional complexity is not worth the minor savings, and bottom-up heapsort beats both.
*Memory-optimized heapsort improves heapsort's locality of reference by increasing the number of children even more. This increases the number of comparisons, but because all children are stored consecutively in memory, reduces the number of cache lines accessed during heap traversal, a net performance improvement.
*Out-of-place heapsort improves on bottom-up heapsort by eliminating the worst case, guaranteeing n log₂ n + O(n) comparisons. When the maximum is taken, rather than fill the vacated space with an unsorted data value, fill it with a sentinel value, which never "bounces" back up. It turns out that this can be used as a primitive in an in-place (and non-recursive) "QuickHeapsort" algorithm. First, you perform a quicksort-like partitioning pass, but reversing the order of the partitioned data in the array. Suppose (without loss of generality) that the smaller partition is the one greater than the pivot, which should go at the end of the array, but our reversed partitioning step places it at the beginning. Form a heap out of the smaller partition and do out-of-place heapsort on it, exchanging the extracted maxima with values from the end of the array. These are less than the pivot, meaning less than any value in the heap, so serve as sentinel values. Once the heapsort is complete (and the pivot moved to just before the now-sorted end of the array), the order of the partitions has been reversed, and the larger partition at the beginning of the array may be sorted in the same way. (Because there is no non-tail recursion, this also eliminates quicksort's O(log n) stack usage.)
*The smoothsort algorithm is a variation of heapsort developed by Edsger W. Dijkstra in 1981. Like heapsort, smoothsort's upper bound is O(n log n). The advantage of smoothsort is that it comes closer to O(n) time if the input is already sorted to some degree, whereas heapsort averages O(n log n) regardless of the initial sorted state. Due to its complexity, smoothsort is rarely used.
*Levcopoulos and Petersson describe a variation of heapsort based on a heap of Cartesian trees. First, a Cartesian tree is built from the input in O(n) time, and its root is placed in a 1-element binary heap. Then we repeatedly extract the minimum from the binary heap, output the tree's root element, and add its left and right children (if any), which are themselves Cartesian trees, to the binary heap. As they show, if the input is already nearly sorted, the Cartesian trees will be very unbalanced, with few nodes having left and right children, resulting in the binary heap remaining small, and allowing the algorithm to sort more quickly than O(n log n) for inputs that are already nearly sorted.
* Several variants such as weak heapsort require n log₂ n + O(n) comparisons in the worst case, close to the theoretical minimum, using one extra bit of state per node. While this extra bit makes the algorithms not truly in-place, if space for it can be found inside the element, these algorithms are simple and efficient, but still slower than binary heaps if key comparisons are cheap enough (e.g. integer keys) that a constant factor does not matter.
* Katajainen's "ultimate heapsort" requires no extra storage, performs comparisons, and a similar number of element moves. It is, however, even more complex and not justified unless comparisons are very expensive.
Comparison with other sorts
Heapsort primarily competes with quicksort, another very efficient general-purpose, in-place, comparison-based sort algorithm.
Heapsort's primary advantages are its simple, non-recursive code, minimal auxiliary storage requirement, and reliably good performance: its best and worst cases are within a small constant factor of each other, and of the theoretical lower bound on comparison sorts. While it cannot do better than O(n log n) for pre-sorted inputs, it does not suffer from quicksort's O(n²) worst case, either. (The latter can be avoided with careful implementation, but that makes quicksort far more complex, and one of the most popular solutions, introsort, uses heapsort for the purpose.)
Its primary disadvantages are its poor locality of reference and its inherently serial nature; the accesses to the implicit tree are widely scattered and mostly random, and there is no straightforward way to convert it to a parallel algorithm.
This makes it popular in embedded systems, real-time computing, and systems concerned with maliciously chosen inputs, such as the Linux kernel. It is also a good choice for any application which does not expect to be bottlenecked on sorting.
A well-implemented quicksort is usually 2–3 times faster than heapsort. Although quicksort requires fewer comparisons, this is a minor factor. (Results claiming twice as many comparisons are measuring the top-down version of heapify; see the bottom-up heapsort section above.) The main advantage of quicksort is its much better locality of reference: partitioning is a linear scan with good spatial locality, and the recursive subdivision has good temporal locality. With additional effort, quicksort can also be implemented in mostly branch-free code, and multiple CPUs can be used to sort subpartitions in parallel. Thus, quicksort is preferred when the additional performance justifies the implementation effort.
The other major O(n log n) sorting algorithm is merge sort, but that rarely competes directly with heapsort because it is not in-place. Merge sort's requirement for Ω(n) extra space (roughly half the size of the input) is usually prohibitive except in the situations where merge sort has a clear advantage:
* When a stable sort is required
* When taking advantage of (partially) pre-sorted input
* Sorting linked lists (in which case merge sort requires minimal extra space)
* Parallel sorting; merge sort parallelizes even better than quicksort and can easily achieve close to linear speedup
* External sorting; merge sort has excellent locality of reference
Example
Consider an unsorted list that we want to sort from the smallest to the largest. (Note, for the 'Build the heap' step: larger nodes do not stay below smaller node parents; they are swapped with their parents, and then recursively checked to see whether another swap is needed, to keep larger numbers above smaller numbers in the heap's binary tree.)
Build the heap
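As an illustration, assume the input list { 6, 5, 3, 1, 8, 7, 2, 4 } (a sample chosen here; any list is processed the same way). heapify sifts down each parent node, from the last parent back to the root:

    start = 3: sift down a[3] = 1; swap with child a[7] = 4  →  6 5 3 4 8 7 2 1
    start = 2: sift down a[2] = 3; swap with child a[5] = 7  →  6 5 7 4 8 3 2 1
    start = 1: sift down a[1] = 5; swap with child a[4] = 8  →  6 8 7 4 5 3 2 1
    start = 0: sift down a[0] = 6; swap with child a[1] = 8  →  8 6 7 4 5 3 2 1

leaving the max heap 8 6 7 4 5 3 2 1.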
Sorting
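Continuing the assumed example, each iteration swaps the root with a[end], shrinks the heap by one, and sifts the new root down; the sorted suffix is shown after the bar:

    swap 8 and 1; sift 1 down  →  7 6 3 4 5 1 2 | 8
    swap 7 and 2; sift 2 down  →  6 5 3 4 2 1 | 7 8
    swap 6 and 1; sift 1 down  →  5 4 3 1 2 | 6 7 8
    swap 5 and 2; sift 2 down  →  4 2 3 1 | 5 6 7 8
    swap 4 and 1; sift 1 down  →  3 2 1 | 4 5 6 7 8
    swap 3 and 1; sift 1 down  →  2 1 | 3 4 5 6 7 8
    swap 2 and 1               →  1 | 2 3 4 5 6 7 8

leaving the sorted array 1 2 3 4 5 6 7 8.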
External links
* http://www.codecodex.com/wiki/Heapsort – Heapsort implemented in 12 languages