Adaptive Heap Sort
   HOME

TheInfoList



OR:

In computer science, adaptive heap sort is a comparison-based
sorting algorithm In computer science, a sorting algorithm is an algorithm that puts elements of a List (computing), list into an Total order, order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending ...
of the adaptive sort family. It is a variant of heap sort that performs better when the data contains existing order. Published by Christos Levcopoulos and Ola Petersson in 1992, the algorithm utilizes a new measure of presortedness, ''Osc,'' as the number of oscillations. Instead of putting all the data into the heap as the traditional heap sort did, adaptive heap sort only take part of the data into the heap so that the run time will reduce significantly when the presortedness of the data is high.


Heapsort

Heap sort is a sorting algorithm that utilizes
binary heap A binary heap is a heap (data structure), heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964 as a data structure fo ...
data structure. The method treats an array as a complete
binary tree In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the ''left child'' and the ''right child''. That is, it is a ''k''-ary tree with . A recursive definition using set theor ...
and builds up a Max-Heap/Min-Heap to achieve sorting. It usually involves the following four steps. # Build a Max-Heap(Min-Heap): put all the data into the heap so that all nodes are either greater than or equal (less than or equal to for ''Min-Heap'') to each of its child nodes. # Swap the first element of the heap with the last element of the heap. # Remove the last element from the heap and put it at the end of the list. Adjust the heap so that the first element ends up at the right place in the heap. # Repeat Step 2 and 3 until the heap has only one element. Put this last element at the end of the list and output the list. The data in the list will be sorted. Below is a C/C++ implementation that builds up a Max-Heap and sorts the array after the heap is built. /* A C/C++ sample heap sort code that sort an array to an increasing order */ // A function that build up a max-heap binary tree void heapify(int array[], int start, int end) // heap_sort function void heap_sort (int array[], int len) int main()


Measures of presortedness

Measure of presortedness, Measures of presortedness measures the existing order in a given sequence. These measures of presortedness decides the number of data that will be put in to the heap during the sorting process as well as the lower bound of running time.


Oscillations (''Osc'')

For sequence X = \langle x_1, x_2, x_3, \dots ,x_n \rangle, ''Cross''(''xi'') is defined as the number edges of the line plot of ''X'' that are intersected by a horizontal line through the point (''i, xi''). Mathematically, it is defined as \mathit(x_i) = \\text1\leq i \leq n. The oscillation(''Osc'') of X is just the total number of intersections, defined as \mathit(x) = \textstyle \sum_^n \displaystyle\lVert \mathit(x_i) \rVert.


Other measures

Besides the original Osc measurement, other known measures include the number of inversions ''Inv'', the number of runs ''Runs'', the number of blocks ''Block'', and the measures ''Max'', ''Exc'' and ''Rem''. Most of these different measurements are related for adaptive heap sort. Some measures dominate the others: every Osc-optimal algorithm is Inv optimal and Runs optimal; every Inv-optimal algorithm is Max optimal; and every Block-optimal algorithm is Exc optimal and Rem optimal.


Algorithm

Adaptive heap sort is a variant of heap sort that seeks optimality (asymptotically optimal) with respect to the lower bound derived with the measure of presortedness by taking advantage of the existing order in the data. In heap sort, for a data X = \langle x_1, x_2, x_3, \dots ,x_n \rangle , we put all n elements into the heap and then keep extracting the maximum (or minimum) for n times. Since the time of each max-extraction action is the logarithmic in the size of the heap, the total running time of standard heap sort is Big O notation, \color O(n \log n). For adaptive heap sort, instead of putting all the elements into the heap, only the possible maximums of the data (max-candidates) will be put into the heap so that fewer runs are required when each time we try to locate the maximum (or minimum). First, a
Cartesian tree In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees fro ...
is built from the input in O(n) time by putting the data into a binary tree and making each node in the tree is greater(or smaller) than all its children nodes, and the root of the Cartesian tree is inserted into an empty binary heap. Then repeatedly extract the maximum from the binary heap, retrieve the maximum in the Cartesian tree, and add its left and right children (if any) which are themselves Cartesian trees, to the binary heap. If the input is already nearly sorted, the Cartesian trees will be very unbalanced, with few nodes having left and right children, resulting in the binary heap remaining small, and allowing the algorithm to sort more quickly than O(n\log n) for inputs that are already nearly sorted. Below is an implementation in pseudo-code: Input: an array of n elements that need to be sorted Construct the Cartesian tree ''l''(''x'') Insert the root of ''l''(''x'') into a heap for i = from 1 to n


Drawbacks

Despite decades of research, there's still a gap between the theory of adaptive heap sort and its practical use. Because the algorithm makes use of Cartesian trees and pointer manipulation, it has low cache-efficiency and high memory requirements, both of which deteriorate the performance of implementations.


See also

*
Adaptive sort A sorting algorithm falls into the adaptive sort family if it takes advantage of existing order in its input. It benefits from the presortedness in the input sequence – or a limited amount of disorder for various definitions of measures of diso ...
*
Heapsort In computer science, heapsort is an efficient, comparison-based sorting algorithm that reorganizes an input array into a heap (a data structure where each node is greater than its children) and then repeatedly removes the largest node from that ...
*
Cartesian tree In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees fro ...


References

{{reflist Comparison sorts Heaps (data structures)