Multi-key Quicksort
   HOME

TheInfoList



OR:

Multi-key quicksort, also known as three-way radix quicksort, is an
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
for
sorting Sorting refers to ordering data in an increasing or decreasing manner according to some linear relationship among the data items. # ordering: arranging items in a sequence ordered by some criterion; # categorizing: grouping items with similar pro ...
strings String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...
. This hybrid of
quicksort Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. Overall, it is slightly faster than ...
and
radix sort In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process i ...
was originally suggested by P. Shackleton, as reported in one of C.A.R. Hoare's seminal papers on
quicksort Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. Overall, it is slightly faster than ...
; its modern incarnation was developed by Jon Bentley and Robert Sedgewick in the mid-1990s. The algorithm is designed to exploit the property that in many problems, strings tend to have shared
prefixes A prefix is an affix which is placed before the Word stem, stem of a word. Adding it to the beginning of one word changes it into another word. For example, when the prefix ''un-'' is added to the word ''happy'', it creates the word ''unhappy'' ...
. One of the algorithm's uses is the construction of
suffix array In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics. Suffix arrays were introduced by as ...
s, for which it was one of the fastest algorithms as of 2004.


Description

The three-way radix quicksort algorithm sorts an array of (
pointers Pointer may refer to: Places * Pointer, Kentucky * Pointers, New Jersey * Pointers Airport, Wasco County, Oregon, United States * The Pointers, a pair of rocks off Antarctica People with the name * Pointer (surname), a surname (including a l ...
to) strings in
lexicographic order In mathematics, the lexicographic or lexicographical order (also known as lexical order, or dictionary order) is a generalization of the alphabetical order of the dictionaries to sequences of ordered symbols or, more generally, of elements of a ...
. It is assumed that all strings are of equal length ; if the strings are of varying length, they must be padded with extra elements that are less than any element in the strings. The pseudocode for the algorithm is then algorithm sort(a : array of string, d : integer) is if length(a) ≤ 1 or d ≥ K then return p := pivot(a, d) i, j := partition(a, d, p) ''(Note a simultaneous assignment of two variables.)'' sort(a[0:i), d) sort(a[i:j), d + 1) sort(a[j:length(a)), d) Unlike most string sorting algorithms that look at many bytes in a string to decide if a string is less than, the same as, or equal to some other string; and then turning its focus to some other pair of strings, the multi-key quicksort initially looks at only one byte of every string in the array, byte d, initially the first byte of every string. The recursive call uses a new value of d and passes a subarray where every string in the subarray has exactly the same initial part -- the characters before character d. The function must return a single character. Bentley and Sedgewick suggest either picking the median of or some random character in that range. The partition function is a variant of the one used in ordinary Quicksort#Repeated elements, three-way quicksort: it rearranges so that all of have an element at position that is less than , have at position , and strings from onward have a 'th element larger than . (The original partitioning function suggested by Bentley and Sedgewick may be slow in the case of repeated elements; a Dutch national flag partitioning can be used to alleviate this.) Practical implementations of multi-key quicksort can benefit from the same optimizations typically applied to quicksort: median-of-three pivoting, switching to
insertion sort Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Howev ...
for small arrays, etc.


See also

*
American flag sort An American flag sort is an efficient, in-place variant of radix sort that distributes items into buckets. Non-comparative sorting algorithms such as radix sort and American flag sort are typically used to sort large objects such as strings, for w ...
another radix sort variant that is fast for string sorting *
Ternary search tree In computer science, a ternary search tree is a type of trie (sometimes called a ''prefix tree'') where nodes are arranged in a manner similar to a binary search tree, but with up to three children rather than the binary tree's limit of two. Lik ...
three-way radix quicksort is
isomorphic In mathematics, an isomorphism is a structure-preserving mapping between two structures of the same type that can be reversed by an inverse mapping. Two mathematical structures are isomorphic if an isomorphism exists between them. The word is ...
to this data structure in the same way that quicksort is isomorphic to binary search trees


Notes


References

{{reflist, 30em Articles with example pseudocode Comparison sorts String sorting algorithms