HOME

TheInfoList



OR:

In mathematics and in particular in combinatorics, the Lehmer code is a particular way to
encode The Encyclopedia of DNA Elements (ENCODE) is a public research project which aims to identify functional elements in the human genome. ENCODE also supports further biomedical research by "generating community resources of genomics data, software ...
each possible
permutation In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or pr ...
of a sequence of ''n'' numbers. It is an instance of a scheme for numbering permutations and is an example of an
inversion Inversion or inversions may refer to: Arts * , a French gay magazine (1924/1925) * ''Inversion'' (artwork), a 2005 temporary sculpture in Houston, Texas * Inversion (music), a term with various meanings in music theory and musical set theory * ...
table. The Lehmer code is named in reference to
Derrick Henry Lehmer Derrick Henry "Dick" Lehmer (February 23, 1905 – May 22, 1991), almost always cited as D.H. Lehmer, was an American mathematician significant to the development of computational number theory. Lehmer refined Édouard Lucas' work in the 1930s an ...
, but the code had been known since 1888 at least.


The code

The Lehmer code makes use of the fact that there are :n!=n\times(n-1)\times\cdots\times2\times1 permutations of a sequence of ''n'' numbers. If a permutation ''σ'' is specified by the sequence (''σ''1, …, ''σ''''n'') of its images of 1, …, ''n'', then it is encoded by a sequence of ''n'' numbers, but not all such sequences are valid since every number must be used only once. By contrast the encodings considered here choose the first number from a set of ''n'' values, the next number from a fixed set of values, and so forth decreasing the number of possibilities until the last number for which only a single fixed value is allowed; ''every'' sequence of numbers chosen from these sets encodes a single permutation. While several
encoding In communications and information processing, code is a system of rules to convert information—such as a letter, word, sound, image, or gesture—into another form, sometimes shortened or secret, for communication through a communication ...
s can be defined, the Lehmer code has several additional useful properties; it is the sequence :L(\sigma)=(L(\sigma)_1,\ldots,L(\sigma)_n)\quad\text\quad L(\sigma)_i=\#\, in other words the term ''L''(''σ'')''i'' counts the number of terms in (''σ''1, …, ''σ''''n'') to the right of ''σ''''i'' that are smaller than it, a number between 0 and , allowing for different values. A pair of indices (''i'',''j'') with and is called an inversion of ''σ'', and ''L''(''σ'')''i'' counts the number of inversions (''i'',''j'') with ''i'' fixed and varying ''j''. It follows that is the total number of inversions of ''σ'', which is also the number of adjacent transpositions that are needed to transform the permutation into the identity permutation. Other properties of the Lehmer code include that the
lexicographical order In mathematics, the lexicographic or lexicographical order (also known as lexical order, or dictionary order) is a generalization of the alphabetical order of the dictionaries to sequences of ordered symbols or, more generally, of elements of a ...
of the encodings of two permutations is the same as that of their sequences (''σ''1, …, ''σ''''n''), that any value 0 in the code represents a right-to-left minimum in the permutation (i.e., a smaller than any to its right), and a value at position ''i'' similarly signifies a right-to-left maximum, and that the Lehmer code of ''σ'' coincides with the
factorial number system In combinatorics, the factorial number system, also called factoradic, is a mixed radix numeral system adapted to numbering permutations. It is also called factorial base, although factorials do not function as base, but as place value of digi ...
representation of its position in the list of permutations of ''n'' in lexicographical order (numbering the positions starting from 0). Variations of this encoding can be obtained by counting inversions (''i'',''j'') for fixed ''j'' rather than fixed ''i'', by counting inversions with a fixed smaller ''value'' rather than smaller index ''i'', or by counting non-inversions rather than inversions; while this does not produce a fundamentally different type of encoding, some properties of the encoding will change correspondingly. In particular counting inversions with a fixed smaller value gives the inversion table of ''σ'', which can be seen to be the Lehmer code of the inverse permutation.


Encoding and decoding

The usual way to prove that there are ''n''! different permutations of ''n'' objects is to observe that the first object can be chosen in different ways, the next object in different ways (because choosing the same number as the first is forbidden), the next in different ways (because there are now 2 forbidden values), and so forth. Translating this freedom of choice at each step into a number, one obtains an encoding algorithm, one that finds the Lehmer code of a given permutation. One need not suppose the objects permuted to be numbers, but one needs a
total ordering In mathematics, a total or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X: # a \leq a ( reflexiv ...
of the set of objects. Since the code numbers are to start from 0, the appropriate number to encode each object ''σ''''i'' by is the number of objects that were available at that point (so they do not occur before position ''i''), but which are smaller than the object ''σ''''i'' actually chosen. (Inevitably such objects must appear at some position , and (''i'',''j'') will be an inversion, which shows that this number is indeed ''L''(''σ'')''i''.) This number to encode each object can be found by direct counting, in several ways (directly counting inversions, or correcting the total number of objects smaller than a given one, which is its sequence number starting from 0 in the set, by those that are unavailable at its position). Another method which is in-place, but not really more efficient, is to start with the permutation of obtained by representing each object by its mentioned sequence number, and then for each entry ''x'', in order from left to right, correct the items to its right by subtracting 1 from all entries (still) greater than ''x'' (to reflect the fact that the object corresponding to ''x'' is no longer available). Concretely a Lehmer code for the permutation B,F,A,G,D,E,C of letters, ordered alphabetically, would first give the list of sequence numbers 1,5,0,6,3,4,2, which is successively transformed : \begin \mathbf1&5&0&6&3&4&2\\ 1&\mathbf4&0&5&2&3&1\\ 1&4&\mathbf0&4&2&3&1\\ 1&4&0&\mathbf3&1&2&0\\ 1&4&0&3&\mathbf1&2&0\\ 1&4&0&3&1&\mathbf1&0\\ 1&4&0&3&1&1&\mathbf0\\ \end where the final line is the Lehmer code (at each line one subtracts 1 from the larger entries to the right of the boldface element to form the next line). For decoding a Lehmer code into a permutation of a given set, the latter procedure may be reversed: for each entry ''x'', in order from right to left, correct the items to its right by adding 1 to all those (currently) greater than or equal to ''x''; finally interpret the resulting permutation of as sequence numbers (which amounts to adding 1 to each entry if a permutation of is sought). Alternatively the entries of the Lehmer code can be processed from left to right, and interpreted as a number determining the next choice of an element as indicated above; this requires maintaining a list of available elements, from which each chosen element is removed. In the example this would mean choosing element 1 from (which is B) then element 4 from (which is F), then element 0 from (giving A) and so on, reconstructing the sequence B,F,A,G,D,E,C.


Applications to combinatorics and probabilities


Independence of relative ranks

The Lehmer code defines a bijection from the
symmetric group In abstract algebra, the symmetric group defined over any set is the group whose elements are all the bijections from the set to itself, and whose group operation is the composition of functions. In particular, the finite symmetric group \ ...
''S''''n'' to the Cartesian product times -1times\cdots\times times /math>, where 'k''designates the ''k''-element set \. As a consequence, under the uniform distribution on ''S''''n'', the component ''L''(''σ'')''i'' defines a uniformly distributed
random variable A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
on , and these random variables are mutually
independent Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s * Independe ...
, because they are projections on different factors of a
Cartesian product In mathematics, specifically set theory, the Cartesian product of two sets ''A'' and ''B'', denoted ''A''×''B'', is the set of all ordered pairs where ''a'' is in ''A'' and ''b'' is in ''B''. In terms of set-builder notation, that is : A\tim ...
.


Number of right-to-left minima and maxima

Definition : In a sequence ''u(uk)1≤k≤n'', there is right-to-left minimum (resp. maximum) at rank ''k'' if ''uk'' is strictly smaller (resp. strictly bigger) than each element ''ui'' with ''i>k'', i.e., to its right. Let ''B(k)'' (resp. ''H(k)'') be the event "there is right-to-left minimum (resp. maximum) at rank ''k''", i.e. ''B(k)'' is the set of the permutations \scriptstyle\ \mathfrak_n which exhibit a right-to-left minimum (resp. maximum) at rank ''k''. We clearly have
\\Leftrightarrow\\quad\text\quad\\Leftrightarrow\.
Thus the number ''Nb(ω)'' (resp. ''Nh(ω)'') of right-to-left minimum (resp. maximum) for the permutation ''ω'' can be written as a sum of independent
Bernoulli random variable In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli,James Victor Uspensky: ''Introduction to Mathematical Probability'', McGraw-Hill, New York 1937, page 45 is the discrete probabili ...
s each with a respective parameter of 1/k :
N_b(\omega)=\sum_\ 1\!\!1_\quad\text\quad N_b(\omega)=\sum_\ 1\!\!1_.
Indeed, as ''L(k)'' follows the uniform law on \scriptstyle\ ![1,k!.html" ;"title=",k.html" ;"title="![1,k">![1,k!">,k.html" ;"title="![1,k">![1,k!
\mathbb(B(k))=\mathbb(L(k)=0)=\mathbb(H(k))=\mathbb(L(k)=k-1)=\tfrac1k.
The generating function for the Bernoulli random variable 1\!\!1_ is
G_k(s)=\frack,
therefore the generating function of ''Nb'' is
G(s)=\prod_^nG_k(s)\ =\ \frac
(using the
rising factorial In mathematics, the falling factorial (sometimes called the descending factorial, falling sequential product, or lower factorial) is defined as the polynomial :\begin (x)_n = x^\underline &= \overbrace^ \\ &= \prod_^n(x-k+1) = \prod_^(x-k) \,. \ ...
notation), which allows us to recover the product formula for the generating function of the
Stirling numbers of the first kind In mathematics, especially in combinatorics, Stirling numbers of the first kind arise in the study of permutations. In particular, the Stirling numbers of the first kind count permutations according to their number of cycles (counting fixed po ...
(unsigned).


The secretary problem

This is an optimal stop problem, a classic in decision theory, statistics and applied probabilities, where a random permutation is gradually revealed through the first elements of its Lehmer code, and where the goal is to stop exactly at the element k such as σ(k)=n, whereas the only available information (the k first values of the Lehmer code) is not sufficient to compute σ(k). In less mathematical words : a series of n applicants are interviewed one after the other. The interviewer must hire the best applicant, but must make his decision (“Hire” or “Not hire”) on the spot, without interviewing the next applicant (and ''a fortiori'' without interviewing all applicants). The interviewer thus knows the rank of the kth applicant, therefore, at the moment of making his kth decision, the interviewer knows only the k first elements of the Lehmer code whereas he would need to know all of them to make a well informed decision. To determine the optimal strategies (i.e. the strategy maximizing the probability of a win), the statistical properties of the Lehmer code are crucial. Allegedly,
Johannes Kepler Johannes Kepler (; ; 27 December 1571 – 15 November 1630) was a German astronomer, mathematician, astrologer, natural philosopher and writer on music. He is a key figure in the 17th-century Scientific Revolution, best known for his laws o ...
clearly exposed this
secretary problem The secretary problem demonstrates a scenario involving optimal stopping theory For French translation, secover storyin the July issue of ''Pour la Science'' (2009). that is studied extensively in the fields of applied probability, statistics, an ...
to a friend of his at a time when he was trying to make up his mind and choose one out eleven prospective brides as his second wife. His first marriage had been an unhappy one, having been arranged without himself being consulted, and he was thus very concerned that he could reach the right decision.


Similar concepts

Two similar vectors are in use. One of them is often called inversion vector, e.g. by
Wolfram Alpha WolframAlpha ( ) is an answer engine developed by Wolfram Research. It answers factual queries by computing answers from externally sourced data. WolframAlpha was released on May 18, 2009 and is based on Wolfram's earlier product Wolfram Mathe ...
. See also .


References


Bibliography

* *{{ Citation , last1=Knuth , first1=Don , title=The art of computer programming , volume=3 , publisher=Addison-Wesley , place=Reading , year=1981 , pages=12–13 Combinatorics Permutations Resampling (statistics)