Ehrenfeucht–Mycielski Sequence
   HOME

TheInfoList



OR:

The Ehrenfeucht–Mycielski sequence is a recursively defined sequence of binary digits with
pseudorandom A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic Determinism is a philosophical view, where all events are determined completely by previously exi ...
properties, defined by .


Definition

The sequence starts with the single bit 0; each successive digit is formed by finding the longest suffix of the sequence that also occurs earlier within the sequence, and complementing the bit following the most recent earlier occurrence of that suffix. (The empty string is a suffix and prefix of every string.) For example, the first few steps of this construction process are: #0: initial bit #01: the suffix "" of "0" occurs earlier, most-recently followed by a 0, so add 1 #010: the suffix "" of "01" occurs earlier, most-recently followed by a 1, so add 0 #0100: the suffix "0" of "010" occurs earlier, most-recently followed by a 1, so add 0 #01001: the suffix "0" of "0100" occurs earlier, most-recently followed by a 0, so add 1 #010011: the suffix "01" of "01001" occurs earlier, most-recently followed by a 0, so add 1 #0100110: the suffix "1" of "010011" occurs earlier, most-recently followed by a 1, so add 0 The first few digits of the sequence are:


Algorithms

A naive algorithm that generates each bit in the sequence by comparing each suffix with the entire previous sequence could take as much as O(n^3) time to generate the first n bits of the sequence. However, using a data structure related to a
suffix tree In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particu ...
, the sequence can be generated much more efficiently, in
constant time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
per generated digit.


Universality

The sequence is
disjunctive Disjunctive can refer to: * Disjunctive population, in population ecology, a group of plants or animals disconnected from the rest of its range * Disjunctive pronoun * Disjunctive set * Disjunctive sequence * Logical disjunction In logic, d ...
, meaning that every finite subsequence of bits occurs contiguously, infinitely often within the sequence. More explicitly, the position by which every subsequence of length i can be guaranteed to have occurred at least j times is at most A(4i,j) where A is the
Ackermann function In computability theory, the Ackermann function, named after Wilhelm Ackermann, is one of the simplest and earliest-discovered examples of a total computable function that is not primitive recursive. All primitive recursive functions are total ...
. Experimentally, however, each subsequence appears much earlier in this sequence than this upper bound would suggest: the position by which all length-i sequences occur, up to the limit of experimental testing, is close to the minimum possible value it could be, 2^i+i, the position by which a
de Bruijn sequence In combinatorial mathematics, a de Bruijn sequence of order ''n'' on a size-''k'' alphabet ''A'' is a cyclic sequence in which every possible length-''n'' string on ''A'' occurs exactly once as a substring (i.e., as a ''contiguous'' subseq ...
contains all length-i substrings.


Normality

conjecture that the numbers of 0 and 1 bits each converge to a limiting density of 1/2. That is, if f(i) denotes the number of 0 bits in the first i positions of the Ehrenfeucht–Mycielski sequence, then it should be the case that \lim_ \frac=\frac. More strongly,
I. J. Good Irving John Good (9 December 1916 – 5 April 2009)The Times of 16-apr-09, http://www.timesonline.co.uk/tol/comment/obituaries/article6100314.ece was a British mathematician who worked as a cryptologist at Bletchley Park with Alan Turing. Afte ...
suggests that the convergence rate of this limit should be significantly faster than that of a random binary sequence, for which (by the
law of the iterated logarithm In probability theory, the law of the iterated logarithm describes the magnitude of the fluctuations of a random walk. The original statement of the law of the iterated logarithm is due to A. Ya. Khinchin (1924). Another statement was given by A ...
) \frac=\frac+O\left(\sqrt\right). The Ehrenfeucht–Mycielski balance conjecture suggests that the binary number 0.01001101... (the number having the Ehrenfeucht–Mycielski sequence as its sequence of binary digits after the binary point) may be a
normal number In mathematics, a real number is said to be simply normal in an integer base b if its infinite sequence of digits is distributed uniformly in the sense that each of the b digit values has the same natural density 1/b. A number is said to b ...
in base 2. As of 2009 this conjecture remains unproven; however, it is known that every limit point of the sequence of values f(i)/i lies between 1/4 and 3/4 inclusive.


References


External links

* {{DEFAULTSORT:Ehrenfeucht-Mycielski sequence Binary sequences Pseudorandomness