HOME

TheInfoList



OR:

The Ehrenfeucht–Mycielski sequence is a recursively defined sequence of binary digits with
pseudorandom A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Background The generation of random numbers has many uses, such as for random ...
properties, defined by .


Definition

The sequence starts with the single bit 0; each successive digit is formed by finding the longest suffix of the sequence that also occurs earlier within the sequence, and complementing the bit following the most recent earlier occurrence of that suffix. (The empty string is a suffix and prefix of every string.) For example, the first few steps of this construction process are: #0: initial bit #01: the suffix "" of "0" occurs earlier, most-recently followed by a 0, so add 1 #010: the suffix "" of "01" occurs earlier, most-recently followed by a 1, so add 0 #0100: the suffix "0" of "010" occurs earlier, most-recently followed by a 1, so add 0 #01001: the suffix "0" of "0100" occurs earlier, most-recently followed by a 0, so add 1 #010011: the suffix "01" of "01001" occurs earlier, most-recently followed by a 0, so add 1 #0100110: the suffix "1" of "010011" occurs earlier, most-recently followed by a 1, so add 0 The first few digits of the sequence are:


Algorithms

A naive algorithm that generates each bit in the sequence by comparing each suffix with the entire previous sequence could take as much as O(n^3) time to generate the first n bits of the sequence. However, using a data structure related to a
suffix tree In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow parti ...
, the sequence can be generated much more efficiently, in
constant time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by t ...
per generated digit.


Universality

The sequence is disjunctive, meaning that every finite subsequence of bits occurs contiguously, infinitely often within the sequence. More explicitly, the position by which every subsequence of length i can be guaranteed to have occurred at least j times is at most A(4i,j) where A is the
Ackermann function