De Bruijn Sequences

In combinatorial mathematics, a de Bruijn sequence of order ''n'' on a size-''k'' alphabet ''A'' is a cyclic sequence in which every possible length-''n'' string on ''A'' occurs exactly once as a substring (i.e., as a ''contiguous'' subsequence). Such a sequence is denoted by ''B''(''k'', ''n'') and has length k^n, which is also the number of distinct strings of length ''n'' on ''A''. Each of these distinct strings, when taken as a substring of ''B''(''k'', ''n''), must start at a different position, because substrings starting at the same position are not distinct. Therefore, ''B''(''k'', ''n'') must have ''at least'' k^n symbols. And since ''B''(''k'', ''n'') has ''exactly'' k^n symbols, de Bruijn sequences are optimally short with respect to the property of containing every string of length ''n'' at least once.

The number of distinct de Bruijn sequences ''B''(''k'', ''n'') is

:\dfrac{(k!)^{k^{n-1}}}{k^{n}}.

The sequences are named after the Dutch mathematician Nicolaas Govert de Bruijn, who wrote about them in 1946. As he later wrote, the existence of de Bruijn sequences for each order together with the above properties were first proved, for the case of alphabets with two elements, by Camille Flye Sainte-Marie (1894). The generalization to larger alphabets is due to Tatyana van Aardenne-Ehrenfest and de Bruijn (1951). Automata for recognizing these sequences are denoted as de Bruijn automata. In most applications, ''A'' = {0, 1}.
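As a quick sanity check on the counting formula, the following minimal Python sketch evaluates (k!)^(k^(n−1)) / k^n, the standard count of distinct ''B''(''k'', ''n''). The function name `count_de_bruijn` is an illustrative choice, not part of any library.

```python
from math import factorial


def count_de_bruijn(k: int, n: int) -> int:
    """Number of distinct de Bruijn sequences B(k, n), counted as cycles."""
    # (k!)^(k^(n-1)) is always divisible by k^n, so integer division is exact.
    return factorial(k) ** (k ** (n - 1)) // k ** n


for n in (3, 4, 5):
    print(n, count_de_bruijn(2, n))
```

For the binary alphabet this reduces to 2^(2^(n−1) − n), giving 2, 16 and 2048 sequences for ''n'' = 3, 4 and 5.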


History

The earliest known example of a de Bruijn sequence comes from Sanskrit prosody where, since the work of Pingala, each possible three-syllable pattern of long and short syllables is given a name, such as 'y' for short–long–long and 'm' for long–long–long. To remember these names, the mnemonic ''yamātārājabhānasalagām'' is used, in which each three-syllable pattern occurs starting at its name: 'yamātā' has a short–long–long pattern, 'mātārā' has a long–long–long pattern, and so on, until 'salagām' which has a short–short–long pattern. This mnemonic, equivalent to a de Bruijn sequence on binary 3-tuples, is of unknown antiquity, but is at least as old as Charles Philip Brown's 1869 book on Sanskrit prosody that mentions it and considers it "an ancient line, written by Pāṇini".

In 1894, A. de Rivière raised the question in an issue of the French problem journal ''L'Intermédiaire des Mathématiciens'' of the existence of a circular arrangement of zeroes and ones of size 2^n that contains all 2^n binary sequences of length n. The problem was solved (in the affirmative), along with the count of 2^(2^(n−1) − n) distinct solutions, by Camille Flye Sainte-Marie in the same year. This was largely forgotten, and Martin (1934) proved the existence of such cycles for general alphabet size in place of 2, with an algorithm for constructing them. Finally, when in 1944 Kees Posthumus conjectured the count 2^(2^(n−1) − n) for binary sequences, de Bruijn proved the conjecture in 1946, through which the problem became well-known.

Karl Popper independently describes these objects in his ''The Logic of Scientific Discovery'' (1934), calling them "shortest random-like sequences".


Examples

* Taking ''A'' = {0, 1}, there are two distinct ''B''(2, 3): 00010111 and 11101000, one being the reverse or negation of the other.
* Two of the 16 possible ''B''(2, 4) in the same alphabet are 0000100110101111 and 0000111101100101.
* Two of the 2048 possible ''B''(2, 5) in the same alphabet are 00000100011001010011101011011111 and 00000101001000111110111001101011.
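The defining property can be checked mechanically. The sketch below is a minimal Python illustration (the helper name `is_de_bruijn` is hypothetical): it unrolls the cycle by appending the first ''n'' − 1 symbols and confirms that all k^n windows are distinct.

```python
def is_de_bruijn(seq: str, alphabet: str, n: int) -> bool:
    """Return True if `seq`, read cyclically, contains every length-n
    string over `alphabet` exactly once."""
    k = len(set(alphabet))
    if len(seq) != k ** n or set(seq) - set(alphabet):
        return False
    wrapped = seq + seq[: n - 1]  # unroll the cycle
    windows = {wrapped[i : i + n] for i in range(len(seq))}
    # k**n windows, all distinct, over k symbols: every string occurs once.
    return len(windows) == k ** n


examples = {
    3: ["00010111", "11101000"],
    4: ["0000100110101111", "0000111101100101"],
    5: ["00000100011001010011101011011111",
        "00000101001000111110111001101011"],
}
for n, seqs in examples.items():
    for s in seqs:
        print(n, s, is_de_bruijn(s, "01", n))
```

Each of the six example sequences listed above passes this check.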


Construction

The de Bruijn sequences can be constructed by taking a Hamiltonian path of an ''n''-dimensional de Bruijn graph over ''k'' symbols (or equivalently, an Eulerian cycle of an (''n'' − 1)-dimensional de Bruijn graph). An alternative construction involves concatenating together, in lexicographic order, all the Lyndon words whose length divides ''n''. An inverse Burrows–Wheeler transform can be used to generate the required Lyndon words in lexicographic order. De Bruijn sequences can also be constructed using shift registers or via finite fields.


Example using de Bruijn graph

Goal: to construct a ''B''(2, 4) de Bruijn sequence of length 2^4 = 16 using an Eulerian cycle of the 3-dimensional (''n'' − 1 = 4 − 1 = 3) de Bruijn graph.

Each edge in this 3-dimensional de Bruijn graph corresponds to a sequence of four digits: the three digits that label the vertex that the edge is leaving followed by the one that labels the edge. If one traverses the edge labeled 1 from 000, one arrives at 001, thereby indicating the presence of the subsequence 0001 in the de Bruijn sequence. To traverse each edge exactly once is to use each of the 16 four-digit sequences exactly once.

For example, suppose we follow this Eulerian cycle through these vertices:

:000, 000, 001, 011, 111, 111, 110, 101, 011,
:110, 100, 001, 010, 101, 010, 100, 000.

These are the output sequences of length ''n'' = 4:

:0 0 0 0
:_ 0 0 0 1
:_ _ 0 0 1 1

This corresponds to the following de Bruijn sequence:

:0 0 0 0 1 1 1 1 0 1 1 0 0 1 0 1

The sixteen four-digit windows (one per edge) appear in the sequence in the following way, with the current window in braces:

:{0 0 0 0} 1 1 1 1 0 1 1 0 0 1 0 1
:0 {0 0 0 1} 1 1 1 0 1 1 0 0 1 0 1
:0 0 {0 0 1 1} 1 1 0 1 1 0 0 1 0 1
:0 0 0 {0 1 1 1} 1 0 1 1 0 0 1 0 1
:0 0 0 0 {1 1 1 1} 0 1 1 0 0 1 0 1
:0 0 0 0 1 {1 1 1 0} 1 1 0 0 1 0 1
:0 0 0 0 1 1 {1 1 0 1} 1 0 0 1 0 1
:0 0 0 0 1 1 1 {1 0 1 1} 0 0 1 0 1
:0 0 0 0 1 1 1 1 {0 1 1 0} 0 1 0 1
:0 0 0 0 1 1 1 1 0 {1 1 0 0} 1 0 1
:0 0 0 0 1 1 1 1 0 1 {1 0 0 1} 0 1
:0 0 0 0 1 1 1 1 0 1 1 {0 0 1 0} 1
:0 0 0 0 1 1 1 1 0 1 1 0 {0 1 0 1}
:0} 0 0 0 1 1 1 1 0 1 1 0 0 {1 0 1
:0 0} 0 0 1 1 1 1 0 1 1 0 0 1 {0 1
:0 0 0} 0 1 1 1 1 0 1 1 0 0 1 0 {1

...and then we return to the starting point. Each of the eight 3-digit sequences (corresponding to the eight vertices) appears exactly twice, and each of the sixteen 4-digit sequences (corresponding to the 16 edges) appears exactly once.
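The walk described above can be automated. The following is a minimal Python sketch, assuming an iterative form of Hierholzer's algorithm for finding the Eulerian cycle (the article does not prescribe a particular cycle-finding method, and the function name `de_bruijn_eulerian` is hypothetical). It is limited to ''k'' ≤ 10 so that each symbol prints as one digit, and the cycle it finds need not be the one traced by hand above.

```python
from itertools import product


def de_bruijn_eulerian(k: int, n: int) -> str:
    """Build a B(k, n) from an Eulerian cycle of the (n-1)-dimensional
    de Bruijn graph, using an iterative Hierholzer traversal."""
    if n == 1:
        return "".join(map(str, range(k)))
    # Each vertex is an (n-1)-tuple; its unused outgoing edge labels
    # are kept in a list and consumed as the walk proceeds.
    unused = {v: list(range(k)) for v in product(range(k), repeat=n - 1)}
    start = (0,) * (n - 1)
    stack = [(start, None)]  # (vertex, label of the edge that entered it)
    labels = []
    while stack:
        vertex, label_in = stack[-1]
        if unused[vertex]:
            label = unused[vertex].pop()
            stack.append((vertex[1:] + (label,), label))
        else:
            stack.pop()
            if label_in is not None:
                labels.append(label_in)
    # Hierholzer's algorithm emits the cycle in reverse order.
    return "".join(str(d) for d in reversed(labels))


print(de_bruijn_eulerian(2, 4))
```

Reading the edge labels along the cycle gives one symbol per edge, so the output has exactly k^n symbols.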


Example using inverse Burrows–Wheeler transform

Mathematically, an inverse Burrows–Wheeler transform on a word ''w'' generates a multi-set of equivalence classes consisting of strings and their rotations. These equivalence classes of strings each contain a Lyndon word as a unique minimum element, so the inverse Burrows–Wheeler transform can be considered to generate a set of Lyndon words. It can be shown that if we perform the inverse Burrows–Wheeler transform on a word ''w'' consisting of the size-''k'' alphabet repeated k^(n−1) times (so that it will produce a word the same length as the desired de Bruijn sequence), then the result will be the set of all Lyndon words whose length divides ''n''. It follows that arranging these Lyndon words in lexicographic order will yield a de Bruijn sequence ''B''(''k'',''n''), and that this will be the first de Bruijn sequence in lexicographic order.

The following method can be used to perform the inverse Burrows–Wheeler transform, using its ''standard permutation'':

# Sort the characters in the string ''w'', yielding a new string ''w''′.
# Position the string ''w''′ above the string ''w'', and map each letter's position in ''w''′ to its position in ''w'' while preserving order. This process defines the standard permutation.
# Write this permutation in cycle notation with the smallest position in each cycle first, and the cycles sorted in increasing order.
# For each cycle, replace each number with the corresponding letter from string ''w''′ in that position.
# Each cycle has now become a Lyndon word, and they are arranged in lexicographic order, so dropping the parentheses yields the first de Bruijn sequence.

For example, to construct the smallest ''B''(2,4) de Bruijn sequence of length 2^4 = 16, repeat the alphabet (ab) 8 times yielding ''w'' = abababababababab. Sort the characters in ''w'', yielding ''w''′ = aaaaaaaabbbbbbbb. Position ''w''′ above ''w'', and map each element in ''w''′ to the corresponding element in ''w''. Number the columns 1 to 16 so we can read the cycles of the permutation. Starting from the left, the standard permutation cycles are:

:(1) (2 3 5 9) (4 7 13 10) (6 11) (8 15 14 12) (16).

Then, replacing each number by the corresponding letter in ''w''′ from that column yields:

:(a) (aaab) (aabb) (ab) (abbb) (b).

These are all of the Lyndon words whose length divides 4, in lexicographic order, so dropping the parentheses gives ''B''(2,4) = aaaabaabbababbbb.
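The five steps above can be sketched in a few lines of Python. This is an illustrative implementation, not a reference one: the function name `de_bruijn_ibwt` is hypothetical, positions are 0-based rather than 1-based, and a stable sort stands in for the "preserving order" requirement of the standard permutation.

```python
def de_bruijn_ibwt(alphabet: str, n: int) -> str:
    """Smallest B(k, n) via the standard permutation of the word
    w = alphabet repeated k**(n-1) times."""
    k = len(alphabet)
    w = "".join(sorted(alphabet)) * k ** (n - 1)
    # Stable sort of positions by letter: perm[i] is the position in w
    # of the i-th letter of the sorted word w'.
    perm = sorted(range(len(w)), key=lambda i: w[i])
    sorted_w = [w[i] for i in perm]
    seen = [False] * len(w)
    out = []
    for start in range(len(w)):  # smallest position of each cycle first
        i = start
        while not seen[i]:
            seen[i] = True
            out.append(sorted_w[i])  # letter of w' at this position
            i = perm[i]
    return "".join(out)


print(de_bruijn_ibwt("ab", 4))  # aaaabaabbababbbb
```

Walking the start positions in increasing order visits the cycles sorted by their smallest element, which is what makes dropping the parentheses equivalent to plain concatenation.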


Algorithm

The following Python code calculates a de Bruijn sequence, given ''k'' and ''n'', based on an algorithm from Frank Ruskey's ''Combinatorial Generation''.

    from typing import Sequence, Union


    def de_bruijn(k: Union[Sequence[str], int], n: int) -> str:
        """de Bruijn sequence for alphabet k
        and subsequences of length n.
        """
        # Two kinds of alphabet input: an integer expands
        # to a list of digit strings as the alphabet.
        if isinstance(k, int):
            alphabet = list(map(str, range(k)))
        else:
            # Any other sequence is used as the alphabet as-is.
            alphabet = k
            k = len(k)

        a = [0] * k * n
        sequence = []

        def db(t, p):
            if t > n:
                # Emit the prefix only if its period divides n
                # (that is, if it is a Lyndon word whose length divides n).
                if n % p == 0:
                    sequence.extend(a[1 : p + 1])
            else:
                a[t] = a[t - p]
                db(t + 1, p)
                for j in range(a[t - p] + 1, k):
                    a[t] = j
                    db(t + 1, t)

        db(1, 1)
        return "".join(alphabet[i] for i in sequence)


    print(de_bruijn(2, 3))
    print(de_bruijn("abcd", 2))

which prints

    00010111
    aabacadbbcbdccdd

Note that these sequences are understood to "wrap around" in a cycle. For example, the first sequence contains 110 and 100 in this fashion.


Uses

De Bruijn cycles are of general use in neuroscience and psychology experiments that examine the effect of stimulus order upon neural systems, and can be specially crafted for use with functional magnetic resonance imaging.


Angle detection

The symbols of a de Bruijn sequence written around a circular object (such as a wheel of a robot) can be used to identify its angle by examining the ''n'' consecutive symbols facing a fixed point. This angle-encoding problem is known as the "rotating drum problem". Gray codes can be used as similar rotary positional encoding mechanisms, a method commonly found in rotary encoders.
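A minimal sketch of the angle lookup, assuming the symbols are spaced evenly around the wheel and that the sensor reads ''n'' consecutive symbols starting at the reference point. The function name `build_angle_table` and the choice of degrees are illustrative assumptions.

```python
from typing import Dict


def build_angle_table(sequence: str, n: int) -> Dict[str, float]:
    """Map each length-n window of a cyclic de Bruijn sequence to the
    angle (in degrees) at which that window starts on the wheel."""
    step = 360.0 / len(sequence)
    wrapped = sequence + sequence[: n - 1]  # unroll the cycle
    return {wrapped[i : i + n]: i * step for i in range(len(sequence))}


table = build_angle_table("00010111", 3)
print(table["101"])  # window starting at index 3 of 8: 135.0
```

Because every window occurs exactly once, the dictionary has no key collisions and each reading identifies a single position.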


Finding least- or most-significant set bit in a word

A de Bruijn sequence can be used to quickly find the index of the least significant set bit ("right-most 1") or the most significant set bit ("left-most 1") in a word using bitwise operations and multiplication. The following example uses a de Bruijn sequence to determine the index of the least significant set bit (equivalent to counting the number of trailing '0' bits) in a 32-bit unsigned integer:

    #include <stdint.h>

    uint8_t lowestBitIndex(uint32_t v)
    {
        static const uint8_t BitPositionLookup[32] = // hash table
        {
            0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
            31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
        };

        return BitPositionLookup[((uint32_t)((v & -v) * 0x077CB531U)) >> 27];
    }

The lowestBitIndex() function returns the index of the least-significant set bit in ''v'', or zero if ''v'' has no set bits. The constant 0x077CB531U in the expression is the ''B''(2, 5) sequence 0000 0111 0111 1100 1011 0101 0011 0001 (spaces added for clarity). The operation (v & -v) zeros all bits except the least-significant bit set, resulting in a new value which is a power of 2. This power of 2 is multiplied (arithmetic modulo 2^32) by the de Bruijn sequence, thus producing a 32-bit product in which the bit sequence of the 5 MSBs is unique for each power of 2. The 5 MSBs are shifted into the LSB positions to produce a hash code in the range [0, 31], which is then used as an index into hash table BitPositionLookup. The selected hash table value is the bit index of the least significant set bit in ''v''.

The following example determines the index of the most significant bit set in a 32-bit unsigned integer:

    uint32_t keepHighestBit(uint32_t n)
    {
        n |= (n >>  1);
        n |= (n >>  2);
        n |= (n >>  4);
        n |= (n >>  8);
        n |= (n >> 16);
        return n - (n >> 1);
    }

    uint8_t highestBitIndex(uint32_t v)
    {
        static const uint8_t BitPositionLookup[32] = { // hash table
            0, 1, 16, 2, 29, 17, 3, 22, 30, 20, 18, 11, 13, 4, 7, 23,
            31, 15, 28, 21, 19, 10, 12, 6, 14, 27, 9, 5, 26, 8, 25, 24,
        };

        return BitPositionLookup[(keepHighestBit(v) * 0x06EB14F9U) >> 27];
    }

In the above example an alternative de Bruijn sequence (0x06EB14F9U) is used, with corresponding reordering of array values. The choice of this particular de Bruijn sequence is arbitrary, but the hash table values must be ordered to match the chosen de Bruijn sequence. The keepHighestBit() function zeros all bits except the most-significant set bit, resulting in a value which is a power of 2, which is then processed as in the previous example.
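To see where the hash table values come from, the sketch below derives the table for a given 32-bit ''B''(2, 5) constant by hashing every power of two. It is an illustrative Python rendering, with hypothetical names `build_lookup` and `lowest_bit_index`, and it emulates 32-bit wrap-around with an explicit mask.

```python
MASK32 = 0xFFFFFFFF


def build_lookup(debruijn: int) -> list:
    """Derive the 32-entry table for a 32-bit B(2, 5) constant: entry h
    holds the bit index i whose power of two hashes to h."""
    table = [0] * 32
    for i in range(32):
        h = (((1 << i) * debruijn) & MASK32) >> 27  # top 5 bits of the product
        table[h] = i
    return table


def lowest_bit_index(v: int, debruijn: int = 0x077CB531) -> int:
    """Index of the least significant set bit of a 32-bit value."""
    v &= MASK32
    isolated = v & -v  # keep only the lowest set bit
    h = ((isolated * debruijn) & MASK32) >> 27
    return build_lookup(debruijn)[h]


print(build_lookup(0x077CB531))
print(lowest_bit_index(0b101000))  # 3
```

Running `build_lookup` on 0x077CB531 and on 0x06EB14F9 reproduces the two tables used in the C examples. In practice the table would be computed once and stored rather than rebuilt on every call.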


Brute-force attacks on locks

''B''(10, 3), read group by group in the order shown (a range such as 001–009 stands for 001, 002, ..., 009); appending "00" yields a string to brute-force a 3-digit combination lock:

* 0, 001–009, 011–019, 021–029, 031–039, 041–049, 051–059, 061–069, 071–079, 081–089, 091–099
* 1, 112–119, 122–129, 132–139, 142–149, 152–159, 162–169, 172–179, 182–189, 192–199
* 2, 223–229, 233–239, 243–249, 253–259, 263–269, 273–279, 283–289, 293–299
* 3, 334–339, 344–349, 354–359, 364–369, 374–379, 384–389, 394–399
* 4, 445–449, 455–459, 465–469, 475–479, 485–489, 495–499
* 5, 556–559, 566–569, 576–579, 586–589, 596–599
* 6, 667–669, 677–679, 687–689, 697–699
* 7, 778–779, 788–789, 798–799
* 8, 889, 899
* 9

A de Bruijn sequence can be used to shorten a brute-force attack on a PIN-like code lock that does not have an "enter" key and accepts the last ''n'' digits entered. For example, a digital door lock with a 4-digit code (each digit having 10 possibilities, from 0 to 9) would have ''B''(10, 4) solutions, with length 10 000. Therefore, at most 10 000 + 3 = 10 003 presses (as the solutions are cyclic) are needed to open the lock, whereas trying all codes separately would require 4 × 10 000 = 40 000 presses.
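The key-press count can be checked directly. The sketch below reuses the Lyndon-word recursion from the Algorithm section in a compact form; the names `de_bruijn_digits` and `lock_keypresses` are hypothetical, and the lock model (no "enter" key, last ''n'' digits compared) is the one assumed in the text.

```python
def de_bruijn_digits(k: int, n: int) -> str:
    """B(k, n) over digits 0..k-1 (k <= 10), by concatenating Lyndon words."""
    a = [0] * (n + 1)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1 : p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(map(str, out))


def lock_keypresses(k: int, n: int) -> str:
    """Linear key sequence that contains every n-digit code as a window."""
    cycle = de_bruijn_digits(k, n)
    return cycle + cycle[: n - 1]  # unroll the cycle


presses = lock_keypresses(10, 4)
print(len(presses))  # 10003
```

Appending the first ''n'' − 1 digits turns the cyclic sequence into a linear one, which is why the total is k^n + ''n'' − 1 presses rather than k^n.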


f-fold de Bruijn sequences

An ''f''-fold ''n''-ary de Bruijn sequence is an extension of the notion of an ''n''-ary de Bruijn sequence, such that the sequence of length fk^n contains every possible subsequence of length ''n'' exactly ''f'' times. For example, for n=2 the cyclic sequences 11100010 and 11101000 are two-fold binary de Bruijn sequences. The number of two-fold de Bruijn sequences, N_n, is N_1=2 for n=1; the other known numbers are N_2=5, N_3=72, and N_4=43768.
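The two example sequences can be verified by counting windows. The following is a small Python sketch with a hypothetical helper name, `is_f_fold_de_bruijn`, that checks both the length fk^n and the multiplicity of every length-''n'' string.

```python
from collections import Counter
from itertools import product


def is_f_fold_de_bruijn(seq: str, alphabet: str, n: int, f: int) -> bool:
    """Return True if every length-n string over `alphabet` occurs exactly
    f times as a window of `seq`, read cyclically."""
    if len(seq) != f * len(alphabet) ** n:
        return False
    wrapped = seq + seq[: n - 1]  # unroll the cycle
    counts = Counter(wrapped[i : i + n] for i in range(len(seq)))
    return all(counts["".join(w)] == f for w in product(alphabet, repeat=n))


print(is_f_fold_de_bruijn("11100010", "01", 2, 2))  # True
print(is_f_fold_de_bruijn("11101000", "01", 2, 2))  # True
```

With ''f'' = 1 the same check reduces to the ordinary de Bruijn property.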


De Bruijn torus

A de Bruijn torus is a toroidal array with the property that every ''k''-ary ''m''-by-''n'' matrix occurs exactly once. Such a pattern can be used for two-dimensional positional encoding in a fashion analogous to that described above for rotary encoding. Position can be determined by examining the ''m''-by-''n'' matrix directly adjacent to the sensor, and calculating its position on the de Bruijn torus.


De Bruijn decoding

Computing the position of a particular unique tuple or matrix in a de Bruijn sequence or torus is known as the de Bruijn decoding problem. Efficient decoding algorithms exist for special, recursively constructed sequences and extend to the two-dimensional case. De Bruijn decoding is of interest, for example, in cases where large sequences or tori are used for positional encoding.


See also

* Normal number
* Linear-feedback shift register
* ''n''-sequence
* BEST theorem
* Superpermutation




External links

* De Bruijn sequence
* CGI generator
* JavaScript generator and decoder: implementation of J. Tuliani's algorithm.
* http://debruijnsequence.org has many kinds of de Bruijn sequences.