Squarefree Word
   HOME

TheInfoList



OR:

In
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
, a squarefree word is a
word A word is a basic element of language that carries an semantics, objective or pragmatics, practical semantics, meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of w ...
(a sequence of symbols) that does not contain any squares. A square is a word of the form , where is not empty. Thus, a squarefree word can also be defined as a word that avoids the pattern .


Finite squarefree words


Binary alphabet

Over a binary
alphabet An alphabet is a standardized set of basic written graphemes (called letters) that represent the phonemes of certain spoken languages. Not all writing systems represent language in this way; in a syllabary, each character represents a syll ...
\, the only squarefree words are the empty word \epsilon,0,1,01,10,010, and 101.


Ternary alphabet

Over a ternary alphabet ''\'', there are infinitely many squarefree words. It is possible to count the number c(n) of ternary squarefree words of length . This number is bounded by c(n) = \Theta(\alpha^n) , where 1.3017597 < \alpha < 1.3017619 . The upper bound on \alpha can be found via
Fekete's Lemma In mathematics, subadditivity is a property of a function that states, roughly, that evaluating the function for the sum of two element (set), elements of the Domain of a function, domain always returns something less than or equal to the sum of the ...
and approximation by
automata An automaton (; plural: automata or automatons) is a relatively self-operating machine, or control mechanism designed to automatically follow a sequence of operations, or respond to predetermined instructions.Automaton – Definition and More ...
. The lower bound can be found by finding a substitution that preserves squarefreeness.


Alphabet with more than three letters

Since there are infinitely many squarefree words over three-letter alphabets, this implies there are also infinitely many squarefree words over an alphabet with more than three letters. The following table shows the exact growth rate of the -ary squarefree words:


2-dimensional words

Consider a map \textbf from \mathbb^2 to , where is an alphabet and \textbf is called a 2-dimensional word. Let w_ be the entry \textbf(m,n). A word \textbf is a line of \textbf if there exists i_1,i_2,j_1, j_2such that \text(j_1, j_2) = 1, and for t \ge 0, x_t = w_. Carpi proves that there exists a 2-dimensional word \textbf over a 16-letter alphabet such that every line of \textbf is squarefree. A computer search shows that there are no 2-dimensional words \textbfover a 7-letter alphabet, such that every line of \textbf is squarefree.


Generating finite squarefree words

Shur proposes an algorithm called R2F (random-t(w)o-free) that can generate a squarefree word of length over any alphabet with three or more letters. This algorithm is based on a modification of
entropy compression In mathematics and theoretical computer science, entropy compression is an information theory, information theoretic method for proving that a random process terminates, originally used by Robin Moser to prove an Algorithmic Lovász local lemma, alg ...
: it randomly selects letters from a k-letter alphabet to generate a -ary squarefree word. algorithm R2F is input: alphabet size k \ge 2'','' word length ''n > 1'' output: a -ary squarefree word of length . choose w /math> in \Sigma_ uniformly at random set \chi_w to ''w /math>'' followed by all other letters of \Sigma_ in increasing order set the number of iterations to 0 while , w, < n do choose in \Sigma_ uniformly at random append a = \chi_w +1/math> to the end of update \chi_w shifting the first elements to the right and setting \chi_w = a increment by if ends with a square of rank then delete the last letters of return Every (k+1)-ary squarefree word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of . The expected number of random k-ary letters used by Algorithm R2F to construct a -ary squarefree word of length isN=n\left(1 + \frac 2 + \frac 1 + \frac 4 + O\left(\frac 1 \right)\right)+O(1).Note that there exists an algorithm that can verify the squarefreeness of a word of length in O(n \log n) time. Apostolico and Preparata give an algorithm using suffix trees. Crochemore uses partitioning in his algorithm. Main and Lorentz provide an algorithm based on the divide-and-conquer method. A naive implementation may require ''O(n^2)'' time to verify the squarefreeness of a word of length .


Infinite squarefree words

There exist arbitrarily long squarefree words in any
alphabet An alphabet is a standardized set of basic written graphemes (called letters) that represent the phonemes of certain spoken languages. Not all writing systems represent language in this way; in a syllabary, each character represents a syll ...
with three or more letters, as proved by
Axel Thue Axel Thue (; 19 February 1863 – 7 March 1922) was a Norwegian mathematician, known for his original work in diophantine approximation and combinatorics. Work Thue published his first important paper in 1909. He stated in 1914 the so-called wo ...
.


Examples


First difference of the

Thue–Morse sequence In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence (an infinite sequence of 0s and 1s) obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus ...

One example of an infinite squarefree word over an alphabet of size 3 is the word over the alphabet \ obtained by taking the
first difference In mathematics, a recurrence relation is an equation according to which the nth term of a sequence of numbers is equal to some combination of the previous terms. Often, only k previous terms of the sequence appear in the equation, for a paramete ...
of the
Thue–Morse sequence In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence (an infinite sequence of 0s and 1s) obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus ...
. That is, from the Thue–Morse sequence : 0, 1 ,1, 0 ,1 ,0 ,0 ,1, 1 ,0 ,0 ,1, 0 ,1 ,1, 0 ... one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting squarefree word is : 1,0,-1,1,-1,0,1,0,-1,0,1,-1,1,0,-1,....


Leech Leeches are segmented parasitic or predatory worms that comprise the subclass Hirudinea within the phylum Annelida. They are closely related to the oligochaetes, which include the earthworm, and like them have soft, muscular segmented bodie ...
's
morphism In mathematics, particularly in category theory, a morphism is a structure-preserving map from one mathematical structure to another one of the same type. The notion of morphism recurs in much of contemporary mathematics. In set theory, morphisms a ...

Another example found by John Leech is defined recursively over the alphabet \. Let be any squarefree word starting with the letter . Define the words \ recursively as follows: the word w_ is obtained from by replacing each in with , each with , and each with . It is possible to prove that the sequence converges to the infinite squarefree word :


Generating infinite squarefree words

Infinite squarefree words can be generated by squarefree morphism. A morphism is called squarefree if the image of every squarefree word is squarefree. A morphism is called k–squarefree if the image of every squarefree word of length k is squarefree. Crochemore proves that a uniform morphism is squarefree if and only if it is 3-squarefree. In other words, is squarefree if and only if h(w) is squarefree for all squarefree of length 3. It is possible to find a squarefree morphism by
brute-force search In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
. algorithm squarefree_morphism is output: a squarefree morphism with the lowest possible rank . set k = 3 while True do set ''k_sf_words'' to the list of all squarefree words of length over a ternary alphabet for each h(0) in ''k_sf_words'' do for each h(1) in ''k_sf_words'' do for each h(2) in ''k_sf_words'' do if h(1) = h(2) then break from the current loop (advance to next h(1)) if h(0) \ne h(1) and h(2) \ne h(0) then if h(w) is squarefree for all squarefree of length then return h(0), h(1), h(2) increment by Over a ternary alphabet, there are exactly 144 uniform squarefree morphisms of rank 11 and no uniform squarefree morphisms with a lower rank than 11. To obtain an infinite squarefree words, start with any squarefree word such as , and successively apply a squarefree morphism to it. The resulting words preserve the property of squarefreeness. For example, let be a squarefree morphism, then as w \to \infty, h^(0) is an infinite squarefree word. Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is squarefree if and only if it is 5-squarefree.


Letter combinations in squarefree words


Avoid two-letter combinations

Over a ternary alphabet, a squarefree word of length more than 13 contains all the squarefree two-letter combinations. This can be proved by constructing a squarefree word without the two-letter combination . As a result, ' is the longest squarefree word without the combination and its length is equal to 13. Note that over a more than three-letter alphabet there are squarefree words of any length without an arbitrary two-letter combination.


Avoid three-letter combinations

Over a ternary alphabet, a squarefree word of length more than 36 contains all the squarefree three-letter combinations. However, there are squarefree words of any length without the three-letter combination . Note that over a more than three-letter alphabet there are squarefree words of any length without an arbitrary three-letter combination.


Density of a letter

The density of a letter in a finite word is defined as \frac where , w, _a is the number of occurrences of in w and , w, is the length of the word. The density of a letter in an infinite word is \liminf_\frac where w_l is the prefix of the word of length . The minimal density of a letter in an infinite ternary squarefree word is equal to \frac. The maximum density of a letter in an infinite ternary squarefree word is equal to \frac.


Notes


References

* * . * * {{cite book , last=Pytheas Fogg , first=N. , editor1=Berthé, Valérie, editor1-link=Valérie Berthé, editor2=Ferenczi, Sébastien, editor3=Mauduit, Christian, editor4=Siegel, Anne , title=Substitutions in dynamics, arithmetics and combinatorics , series=Lecture Notes in Mathematics , volume=1794 , location=Berlin , publisher=
Springer-Verlag Springer Science+Business Media, commonly known as Springer, is a German multinational publishing company of books, e-books and peer-reviewed journals in science, humanities, technical and medical (STM) publishing. Originally founded in 1842 in ...
, year=2002 , isbn=978-3-540-44141-0 , zbl=1014.11015 Formal languages Combinatorics on words