Unavoidable Pattern
   HOME

TheInfoList



OR:

In
mathematics Mathematics is an area of knowledge that includes the topics of numbers, formulas and related structures, shapes and the spaces in which they are contained, and quantities and their changes. These topics are represented in modern mathematics ...
and
theoretical computer science Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory. It is difficult to circumsc ...
, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.


Definitions


Pattern

Like a word, a pattern (also called ''term'') is a sequence of symbols over some
alphabet An alphabet is a standardized set of basic written graphemes (called letters) that represent the phonemes of certain spoken languages. Not all writing systems represent language in this way; in a syllabary, each character represents a syll ...
. The minimum multiplicity of the pattern ''p'' is ''m(p)=\min(\mathrm(x):x \in p)'' where ''\mathrm(x)'' is the number of occurrence of symbol ''x'' in pattern ''p''. In other words, it is the number of occurrences in ''p'' of the least frequently occurring symbol in ''p''.


Instance

Given finite alphabets \Sigma and \Delta, a word x\in\Sigma^* is an instance of the pattern p\in\Delta^* if there exists a non-erasing semigroup morphism f:\Delta^*\rightarrow\Sigma^* such that f(p)=x, where \Sigma^* denotes the
Kleene star In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on sets of strings or on sets of symbols or characters. In mathematics, it is more commonly known as the free monoid ...
of \Sigma. Non-erasing means that f(a)\neq\varepsilon for all a\in\Delta, where \varepsilon denotes the
empty string In formal language theory, the empty string, or empty word, is the unique string of length zero. Formal theory Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special cas ...
.


Avoidance / Matching

A word w is said to ''match'', or ''encounter'', a pattern p if a factor (also called ''subword'' or ''
substring In formal language theory and computer science, a substring is a contiguous sequence of characters within a string. For instance, "''the best of''" is a substring of "''It was the best of times''". In contrast, "''Itwastimes''" is a subsequenc ...
'') of w is an instance of p. Otherwise, w is said to avoid p, or to be p-free. This definition can be generalized to the case of an infinite w, based on a generalized definition of "substring".


Avoidability / Unavoidability on a specific alphabet

A pattern p is ''unavoidable'' on a finite alphabet \Sigma if each sufficiently long word x\in\Sigma^* must match p; formally: if \exists n\in \Nu. \ \forall x\in\Sigma^*. \ (, x, \geq n \implies x \text p). Otherwise, p is ''avoidable'' on \Sigma, which implies there exist infinitely many words over the alphabet \Sigma that avoid p. By
Kőnig's lemma Kőnig's lemma or Kőnig's infinity lemma is a theorem in graph theory due to the Hungarian mathematician Dénes Kőnig who published it in 1927. It gives a sufficient condition for an infinite graph to have an infinitely long path. The computab ...
, pattern p is avoidable on \Sigma
if and only if In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicondi ...
there exists an
infinite word In formal language theory within theoretical computer science, an infinite word is an infinite-length sequence (specifically, an ω-length sequence) of symbols, and an ω-language is a set of infinite words. Here, ω refers to the first ordinal ...
w\in \Sigma^\omega that avoids p.


Maximal p-free word

Given a pattern p and an alphabet ''\Sigma''. A p-free word ''w\in\Sigma^*'' is a maximal p-free word over ''\Sigma'' if aw and wa match p \forall a\in\Sigma.


Avoidable / Unavoidable pattern

A pattern ''p'' is an unavoidable pattern (also called ''blocking term'') if ''p'' is unavoidable on any finite alphabet. If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.


''k''-avoidable / ''k''-unavoidable

A pattern ''p'' is ''k''-avoidable if ''p'' is avoidable on an alphabet ''\Sigma'' of size ''k''. Otherwise, ''p'' is ''k''-unavoidable, which means ''p'' is unavoidable on every alphabet of size ''k''. If pattern p is k-avoidable, then p is g-avoidable for all ''g\geq k''. Given a finite set of avoidable patterns S=\, there exists an infinite word w\in\Sigma^\omega such that w avoids all patterns of S. Let \mu(S) denote the size of the minimal alphabet \Sigma'such that \exist w' \in ^\omega avoiding all patterns of S.


Avoidability index

The avoidability index of a pattern ''p'' is the smallest ''k'' such that ''p'' is ''k''-avoidable, and ''\infin'' if ''p'' is unavoidable.


Properties

*A pattern ''q ''is avoidable if ''q ''is an instance of an avoidable pattern ''p''. *Let avoidable pattern ''p'' be a factor of pattern ''q'', then ''q'' is also avoidable. *A pattern ''q'' is unavoidable if and only if ''q'' is a factor of some unavoidable pattern ''p''. *Given an unavoidable pattern ''p'' and a symbol ''a'' not in ''p'', then ''pap'' is unavoidable. *Given an unavoidable pattern ''p'', then the reversal ''p^R'' is unavoidable. *Given an unavoidable pattern ''p'', there exists a symbol ''a'' such that ''a'' occurs exactly once in ''p''. *Let n\in\Nu represent the number of distinct symbols of pattern p. If , p, \geq2^n, then p is avoidable.


Zimin words

Given alphabet \Delta=\, Zimin words (patterns) are defined recursively Z_=Z_nx_Z_n for n\in\Zeta^+ and Z_1=x_1.


Unavoidability

All Zimin words are unavoidable. A word ''w'' is unavoidable if and only if it is a factor of a Zimin word. Given a finite alphabet \Sigma, let f(n,, \Sigma, ) represent the smallest m\in\Zeta^+ such that w matches Z_n for all w\in \Sigma^m. We have following properties: *f(1,q)=1 *f(2,q)=2q+1 *f(3,2)=29 *f(n,q)\leq=\underbrace_ Z_n is the longest unavoidable pattern constructed by alphabet \Delta=\ since , Z_n, =2^n -1.


Pattern reduction


Free letter

Given a pattern ''p'' over some alphabet \Delta, we say x\in\Delta is free for ''p'' if there exist subsets A,B of \Delta such that the following hold: #uv is a factor of ''p'' and ''u\in A'' ↔ uv is a factor of ''p'' and ''v\in B'' #''x\in A\backslash B\cup B\backslash A'' For example, let ''p=abcbab'', then ''b'' is free for ''p'' since there exist ''A=ac,B=b'' satisfying the conditions above.


Reduce

A pattern ''p\in \Delta^*'' reduces to pattern ''q'' if there exists a symbol ''x\in \Delta'' such that ''x'' is free for ''p'', and ''q'' can be obtained by removing all occurrence of ''x'' from ''p''. Denote this relation by p\stackrelq. For example, let ''p=abcbab'', then ''p'' can reduce to ''q=aca'' since ''b'' is free for ''p''.


Locked

A word ''w'' is said to be locked if ''w'' has no free letter; hence ''w'' can not be reduced.


Transitivity

Given patterns ''p,q,r'', if ''p'' reduces to ''q'' and ''q'' reduces to ''r'', then ''p'' reduces to ''r''. Denote this relation by p\stackrelr.


Unavoidability

A pattern ''p'' is unavoidable if and only if ''p'' reduces to a word of length one; hence ''\exist w'' such that '', w, =1'' and ''p\stackrelw''.


Graph pattern avoidance


Avoidance / Matching on a specific graph

Given a simple
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
''G=(V,E)'', a edge coloring ''c:E\rightarrow \Delta'' matches pattern ''p'' if there exists a simple
path A path is a route for physical travel – see Trail. Path or PATH may also refer to: Physical paths of different types * Bicycle path * Bridle path, used by people on horseback * Course (navigation), the intended path of a vehicle * Desire p ...
''P= _1,e_2,...,e_r/math>'' in ''G'' such that the sequence ''c(P)= (e_1),c(e_2),...,c(e_r)/math>'' matches ''p''. Otherwise, ''c'' is said to avoid ''p'' or be ''p''-free. Similarly, a vertex coloring ''c:V\rightarrow \Delta'' matches pattern ''p'' if there exists a simple
path A path is a route for physical travel – see Trail. Path or PATH may also refer to: Physical paths of different types * Bicycle path * Bridle path, used by people on horseback * Course (navigation), the intended path of a vehicle * Desire p ...
''P= _1,c_2,...,c_r/math>'' in ''G'' such that the sequence ''c(P)'' matches ''p''.


Pattern chromatic number

The pattern chromatic number \pi_p(G) is the minimal number of distinct colors needed for a ''p''-free vertex coloring ''c'' over the graph ''G''. Let ''\pi_p(n)=\max\'' where ''G_n'' is the set of all simple graphs with a maximum
degree Degree may refer to: As a unit of measurement * Degree (angle), a unit of angle measurement ** Degree of geographical latitude ** Degree of geographical longitude * Degree symbol (°), a notation used in science, engineering, and mathematics ...
no more than ''n''. Similarly, \pi_p'(G) and \pi_p'(n) are defined for edge colorings.


Avoidability / Unavoidability on graphs

A pattern ''p'' is avoidable on graphs if ''\pi_p(n)'' is bounded by ''c_p'', where ''c_p'' only depends on ''p''. * Avoidance on words can be expressed as a specific case of avoidance on graphs; hence a pattern p is avoidable on any finite alphabet if and only if \pi_p(P_n) \leq c_p for all n \in\Zeta ^+ , where P_n is a graph of n vertices concatenated.


Probabilistic bound on ''\pi_p(n)''

There exists an absolute constant c, such that ''\pi_p(n)\leq cn^\leq cn^2'' for all patterns ''p'' with ''m(p)\geq2''. Given a pattern p, let n represent the number of distinct symbols of p. If , p, \geq2^n, then p is avoidable on graphs.


Explicit colorings

Given a pattern p such that ''count_p(x)'' is even for all ''x\in p'', then \pi_p'(K_2^k)\leq2^k-1 for all ''k\geq 1'', where ''K_n'' is the
complete graph In the mathematical field of graph theory, a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge. A complete digraph is a directed graph in which every pair of distinct vertices is ...
of n vertices. Given a pattern p such that ''m(p)\geq2'', and an arbitrary
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
T, let S be the set of all avoidable subpatterns and their reflections of p. Then \pi_p(T)\leq 3\mu(S). Given a pattern p such that ''m(p)\geq2'', and a
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
T with degree ''n\geq2''. Let S be the set of all avoidable subpatterns and their reflections of p, then \pi_p'(T)\leq 2(n-1)\mu(S).


Examples

* The
Thue–Morse sequence In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence (an infinite sequence of 0s and 1s) obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus ...
is cube-free and overlap-free; hence it avoids the patterns ''xxx'' and ''xyxyx''. *A square-free word is one avoiding the pattern ''xx''. The word over the alphabet \ obtained by taking the
first difference In mathematics, a recurrence relation is an equation according to which the nth term of a sequence of numbers is equal to some combination of the previous terms. Often, only k previous terms of the sequence appear in the equation, for a paramete ...
of the Thue–Morse sequence is an example of an infinite square-free word. * The patterns ''x'' and ''xyx'' are unavoidable on any alphabet, since they are factors of the Zimin words. * The power patterns ''x^n'' for ''n\geq 3 '' are 2-avoidable. *All binary patterns can be divided into three categories: **\varepsilon,x,xyx are unavoidable. **xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy have avoidability index of 3. **others have avoidability index of 2. *abwbaxbcyaczca has avoidability index of 4, as well as other locked words. *abvacwbaxbcycdazdcd has avoidability index of 5. *The repetitive threshold RT(n) is the infimum of exponents k such that x^k is avoidable on an alphabet of size n. Also see Dejean's theorem.


Open problems

*Is there an avoidable pattern p such that the avoidability index of p is 6? *Given an arbitrarily pattern p, is there an algorithm to determine the avoidability index of p?


References

* * * * {{cite book , last=Pytheas Fogg , first=N. , editor1-first=Valérie , editor1-last = Berthé , editor1-link = Valérie Berthé, editor2-last = Ferenczi , editor2-first= Sébastien , editor3-last= Mauduit , editor3-first= Christian , editor4-last= Siegel , editor4-first= A. , title=Substitutions in dynamics, arithmetics and combinatorics , series=Lecture Notes in Mathematics , volume=1794 , location=Berlin , publisher=
Springer-Verlag Springer Science+Business Media, commonly known as Springer, is a German multinational publishing company of books, e-books and peer-reviewed journals in science, humanities, technical and medical (STM) publishing. Originally founded in 1842 in ...
, year=2002 , isbn=3-540-44141-7 , zbl=1014.11015 Semigroup theory Formal languages Combinatorics on words