Analytic combinatorics
   HOME

TheInfoList



OR:

In
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many a ...
, the symbolic method is a technique for counting combinatorial objects. It uses the internal structure of the objects to derive formulas for their
generating function In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary serie ...
s. The method is mostly associated with Philippe Flajolet and is detailed in Part A of his book with Robert Sedgewick, '' Analytic Combinatorics'', while the rest of the book explains how to use complex analysis in order to get asymptotic and probabilistic results on the corresponding generating functions. During two centuries, generating functions were popping up via the corresponding recurrences on their coefficients (as can be seen in the seminal works of Bernoulli,
Euler Leonhard Euler ( , ; 15 April 170718 September 1783) was a Swiss mathematician, physicist, astronomer, geographer, logician and engineer who founded the studies of graph theory and topology and made pioneering and influential discoveries in ...
,
Arthur Cayley Arthur Cayley (; 16 August 1821 – 26 January 1895) was a prolific British mathematician who worked mostly on algebra. He helped found the modern British school of pure mathematics. As a child, Cayley enjoyed solving complex maths problem ...
, Schröder, Ramanujan, Riordan, Knuth, , etc.). It was then slowly realized that the generating functions were capturing many other facets of the initial discrete combinatorial objects, and that this could be done in a more direct formal way: The recursive nature of some combinatorial structures translates, via some isomorphisms, into noteworthy identities on the corresponding generating functions. Following the works of Pólya, further advances were thus done in this spirit in the 1970s with generic uses of languages for specifying combinatorial classes and their generating functions, as found in works by Foata and Schützenberger on permutations, Bender and Goldman on prefabs, and Joyal on combinatorial species. Note that this symbolic method in enumeration is unrelated to "Blissard's symbolic method", which is just another old name for
umbral calculus In mathematics before the 1970s, the term umbral calculus referred to the surprising similarity between seemingly unrelated polynomial equations and certain "shadowy" techniques used to "prove" them. These techniques were introduced by John Blis ...
. The symbolic method in combinatorics constitutes the first step of many analyses of combinatorial structures, which can then lead to fast computation schemes, to asymptotic properties and limit laws, to random generation, all of them being suitable to automatization via
computer algebra In mathematics and computer science, computer algebra, also called symbolic computation or algebraic computation, is a scientific area that refers to the study and development of algorithms and software for manipulating mathematical expressions ...
.


Classes of combinatorial structures

Consider the problem of distributing objects given by a generating function into a set of ''n'' slots, where a permutation group ''G'' of degree ''n'' acts on the slots to create an equivalence relation of filled slot configurations, and asking about the generating function of the configurations by weight of the configurations with respect to this equivalence relation, where the weight of a configuration is the sum of the weights of the objects in the slots. We will first explain how to solve this problem in the labelled and the unlabelled case and use the solution to motivate the creation of classes of combinatorial structures. The
Pólya enumeration theorem The Pólya enumeration theorem, also known as the Redfield–Pólya theorem and Pólya counting, is a theorem in combinatorics that both follows from and ultimately generalizes Burnside's lemma on the number of orbits of a group action on a set. ...
solves this problem in the unlabelled case. Let ''f''(''z'') be the
ordinary generating function In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary ser ...
(OGF) of the objects, then the OGF of the configurations is given by the substituted
cycle index Cycle, cycles, or cyclic may refer to: Anthropology and social sciences * Cyclic history, a theory of history * Cyclical theory, a theory of American political history associated with Arthur Schlesinger, Sr. * Social cycle, various cycles in soc ...
:Z(G)(f(z), f(z^2), \ldots, f(z^n)). In the labelled case we use an exponential generating function (EGF) ''g''(''z'') of the objects and apply the Labelled enumeration theorem, which says that the EGF of the configurations is given by :\frac. We are able to enumerate filled slot configurations using either PET in the unlabelled case or the labelled enumeration theorem in the labelled case. We now ask about the generating function of configurations obtained when there is more than one set of slots, with a permutation group acting on each. Clearly the orbits do not intersect and we may add the respective generating functions. Suppose, for example, that we want to enumerate unlabelled sequences of length two or three of some objects contained in a set ''X''. There are two sets of slots, the first one containing two slots, and the second one, three slots. The group acting on the first set is E_2, and on the second slot, E_3. We represent this by the following formal power series in ''X'': : X^2/E_2 \; + \; X^3/E_3 where the term X^n/G is used to denote the set of orbits under ''G'' and X^n = X \times \cdots \times X, which denotes in the obvious way the process of distributing the objects from ''X'' with repetition into the ''n'' slots. Similarly, consider the labelled problem of creating cycles of arbitrary length from a set of labelled objects ''X''. This yields the following series of actions of cyclic groups: : X/C_1 \; + \; X^2/C_2 \; + \; X^3/C_3 \; + \; X^4/C_4 \; + \cdots. Clearly we can assign meaning to any such power series of quotients (orbits) with respect to permutation groups, where we restrict the groups of degree ''n'' to the conjugacy classes \operatorname(S_n) of the symmetric group S_n, which form a unique factorization domain. (The orbits with respect to two groups from the same conjugacy class are isomorphic.) This motivates the following definition. A class \mathcal\in \mathbb mathfrak/math> of combinatorial structures is a formal series :\mathcal = \sum_ \sum_ c_G (X^n/G) where \mathfrak (the "A" is for "atoms") is the set of primes of the UFD \_ and c_G \in \mathbb. In the following we will simplify our notation a bit and write e.g. : E_2 + E_3 \text C_1 + C_2 + C_3 + \cdots. for the classes mentioned above.


The Flajolet–Sedgewick fundamental theorem

A theorem in the Flajolet–Sedgewick theory of symbolic combinatorics treats the enumeration problem of labelled and unlabelled combinatorial classes by means of the creation of symbolic operators that make it possible to translate equations involving combinatorial structures directly (and automatically) into equations in the generating functions of these structures. Let \mathcal\in\mathbb mathfrak/math> be a class of combinatorial structures. The OGF F(z) of \mathcal(X) where ''X'' has OGF f(z) and the EGF G(z) of \mathcal(X) where ''X'' is labelled with EGF g(z) are given by :F(z) = \sum_ \sum_ c_G Z(G)(f(z), f(z^2), \ldots, f(z^n)) and :G(z) = \sum_ \left(\sum_ \frac\right) g(z)^n. In the labelled case we have the additional requirement that ''X'' not contain elements of size zero. It will sometimes prove convenient to add one to G(z) to indicate the presence of one copy of the empty set. It is possible to assign meaning to both \mathcal\in\mathbb mathfrak/math> (the most common example is the case of unlabelled sets) and \mathcal\in\mathbb mathfrak To prove the theorem simply apply PET (Pólya enumeration theorem) and the labelled enumeration theorem. The power of this theorem lies in the fact that it makes it possible to construct operators on generating functions that represent combinatorial classes. A structural equation between combinatorial classes thus translates directly into an equation in the corresponding generating functions. Moreover, in the labelled case it is evident from the formula that we may replace g(z) by the atom ''z'' and compute the resulting operator, which may then be applied to EGFs. We now proceed to construct the most important operators. The reader may wish to compare with the data on the
cycle index Cycle, cycles, or cyclic may refer to: Anthropology and social sciences * Cyclic history, a theory of history * Cyclical theory, a theory of American political history associated with Arthur Schlesinger, Sr. * Social cycle, various cycles in soc ...
page.


The sequence operator

This operator corresponds to the class :1 + E_1 + E_2 + E_3 + \cdots and represents sequences, i.e. the slots are not being permuted and there is exactly one empty sequence. We have : F(z) = 1 + \sum_ Z(E_n)(f(z), f(z^2), \ldots, f(z^n)) = 1 + \sum_ f(z)^n = \frac and : G(z) = 1 + \sum_ \left(\frac\right) g(z)^n = \frac.


The cycle operator

This operator corresponds to the class :C_1 + C_2 + C_3 + \cdots i.e., cycles containing at least one object. We have : F(z) = \sum_ Z(C_n)(f(z), f(z^2), \ldots, f(z^n)) = \sum_ \frac \sum_ \varphi(d) f(z^d)^ or : F(z) = \sum_ \varphi(k) \sum_ \frac f(z^k)^m = \sum_ \frac \log \frac and : G(z) = \sum_ \left(\frac\right) g(z)^n = \log \frac. This operator, together with the set operator , and their restrictions to specific degrees are used to compute random permutation statistics. There are two useful restrictions of this operator, namely to even and odd cycles. The labelled even cycle operator is :C_2 + C_4 + C_6 + \cdots which yields : G(z) = \sum_ \left(\frac\right) g(z)^ = \frac \log \frac. This implies that the labelled odd cycle operator :C_1 + C_3 + C_5 + \cdots is given by : G(z) = \log \frac - \frac \log \frac = \frac \log \frac.


The multiset/set operator

The series is :1 + S_1 + S_2 + S_3 + \cdots i.e., the symmetric group is applied to the slots. This creates multisets in the unlabelled case and sets in the labelled case (there are no multisets in the labelled case because the labels distinguish multiple instances of the same object from the set being put into different slots). We include the empty set in both the labelled and the unlabelled case. The unlabelled case is done using the function :M(f(z), y) = \sum_ y^n Z(S_n)(f(z), f(z^2), \ldots, f(z^n)) so that :\mathfrak(f(z)) = M(f(z), 1). Evaluating M(f(z), 1) we obtain : F(z) = \exp \left( \sum_ \frac \right). For the labelled case we have : G(z) = 1 + \sum_ \left(\frac\right) g(z)^n = \sum_ \frac = \exp g(z). In the labelled case we denote the operator by , and in the unlabelled case, by . This is because in the labeled case there are no multisets (the labels distinguish the constituents of a compound combinatorial class) whereas in the unlabeled case there are multisets and sets, with the latter being given by : F(z) = \exp \left( \sum_ (-1)^ \frac \ell \right).


Procedure

Typically, one starts with the ''neutral class'' \mathcal, containing a single object of size 0 (the ''neutral object'', often denoted by \epsilon), and one or more ''atomic classes'' \mathcal, each containing a single object of size 1. Next,
set-theoretic Set theory is the branch of mathematical logic that studies sets, which can be informally described as collections of objects. Although objects of any kind can be collected into a set, set theory, as a branch of mathematics, is mostly concern ...
relations involving various simple operations, such as
disjoint union In mathematics, a disjoint union (or discriminated union) of a family of sets (A_i : i\in I) is a set A, often denoted by \bigsqcup_ A_i, with an injection of each A_i into A, such that the images of these injections form a partition of A ( ...
s,
products Product may refer to: Business * Product (business), an item that serves as a solution to a specific consumer problem. * Product (project management), a deliverable or set of deliverables that contribute to a business solution Mathematics * Produ ...
, sets,
sequence In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is called ...
s, and
multiset In mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set, allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that e ...
s define more complex classes in terms of the already defined classes. These relations may be recursive. The elegance of symbolic combinatorics lies in that the set theoretic, or ''symbolic'', relations translate directly into ''
algebra Algebra () is one of the broad areas of mathematics. Roughly speaking, algebra is the study of mathematical symbols and the rules for manipulating these symbols in formulas; it is a unifying thread of almost all of mathematics. Elementary ...
ic'' relations involving the generating functions. In this article, we will follow the convention of using script uppercase letters to denote combinatorial classes and the corresponding plain letters for the generating functions (so the class \mathcal has generating function A(z)). There are two types of generating functions commonly used in symbolic combinatorics—
ordinary generating function In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary ser ...
s, used for combinatorial classes of unlabelled objects, and exponential generating functions, used for classes of labelled objects. It is trivial to show that the generating functions (either ordinary or exponential) for \mathcal and \mathcal are E(z) = 1 and Z(z) = z, respectively. The disjoint union is also simple — for disjoint sets \mathcal and \mathcal, \mathcal = \mathcal \cup \mathcal implies A(z) = B(z) + C(z). The relations corresponding to other operations depend on whether we are talking about labelled or unlabelled structures (and ordinary or exponential generating functions).


Combinatorial sum

The restriction of unions to disjoint unions is an important one; however, in the formal specification of symbolic combinatorics, it is too much trouble to keep track of which sets are disjoint. Instead, we make use of a construction that guarantees there is no intersection (''be careful, however; this affects the semantics of the operation as well''). In defining the ''combinatorial sum'' of two sets \mathcal and \mathcal, we mark members of each set with a distinct marker, for example \circ for members of \mathcal and \bullet for members of \mathcal. The combinatorial sum is then: :\mathcal + \mathcal = (\mathcal \times \) \cup (\mathcal \times \) This is the operation that formally corresponds to addition.


Unlabelled structures

With unlabelled structures, an
ordinary generating function In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary ser ...
(OGF) is used. The OGF of a sequence A_n is defined as :A(x)=\sum_^\infty A_n x^n


Product

The product of two combinatorial classes \mathcal and \mathcal is specified by defining the size of an ordered pair as the sum of the sizes of the elements in the pair. Thus we have for a \in \mathcal and b \in \mathcal, , (a,b), = , a, + , b, . This should be a fairly intuitive definition. We now note that the number of elements in \mathcal \times \mathcal of size n is :\sum_^n A_k B_. Using the definition of the OGF and some elementary algebra, we can show that :\mathcal = \mathcal \times \mathcal implies A(z) = B(z) \cdot C(z).


Sequence

The ''sequence construction'', denoted by \mathcal = \mathfrak\ is defined as :\mathfrak\ = \mathcal + \mathcal + (\mathcal \times \mathcal) + (\mathcal \times \mathcal \times \mathcal) + \cdots. In other words, a sequence is the neutral element, or an element of \mathcal, or an ordered pair, ordered triple, etc. This leads to the relation :A(z) = 1 + B(z) + B(z)^ + B(z)^ + \cdots = \frac.


Set

The ''set'' (or ''powerset'') ''construction'', denoted by \mathcal = \mathfrak\ is defined as :\mathfrak\ = \prod_(\mathcal + \), which leads to the relation :\beginA(z) & = \prod_(1 + z^) \\ & = \prod_^\infty (1 + z^n)^ \\ & = \exp \left ( \ln \prod_^(1 + z^n)^ \right ) \\ & = \exp \left ( \sum_^\infty B_n \ln(1 + z^n) \right ) \\ & = \exp \left ( \sum_^\infty B_n \cdot \sum_^ \frac k \right ) \\ & = \exp \left ( \sum_^\infty \frac k \cdot \sum_^\infty B_n z^ \right ) \\ & = \exp \left ( \sum_^\infty \frac k \right), \end where the expansion :\ln(1 + u) = \sum_^\infty \frac k was used to go from line 4 to line 5.


Multiset

The ''multiset construction'', denoted \mathcal = \mathfrak\ is a generalization of the set construction. In the set construction, each element can occur zero or one times. In a multiset, each element can appear an arbitrary number of times. Therefore, :\mathfrak\ = \prod_ \mathfrak\. This leads to the relation :\begin A(z) & = \prod_ (1 - z^)^ \\ & = \prod_^\infty (1 - z^n)^ \\ & = \exp \left ( \ln \prod_^\infty (1 - z^n)^ \right ) \\ & = \exp \left ( \sum_^\infty-B_n \ln (1 - z^n) \right ) \\ & = \exp \left ( \sum_^\infty \frac k \right ), \end where, similar to the above set construction, we expand \ln (1 - z^n), swap the sums, and substitute for the OGF of \mathcal.


Other elementary constructions

Other important elementary constructions are: *the ''cycle construction'' (\mathfrak\), like sequences except that cyclic rotations are not considered distinct *''pointing'' (\Theta\mathcal), in which each member of is augmented by a neutral (zero size) pointer to one of its atoms *''substitution'' (\mathcal \circ \mathcal), in which each atom in a member of is replaced by a member of . The derivations for these constructions are too complicated to show here. Here are the results:


Examples

Many combinatorial classes can be built using these elementary constructions. For example, the class of plane
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
s (that is, trees embedded in the plane, so that the order of the subtrees matters) is specified by the recursive relation :\mathcal = \mathcal \times \operatorname\. In other words, a tree is a root node of size 1 and a sequence of subtrees. This gives :G(z) = \frac we solve for ''G''(''z'') by multiplying 1 - G(z) to get G(z) - G(z)^2 = z subtracting z and solving for G(z) using the quadratic formula gives :G(z) = \frac. Another example (and a classic combinatorics problem) is
integer partition In number theory and combinatorics, a partition of a positive integer , also called an integer partition, is a way of writing as a sum of positive integers. Two sums that differ only in the order of their summands are considered the same part ...
s. First, define the class of positive integers \mathcal, where the size of each integer is its value: :\mathcal = \mathcal \times \operatorname\ The OGF of \mathcal is then :I(z) = \frac. Now, define the set of partitions \mathcal as :\mathcal = \operatorname\. The OGF of \mathcal is :P(z) = \exp \left ( I(z) + \frac I(z^) + \frac I(z^) + \cdots \right ). Unfortunately, there is no closed form for P(z); however, the OGF can be used to derive a
recurrence relation In mathematics, a recurrence relation is an equation according to which the nth term of a sequence of numbers is equal to some combination of the previous terms. Often, only k previous terms of the sequence appear in the equation, for a parameter ...
, or using more advanced methods of analytic combinatorics, calculate the asymptotic behavior of the counting sequence.


Specification and specifiable classes

The elementary constructions mentioned above allow to define the notion of ''specification''. Those specification allow to use a set of recursive equations, with multiple combinatorial classes. Formally, a specification for a set of combinatorial classes (\mathcal A_1,\dots,\mathcal A_r) is a set of r equations \mathcal A_i=\Phi_i(\mathcal A_1,\dots,\mathcal A_r), where \Phi_i is an expression, whose atoms are \mathcal E,\mathcal Z and the \mathcal A_i's, and whose operators are the elementary constructions listed above. A class of combinatorial structures is said to be ''constructible'' or ''specifiable'' when it admits a specification. For example, the set of trees whose leaves' depth is even (respectively, odd) can be defined using the specification with two classes \mathcal A_\text and \mathcal A_\text. Those classes should satisfy the equation \mathcal A_\text=\mathcal Z\times \operatorname_\mathcal A_\text and \mathcal A_\text = \mathcal Z\times \operatorname\mathcal A_\text.


Labelled structures

An object is ''weakly labelled'' if each of its atoms has a nonnegative integer label, and each of these labels is distinct. An object is (''strongly'' or ''well'') ''labelled'', if furthermore, these labels comprise the consecutive integers \ldots n/math>. ''Note: some combinatorial classes are best specified as labelled structures or unlabelled structures, but some readily admit both specifications.'' A good example of labelled structures is the class of labelled graphs. With labelled structures, an exponential generating function (EGF) is used. The EGF of a sequence A_n is defined as :A(x)=\sum_^\infty A_n \frac.


Product

For labelled structures, we must use a different definition for product than for unlabelled structures. In fact, if we simply used the cartesian product, the resulting structures would not even be well labelled. Instead, we use the so-called ''labelled product'', denoted \mathcal \star \mathcal. For a pair \beta \in \mathcal and \gamma \in \mathcal, we wish to combine the two structures into a single structure. In order for the result to be well labelled, this requires some relabelling of the atoms in \beta and \gamma. We will restrict our attention to relabellings that are consistent with the order of the original labels. Note that there are still multiple ways to do the relabelling; thus, each pair of members determines not a single member in the product, but a set of new members. The details of this construction are found on the page of the Labelled enumeration theorem. To aid this development, let us define a function, \rho, that takes as its argument a (possibly weakly) labelled object \alpha and relabels its atoms in an order-consistent way so that \rho(\alpha) is well labelled. We then define the labelled product for two objects \alpha and \beta as :\alpha \star \beta = \. Finally, the labelled product of two classes \mathcal and \mathcal is :\mathcal \star \mathcal = \bigcup_ (\alpha \star \beta). The EGF can be derived by noting that for objects of size k and n-k, there are ways to do the relabelling. Therefore, the total number of objects of size n is :\sum_^n A_k B_. This ''binomial convolution'' relation for the terms is equivalent to multiplying the EGFs, :A(z) \cdot B(z).


Sequence

The ''sequence construction'' \mathcal = \mathfrak\ is defined similarly to the unlabelled case: :\mathfrak\ = \mathcal + \mathcal + (\mathcal \star \mathcal) + (\mathcal \star \mathcal \star \mathcal) + \cdots and again, as above, :A(z) = \frac


Set

In labelled structures, a set of k elements corresponds to exactly k! sequences. This is different from the unlabelled case, where some of the permutations may coincide. Thus for \mathcal = \mathfrak\, we have :A(z) = \sum_^\infty \frac = \exp(B(z))


Cycle

Cycles are also easier than in the unlabelled case. A cycle of length k corresponds to k distinct sequences. Thus for \mathcal = \mathfrak\, we have :A(z) = \sum_^\infty \frac = \ln\left(\frac 1 \right).


Boxed product

In labelled structures, the min-boxed product \mathcal_ = \mathcal^\star \mathcal is a variation of the original product which requires the element of \mathcal in the product with the minimal label. Similarly, we can also define a max-boxed product, denoted by \mathcal_ = \mathcal^\blacksquare \star \mathcal, by the same manner. Then we have, :A_(z)=A_(z)=\int^z_0 B'(t)C(t)\,dt. or equivalently, :A_'(t)=A_'(t)=B'(t)C(t).


Example

An increasing Cayley tree is a labelled non-plane and rooted tree whose labels along any branch stemming from the root form an increasing sequence. Then, let \mathcal be the class of such trees. The recursive specification is now \mathcal=\mathcal^\square\star \operatorname(\mathcal).


Other elementary constructions

The operators and represent cycles of even and odd length, and sets of even and odd cardinality.


Example

Stirling numbers of the second kind In mathematics, particularly in combinatorics, a Stirling number of the second kind (or Stirling partition number) is the number of ways to partition a set of ''n'' objects into ''k'' non-empty subsets and is denoted by S(n,k) or \textstyle \lef ...
may be derived and analyzed using the structural decomposition : \operatorname(\operatorname_(\mathcal)). The decomposition : \operatorname(\operatorname(\mathcal)) is used to study unsigned Stirling numbers of the first kind, and in the derivation of the statistics of random permutations. A detailed examination of the exponential generating functions associated to Stirling numbers within symbolic combinatorics may be found on the page on Stirling numbers and exponential generating functions in symbolic combinatorics.


See also

* Combinatorial species


References

{{reflist * François Bergeron, Gilbert Labelle, Pierre Leroux, ''Théorie des espèces et combinatoire des structures arborescentes'', LaCIM, Montréal (1994). English version: ''Combinatorial Species and Tree-like Structures'', Cambridge University Press (1998). * Philippe Flajolet and Robert Sedgewick, '' Analytic Combinatorics'', Cambridge University Press (2009). (available online: http://algo.inria.fr/flajolet/Publications/book.pdf) * Micha Hofri, ''Analysis of Algorithms: Computational Methods and Mathematical Tools'', Oxford University Press (1995). Combinatorics