HOME

TheInfoList



OR:

In
combinatorial mathematics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
and extremal set theory, the Sauer–Shelah lemma states that every
family of sets In set theory and related branches of mathematics, a collection F of subsets of a given set S is called a family of subsets of S, or a family of sets over S. More generally, a collection of any sets whatsoever is called a family of sets, set fam ...
with small
VC dimension VC may refer to: Military decorations * Victoria Cross, a military decoration awarded by the United Kingdom and also by certain Commonwealth nations ** Victoria Cross for Australia ** Victoria Cross (Canada) ** Victoria Cross for New Zealand * Vic ...
consists of a small number of sets. It is named after Norbert Sauer and
Saharon Shelah Saharon Shelah ( he, שהרן שלח; born July 3, 1945) is an Israeli mathematician. He is a professor of mathematics at the Hebrew University of Jerusalem and Rutgers University in New Jersey. Biography Shelah was born in Jerusalem on July 3, ...
, who published it independently of each other in 1972. The same result was also published slightly earlier and again independently, by
Vladimir Vapnik Vladimir Naumovich Vapnik (russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machine ...
and
Alexey Chervonenkis Alexey Yakovlevich Chervonenkis (russian: link=no, Алексей Яковлевич Червоненкис; 7 September 1938 – 22 September 2014) was a Soviet and Russian mathematician. Along with Vladimir Vapnik, he was one of the main develo ...
, after whom the VC dimension is named. In his paper containing the lemma, Shelah gives credit also to
Micha Perles Micah (; ) is a given name. Micah is the name of several people in the Hebrew Bible ( Old Testament), and means "Who is like God?" The name is sometimes found with theophoric extensions. Suffix theophory in '' Yah'' and in ''Yahweh'' results in ...
, and for this reason the lemma has also been called the Perles–Sauer–Shelah lemma.. Buzaglo et al. call this lemma "one of the most fundamental results on VC-dimension", and it has applications in many areas. Sauer's motivation was in the
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
of set systems, while Shelah's was in
model theory In mathematical logic, model theory is the study of the relationship between formal theories (a collection of sentences in a formal language expressing statements about a mathematical structure), and their models (those structures in which the s ...
and that of Vapnik and Chervonenkis was in
statistics Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
. It has also been applied in
discrete geometry Discrete geometry and combinatorial geometry are branches of geometry that study combinatorial properties and constructive methods of discrete geometric objects. Most questions in discrete geometry involve finite or discrete sets of basic geome ...
. and
graph theory In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conne ...
..


Definitions and statement

If \textstyle \mathcal=\ is a family of sets, and T is another set, then T is said to be
shattered Shattered may refer to: Books * ''Shattered'' (Casey book), a 2010 non-fiction book: true-crime account of pregnant mother's murder * ''Shattered'' (Francis novel), a 2000 novel by Dick Francis: glassblower seeks videotape following death of j ...
by \mathcal if every subset of T (including the
empty set In mathematics, the empty set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set, while in other ...
and T itself) can be obtained as an intersection T\cap S_i between T and a set in the family. The
VC dimension VC may refer to: Military decorations * Victoria Cross, a military decoration awarded by the United Kingdom and also by certain Commonwealth nations ** Victoria Cross for Australia ** Victoria Cross (Canada) ** Victoria Cross for New Zealand * Vic ...
of \mathcal is the largest
cardinality In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
of a set shattered by \mathcal. In terms of these definitions, the Sauer–Shelah lemma states that if \mathcal is a family of sets with n distinct elements such that \textstyle , \mathcal, > \sum_^ , then \mathcal shatters a set of size k. Equivalently, if the VC dimension of \mathcal is k, then \mathcal can consist of at most \textstyle \sum_^ =O(n^k) sets. The bound of the lemma is tight: Let the family \mathcal be composed of all subsets of \ with size less than k. Then the size of \mathcal is exactly \textstyle \sum_^ but it does not shatter any set of size k..


The number of shattered sets

A strengthening of the Sauer–Shelah lemma, due to , states that every finite set family \mathcal shatters at least , \mathcal, sets. This immediately implies the Sauer–Shelah lemma, because only \sum_^ of the subsets of an n-item universe have cardinality less than k. Thus, when , \mathcal, >\sum_^ , there are not enough small sets to be shattered, so one of the shattered sets must have cardinality at least k. For a restricted type of shattered set, called an order-shattered set, the number of shattered sets always equals the cardinality of the set family.


Proof

Pajor's variant of the Sauer–Shelah lemma may be proved by
mathematical induction Mathematical induction is a method for proving that a statement ''P''(''n'') is true for every natural number ''n'', that is, that the infinitely many cases ''P''(0), ''P''(1), ''P''(2), ''P''(3), ...  all hold. Informal metaphors help ...
; the proof has variously been credited to
Noga Alon Noga Alon ( he, נוגה אלון; born 17 February 1956) is an Israeli mathematician and a professor of mathematics at Princeton University noted for his contributions to combinatorics and theoretical computer science, having authored hundreds of ...
or to
Ron Aharoni Ron Aharoni ( he, רון אהרוני ) (born 1952) is an Israeli mathematician, working in finite and infinite combinatorics. Aharoni is a professor at the Technion – Israel Institute of Technology, where he received his Ph.D. in mathematics in ...
and Ron Holzman.. Base: every family of only one set shatters the empty set. Step: Assume the lemma is true for all families of size less than , \mathcal, and let \mathcal be a family of two or more sets. Let x be an element that belongs to some but not all of the sets in \mathcal. Split \mathcal into two subfamilies, of the sets that contain x and the sets that do not contain x. By the induction assumption, these two subfamilies shatter two collections of sets whose sizes add to at least , \mathcal, . None of these shattered sets contain x, since a set that contains x cannot be shattered by a family in which all sets contain x or all sets do not contain x. Some of the shattered sets may be shattered by both subfamilies. When a set S is shattered by only one of the two subfamilies, it contributes one unit both to the number of shattered sets of the subfamily and to the number of shattered sets of \mathcal. When a set S is shattered by both subfamilies, both S and S\cup\ are shattered by \mathcal, so S contributes two units to the number of shattered sets of the subfamilies and of \mathcal. Therefore, the number of shattered sets of \mathcal is at least equal to the number shattered by the two subfamilies of \mathcal, which is at least , \mathcal, . A different proof of the Sauer–Shelah lemma in its original form, by Péter Frankl and
János Pach János Pach (born May 3, 1954) is a mathematician and computer scientist working in the fields of combinatorics and discrete and computational geometry. Biography Pach was born and grew up in Hungary. He comes from a noted academic family: his ...
, is based on
linear algebra Linear algebra is the branch of mathematics concerning linear equations such as: :a_1x_1+\cdots +a_nx_n=b, linear maps such as: :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrices. ...
and the
inclusion–exclusion principle In combinatorics, a branch of mathematics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as : , A \cup ...
.


Applications

The original application of the lemma, by Vapnik and Chervonenkis, was in showing that every probability distribution can be approximated (with respect to a family of events of a given VC dimension) by a finite set of sample points whose
cardinality In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
depends only on the VC dimension of the family of events. In this context, there are two important notions of approximation, both parameterized by a number ε: a set ''S'' of samples, and a probability distribution on ''S'', is said to be an ε-approximation of the original distribution if the probability of each event with respect to ''S'' differs from its original probability by at most ε. A set ''S'' of (unweighted) samples is said to be an ε-net if every event with probability at least ε includes at least one point of ''S''. An ε-approximation must also be an ε-net but not necessarily vice versa. Vapnik and Chervonenkis used the lemma to show that set systems of VC dimension ''d'' always have ε-approximations of cardinality O(\tfrac\log\tfrac). Later authors including . and similarly showed that there always exist ε-nets of cardinality O(\tfrac\log\tfrac), and more precisely of cardinality at most \tfrac\ln\tfrac+\tfrac\ln\ln\tfrac+\tfrac. The main idea of the proof of the existence of small ε-nets is to choose a random sample ''x'' of cardinality O(\tfrac\log\tfrac) and a second independent random sample ''y'' of cardinality O(\tfrac\log^2\tfrac), and to bound the probability that ''x'' is missed by some large event ''E'' by the probability that ''x'' is missed and simultaneously the intersection of ''y'' with ''E'' is larger than its median value. For any particular ''E'', the probability that ''x'' is missed while ''y'' is larger than its median is very small, and the Sauer–Shelah lemma (applied to x\cup y) shows that only a small number of distinct events ''E'' need to be considered, so by the
union bound In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individ ...
, with nonzero probability, ''x'' is an ε-net. In turn, ε-nets and ε-approximations, and the likelihood that a random sample of large enough cardinality has these properties, have important applications in
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
, in the area of
probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the A ...
. In
computational geometry Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
, they have been applied to
range searching In computer science, the range searching problem consists of processing a set ''S'' of objects, in order to determine which objects from ''S'' intersect with a query object, called the ''range''. For example, if ''S'' is a set of points correspond ...
, derandomization, and
approximation algorithm In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
s.. use generalizations of the Sauer–Shelah lemma to prove results in
graph theory In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conne ...
such as that the number of
strong orientation In graph theory, a strong orientation of an undirected graph is an assignment of a direction to each edge (an orientation) that makes it into a strongly connected graph. Strong orientations have been applied to the design of one-way road networks. ...
s of a given graph is sandwiched between its numbers of
connected Connected may refer to: Film and television * ''Connected'' (2008 film), a Hong Kong remake of the American movie ''Cellular'' * '' Connected: An Autoblogography About Love, Death & Technology'', a 2011 documentary film * ''Connected'' (2015 TV ...
and 2-edge-connected subgraphs.


See also

*
Growth function The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family. It is especially used in the context of statistical learning theory, where it measures the complexity of a hypothesis class. Th ...


References

{{DEFAULTSORT:Sauer-Shelah lemma Families of sets Lemmas