In
combinatorial mathematics
Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
and
extremal set theory, the Sauer–Shelah lemma states that every
family of sets
In set theory and related branches of mathematics, a collection F of subsets of a given set S is called a family of subsets of S, or a family of sets over S. More generally, a collection of any sets whatsoever is called a family of sets, set fam ...
with small
VC dimension
VC may refer to:
Military decorations
* Victoria Cross, a military decoration awarded by the United Kingdom and also by certain Commonwealth nations
** Victoria Cross for Australia
** Victoria Cross (Canada)
** Victoria Cross for New Zealand
* Vic ...
consists of a small number of sets. It is named after
Norbert Sauer and
Saharon Shelah
Saharon Shelah ( he, שהרן שלח; born July 3, 1945) is an Israeli mathematician. He is a professor of mathematics at the Hebrew University of Jerusalem and Rutgers University in New Jersey.
Biography
Shelah was born in Jerusalem on July 3, ...
, who published it independently of each other in 1972. The same result was also published slightly earlier and again independently, by
Vladimir Vapnik
Vladimir Naumovich Vapnik (russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machine ...
and
Alexey Chervonenkis
Alexey Yakovlevich Chervonenkis (russian: link=no, Алексей Яковлевич Червоненкис; 7 September 1938 – 22 September 2014) was a Soviet and Russian mathematician. Along with Vladimir Vapnik, he was one of the main develo ...
, after whom the VC dimension is named. In his paper containing the lemma, Shelah gives credit also to
Micha Perles
Micah (; ) is a given name.
Micah is the name of several people in the Hebrew Bible ( Old Testament), and means "Who is like God?" The name is sometimes found with theophoric extensions. Suffix theophory in '' Yah'' and in ''Yahweh'' results in ...
, and for this reason the lemma has also been called the Perles–Sauer–Shelah lemma.
[.]
Buzaglo et al. call this lemma "one of the most fundamental results on VC-dimension",
and it has applications in many areas. Sauer's motivation was in the
combinatorics
Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
of set systems, while Shelah's was in
model theory
In mathematical logic, model theory is the study of the relationship between formal theories (a collection of sentences in a formal language expressing statements about a mathematical structure), and their models (those structures in which the s ...
and that of Vapnik and Chervonenkis was in
statistics
Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
. It has also been applied in
discrete geometry
Discrete geometry and combinatorial geometry are branches of geometry that study combinatorial properties and constructive methods of discrete geometric objects. Most questions in discrete geometry involve finite or discrete sets of basic geome ...
[.] and
graph theory
In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conne ...
.
[.]
Definitions and statement
If
is a family of sets, and
is another set, then
is said to be
shattered
Shattered may refer to:
Books
* ''Shattered'' (Casey book), a 2010 non-fiction book: true-crime account of pregnant mother's murder
* ''Shattered'' (Francis novel), a 2000 novel by Dick Francis: glassblower seeks videotape following death of j ...
by
if every subset of
(including the
empty set
In mathematics, the empty set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set, while in other ...
and
itself) can be obtained as an intersection
between
and a set in the family. The
VC dimension
VC may refer to:
Military decorations
* Victoria Cross, a military decoration awarded by the United Kingdom and also by certain Commonwealth nations
** Victoria Cross for Australia
** Victoria Cross (Canada)
** Victoria Cross for New Zealand
* Vic ...
of
is the largest
cardinality
In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
of a set shattered by
.
In terms of these definitions, the Sauer–Shelah lemma states that if
is a family of sets with
distinct elements such that
, then
shatters a set of size
. Equivalently, if the VC dimension of
is
then
can consist of at most
sets.
The bound of the lemma is tight: Let the family
be composed of all subsets of
with size less than
. Then the size of
is exactly
but it does not shatter any set of size
.
[.]
The number of shattered sets
A strengthening of the Sauer–Shelah lemma, due to , states that every finite set family
shatters at least
sets. This immediately implies the Sauer–Shelah lemma, because only
of the subsets of an
-item universe have cardinality less than
. Thus, when
, there are not enough small sets to be shattered, so one of the shattered sets must have cardinality at least
.
For a restricted type of shattered set, called an order-shattered set, the number of shattered sets always equals the cardinality of the set family.
Proof
Pajor's variant of the Sauer–Shelah lemma may be proved by
mathematical induction
Mathematical induction is a method for proving that a statement ''P''(''n'') is true for every natural number ''n'', that is, that the infinitely many cases ''P''(0), ''P''(1), ''P''(2), ''P''(3), ... all hold. Informal metaphors help ...
; the proof has variously been credited to
Noga Alon
Noga Alon ( he, נוגה אלון; born 17 February 1956) is an Israeli mathematician and a professor of mathematics at Princeton University noted for his contributions to combinatorics and theoretical computer science, having authored hundreds of ...
or to
Ron Aharoni
Ron Aharoni ( he, רון אהרוני ) (born 1952) is an Israeli mathematician, working in finite and infinite combinatorics. Aharoni is a professor at the Technion – Israel Institute of Technology, where he received his Ph.D. in mathematics in ...
and Ron Holzman.
[.]
Base: every family of only one set shatters the empty set.
Step: Assume the lemma is true for all families of size less than
and let
be a family of two or more sets. Let
be an element that belongs to some but not all of the sets in
. Split
into two subfamilies, of the sets that contain
and the sets that do not contain
.
By the induction assumption, these two subfamilies shatter two collections of sets whose sizes add to at least
.
None of these shattered sets contain
, since a set that contains
cannot be shattered by a family in which all sets contain
or all sets do not contain
.
Some of the shattered sets may be shattered by both subfamilies. When a set
is shattered by only one of the two subfamilies, it contributes one unit both to the number of shattered sets of the subfamily and to the number of shattered sets of
. When a set
is shattered by both subfamilies, both
and
are shattered by
, so
contributes two units to the number of shattered sets of the subfamilies and of
. Therefore, the number of shattered sets of
is at least equal to the number shattered by the two subfamilies of
, which is at least
.
A different proof of the Sauer–Shelah lemma in its original form, by
Péter Frankl and
János Pach
János Pach (born May 3, 1954) is a mathematician and computer scientist working in the fields of combinatorics and discrete and computational geometry.
Biography
Pach was born and grew up in Hungary. He comes from a noted academic family: his ...
, is based on
linear algebra
Linear algebra is the branch of mathematics concerning linear equations such as:
:a_1x_1+\cdots +a_nx_n=b,
linear maps such as:
:(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n,
and their representations in vector spaces and through matrices.
...
and the
inclusion–exclusion principle
In combinatorics, a branch of mathematics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as
: , A \cup ...
.
Applications
The original application of the lemma, by Vapnik and Chervonenkis, was in showing that every probability distribution can be approximated (with respect to a family of events of a given VC dimension) by a finite set of sample points whose
cardinality
In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
depends only on the VC dimension of the family of events. In this context, there are two important notions of approximation, both parameterized by a number ε: a set ''S'' of samples, and a probability distribution on ''S'', is said to be an ε-approximation of the original distribution if the probability of each event with respect to ''S'' differs from its original probability by at most ε. A set ''S'' of (unweighted) samples is said to be an
ε-net if every event with probability at least ε includes at least one point of ''S''. An ε-approximation must also be an ε-net but not necessarily vice versa.
Vapnik and Chervonenkis used the lemma to show that set systems of VC dimension ''d'' always have ε-approximations of cardinality
. Later authors including
[.] and similarly showed that there always exist ε-nets of cardinality
, and more precisely of cardinality at most
.
The main idea of the proof of the existence of small ε-nets is to choose a random sample ''x'' of cardinality
and a second independent random sample ''y'' of cardinality
, and to bound the probability that ''x'' is missed by some large event ''E'' by the probability that ''x'' is missed and simultaneously the intersection of ''y'' with ''E'' is larger than its median value. For any particular ''E'', the probability that ''x'' is missed while ''y'' is larger than its median is very small,
and the Sauer–Shelah lemma (applied to
) shows that only a small number of distinct events ''E'' need to be considered, so by the
union bound
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individ ...
, with nonzero probability, ''x'' is an ε-net.
In turn, ε-nets and ε-approximations, and the likelihood that a random sample of large enough cardinality has these properties, have important applications in
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
, in the area of
probably approximately correct learning
In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the A ...
. In
computational geometry
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
, they have been applied to
range searching
In computer science, the range searching problem consists of processing a set ''S'' of objects, in order to determine which objects from ''S'' intersect with a query object, called the ''range''. For example, if ''S'' is a set of points correspond ...
,
derandomization, and
approximation algorithm
In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
s.
[.]
use generalizations of the Sauer–Shelah lemma to prove results in
graph theory
In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conne ...
such as that the number of
strong orientation In graph theory, a strong orientation of an undirected graph is an assignment of a direction to each edge (an orientation) that makes it into a strongly connected graph.
Strong orientations have been applied to the design of one-way road networks. ...
s of a given graph is sandwiched between its numbers of
connected
Connected may refer to:
Film and television
* ''Connected'' (2008 film), a Hong Kong remake of the American movie ''Cellular''
* '' Connected: An Autoblogography About Love, Death & Technology'', a 2011 documentary film
* ''Connected'' (2015 TV ...
and
2-edge-connected subgraphs.
See also
*
Growth function The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family. It is especially used in the context of statistical learning theory, where it measures the complexity of a hypothesis class.
Th ...
References
{{DEFAULTSORT:Sauer-Shelah lemma
Families of sets
Lemmas