In Vapnik–Chervonenkis theory, the Vapnik–Chervonenkis (VC) dimension is a measure of the size (capacity, complexity, expressive power, richness, or flexibility) of a class of sets. The notion can be extended to classes of binary functions. It is defined as the cardinality of the largest set of points that the algorithm can shatter, which means the algorithm can always learn a perfect classifier for any labeling of at least one configuration of those data points. It was originally defined by Vladimir Vapnik and Alexey Chervonenkis.

Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the thresholding of a high-degree polynomial: if the polynomial evaluates above zero, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. But one can expect that the classifier will make errors on other points, because it is too wiggly. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. This notion of capacity is made rigorous below.
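The contrast between a thresholded high-degree polynomial and a thresholded linear function can be illustrated with a small sketch. The helper names (`lagrange_poly`, `poly_labelings`, `linear_labelings`) and the four sample points are assumptions made for this illustration only. The sketch counts how many of the 2^n labelings of n points on the real line each model can realise.

```python
from itertools import product


def lagrange_poly(xs, ys):
    """Return a function evaluating the degree-(n-1) polynomial through (xs[i], ys[i])."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


def poly_labelings(xs):
    """Labelings of xs realised by thresholding a degree-(len(xs)-1) polynomial at zero."""
    found = set()
    for labels in product([0, 1], repeat=len(xs)):
        # Interpolate +1 where the label is 1 and -1 where it is 0.
        p = lagrange_poly(xs, [1.0 if b else -1.0 for b in labels])
        found.add(tuple(int(p(x) > 0) for x in xs))
    return found


def linear_labelings(xs):
    """Labelings of xs realised by thresholding a*(x - c) at zero.

    A linear function of one variable changes sign at most once, so it is
    enough to try one cut point c in each gap between sorted points (and one
    beyond each end), with both orientations a = +1 and a = -1.
    """
    s = sorted(xs)
    cuts = [s[0] - 1] + [(u + v) / 2 for u, v in zip(s, s[1:])] + [s[-1] + 1]
    found = set()
    for c in cuts:
        for a in (1.0, -1.0):
            found.add(tuple(int(a * (x - c) > 0) for x in xs))
    return found


points = [0.0, 1.0, 2.0, 3.0]
print(len(poly_labelings(points)))    # 16: every labeling of the 4 points
print(len(linear_labelings(points)))  # 8: only labelings with a single sign change
```

On these four points the cubic realises all 16 labelings, while the linear threshold realises only 8; an alternating labeling such as (1, 0, 1, 0) is out of its reach.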

Definitions

VC dimension of a set-family

Let $\mathcal C$ be a family of sets (also called set family, collection of sets or set of sets) and $X$ a set. Their ''intersection'' is defined as the following set family:

: $\mathcal C \cap X := \{ C \cap X \mid C \in \mathcal C \}.$

Here typically $X$ and each $C \in \mathcal C$ are subsets of a big "universe" of possibilities $U$ where intersection takes place. We say that a set $X$ is ''shattered'' by $\mathcal C$ if

: $\mathcal P(X) = \mathcal C \cap X,$

i.e. the set of intersections contains (hence is equal to) all the subsets of $X$. For finite sets $X$ this is equivalent to

: $|\mathcal C \cap X| = 2^{|X|}.$

The ''VC dimension'' $D$ of $\mathcal C$ is the cardinality of the largest set that is shattered by $\mathcal C$. If arbitrarily large sets can be shattered, the VC dimension of $\mathcal C$ is $\infty$.
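For a finite family over a finite universe, the definition above can be checked by brute force. The following is a minimal sketch, not an efficient algorithm: the function names `restrict`, `is_shattered` and `vc_dimension` are chosen here for illustration, and the family is assumed to be non-empty (a non-empty family always shatters the empty set, so the result is at least 0).

```python
from itertools import combinations


def restrict(family, X):
    """Return the intersection family {C ∩ X : C in family}."""
    X = frozenset(X)
    return {frozenset(C) & X for C in family}


def is_shattered(family, X):
    """X is shattered when the intersections give all 2^|X| subsets of X."""
    return len(restrict(family, X)) == 2 ** len(set(X))


def vc_dimension(family, universe):
    """Size of the largest subset of `universe` shattered by `family`.

    Assumes a non-empty family. Every subset of a shattered set is itself
    shattered, so the search can stop at the first size k for which no
    k-element subset is shattered.
    """
    universe = list(set(universe))
    best = 0
    for k in range(1, len(universe) + 1):
        if any(is_shattered(family, X) for X in combinations(universe, k)):
            best = k
        else:
            break
    return best


# Example: all "intervals" {i, ..., j-1} inside the universe {1, 2, 3, 4},
# including the empty set (when i == j).
U = [1, 2, 3, 4]
intervals = [set(range(i, j)) for i in range(1, 6) for j in range(i, 6)]
print(vc_dimension(intervals, U))  # 2
```

The interval family shatters {1, 3}, but no three-element set, because an interval containing the smallest and largest element must also contain the middle one.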

VC dimension of a classification model

A binary classification model $f$ with some parameter vector $\theta$ is said to ''shatter'' a set of generally positioned data points $(x_1, x_2, \ldots, x_n)$ if, for every assignment of labels to those points, there exists a $\theta$ such that the model $f$ makes no errors when evaluating that set of data points.

The VC dimension of a model $f$ is the maximum number of points that can be arranged so that $f$ shatters them. More formally, it is the maximum cardinal $D$ such that there exists a generally positioned data point set of cardinality $D$ that can be shattered by $f$.
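The definition can also be written symbolically. The following is one possible formalisation, offered as a sketch: the input space $\mathcal X$, the label set $\{0,1\}$ and the notation $\mathrm{VCdim}(f)$ are assumptions introduced here, and the general-position requirement is left implicit.

```latex
\mathrm{VCdim}(f) \;=\; \max \bigl\{\, n \;:\;
  \exists\, x_1,\ldots,x_n \in \mathcal X \quad
  \forall\, (y_1,\ldots,y_n) \in \{0,1\}^n \quad
  \exists\, \theta \quad
  \forall\, i :\; f_\theta(x_i) = y_i \,\bigr\}
```

Under these assumptions the two definitions agree: taking the set family $\mathcal C_f = \{\, \{x \in \mathcal X : f_\theta(x) = 1\} : \theta \,\}$, the model $f$ shatters distinct points $x_1, \ldots, x_n$ exactly when $\mathcal C_f$ shatters the set $\{x_1, \ldots, x_n\}$.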

Examples

# $f$ is a constant classifier (with no parameters). Its VC dimension is 0 since it cannot shatter even a single point. In general, the VC dimension of a finite classification model, which can return at most $2^d$ different classifiers, is at most $d$ (this is an upper bound on the VC dimension; the Sauer–Shelah lemma gives a lower bound on the dimension).
# $f$ is a single-parametric threshold classifier on real numbers; i.e., for a certain threshold $\theta$, the classifier $f_\theta$ returns 1 if the input number is larger than $\theta$ and 0 otherwise. The VC dimension of $f$ is 1 because: (a) It can shatter a single point. For every point $x$, a classifier $f_\theta$ labels it as 0 if $\theta > x$ and labels it as 1 if $\theta < x$. (b) It cannot shatter any set of two points. For every set of two numbers, if the smaller is labeled 1, then the larger must also be labeled 1, so not all labelings are possible.
# $f$ is a single-parametric interval classifier on real numbers; i.e., for a certain parameter $\theta$, the classifier $f_\theta$ returns 1 if the input number is in the interval $[\theta, \theta+4]$ and 0 otherwise. The VC dimension of $f$ is 2 because: (a) It can shatter some sets of two points. E.g., for every set $\{x, x+2\}$, a classifier $f_\theta$ labels it as (0,0) if $\theta < x - 4$ or if $\theta > x + 2$, as (1,0) if $\theta \in [x-4, x-2)$, as (1,1) if $\theta \in [x-2, x]$, and as (0,1) if $\theta \in (x, x+2]$. (b) It cannot shatter any set of three points. For every set of three numbers, if the smallest and the largest are labeled 1, then the middle one must also be labeled 1, so not all labelings are possible.
# $f$ is a straight line ...
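The threshold and interval examples above can be checked numerically. This is a minimal sketch under stated assumptions: the parameter grid `THETAS` and the helper names are hypothetical choices for this illustration. Every labeling the search finds is realised by an actual parameter value, so a positive result is conclusive; a negative result only shows that no parameter on the grid works, and it is the arguments in (b) above that rule out all other parameters.

```python
def threshold(theta, x):
    """Threshold classifier: 1 if x is larger than theta, else 0."""
    return int(x > theta)


def interval(theta, x):
    """Interval classifier: 1 if x lies in [theta, theta + 4], else 0."""
    return int(theta <= x <= theta + 4)


def labelings(model, thetas, points):
    """All label vectors the model produces on `points` as theta ranges over `thetas`."""
    return {tuple(model(t, x) for x in points) for t in thetas}


def shatters(model, thetas, points):
    """True if every one of the 2^n labelings of `points` is produced."""
    return len(labelings(model, thetas, points)) == 2 ** len(points)


# Hypothetical finite grid of parameter values: -20.0, -19.5, ..., 20.0
THETAS = [k / 2 for k in range(-40, 41)]

print(shatters(threshold, THETAS, [0.0]))           # True
print(shatters(threshold, THETAS, [0.0, 1.0]))      # False: (1, 0) never appears
print(shatters(interval, THETAS, [0.0, 2.0]))       # True
print(shatters(interval, THETAS, [0.0, 2.0, 4.0]))  # False: (1, 0, 1) never appears
```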

