Vapnik–Chervonenkis Theory

Vapnik–Chervonenkis theory (also known as VC theory) was developed during 1960–1990 by Vladimir Vapnik and Alexey Chervonenkis. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view.

Introduction
VC theory covers at least four parts (as explained in ''The Nature of Statistical Learning Theory''):
* Theory of consistency of learning processes
** What are (necessary and sufficient) conditions for consistency of a learning process based on the empirical risk minimization principle?
* Nonasymptotic theory of the rate of convergence of learning processes
** How fast is the rate of convergence of the learning process?
* Theory of controlling the generalization ability of learning processes
** How can one control the rate of convergence (the generalization ability) of the learning process?
* Theory of constructing learning machines
** How can one construct algorithms that can control the generalization abilit ...

Vladimir Vapnik
Vladimir Naumovich Vapnik (Russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning and the co-inventor of the support-vector machine method and the support-vector clustering algorithm.

Early life and education
Vladimir Vapnik was born to a Jewish family in the Soviet Union. He received his master's degree in mathematics from the Uzbek State University, Samarkand, Uzbek SSR in 1958 and his Ph.D. in statistics at the Institute of Control Sciences, Moscow in 1964. He worked at this institute from 1961 to 1990 and became Head of the Computer Science Research Department.

Academic career
At the end of 1990, Vladimir Vapnik moved to the USA and joined the Adaptive Systems Research Department at AT&T Bell Labs in Holmdel, New Jersey. While at AT&T, Vapnik and his colleagues worked on the support-vector machine, on which he had also worked long before moving to the USA. They de ...

Central Limit Theorem
In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are summed, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applied to many problems involving other types of distributions. This theorem has seen many changes during the formal development of probability theory. Earlier versions of the theorem date back to 1811, but in its modern general form, this fundamental result in probability theory was precisely stated as late as 1920, thereby serving as a bridge between classical and modern probability theory. If X_1, X_2, \dots, X_n, \dots are random samples drawn from a population with overall mean \mu and finite variance and if \bar{X}_n is the sample mean of t ...
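The statement above can be checked numerically. The following is a minimal sketch, assuming the classical normalization \sqrt{n}(\bar{X}_n - \mu)/\sigma and uniform draws on [0, 1) (so \mu = 1/2 and \sigma^2 = 1/12); the function name `normalized_sample_means` and the parameter values are illustrative choices, not part of the entry.

```python
import random
import statistics


def normalized_sample_means(n, trials, seed=0):
    """Return `trials` values of sqrt(n) * (mean - mu) / sigma for uniform[0, 1) draws."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    mu = 0.5                           # mean of the uniform[0, 1) distribution
    sigma = (1.0 / 12.0) ** 0.5        # its standard deviation
    values = []
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        values.append((sample_mean - mu) * n ** 0.5 / sigma)
    return values


z = normalized_sample_means(n=50, trials=5000)
z_mean = statistics.mean(z)
z_std = statistics.pstdev(z)
# Fraction of normalized means inside [-1.96, 1.96]; about 0.95 for a standard normal.
inside = sum(abs(v) <= 1.96 for v in z) / len(z)
print(z_mean, z_std, inside)
```

Although the individual draws are uniform rather than normal, the normalized means should have mean near 0, standard deviation near 1, and roughly 95% of their mass within 1.96 of zero.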

Richard M
Richard is a male given name. It originates, via Old French, from Old Frankish and is a compound of the words descending from Proto-Germanic ''rīk-'' 'ruler, leader, king' and ''hardu-'' 'strong, brave, hardy', and it therefore means 'strong in rule'. Nicknames include "Richie", "Dick", "Dickon", "Dickie", "Rich", "Rick", "Rico", "Ricky", and more. Richard is a common English, German and French male name. It is also used in many more languages, particularly Germanic, such as Norwegian, Danish, Swedish, Icelandic, and Dutch, as well as other languages including Irish, Scottish, Welsh and Finnish. Richard is cognate with variants of the name in other European languages, such as the Swedish "Rickard", the Catalan "Ricard" and the Italian "Riccardo", among others (see comprehensive variant list below).

People named Richard
Multiple people with the same name
* Richard Andersen (other)
* Richard Anderson (other)
* Richard Cartwright (other)
* ...

Springer-Verlag
Springer Science+Business Media, commonly known as Springer, is a German multinational publishing company of books, e-books and peer-reviewed journals in science, humanities, technical and medical (STM) publishing. Originally founded in 1842 in Berlin, it expanded internationally in the 1960s, and through mergers in the 1990s and a sale to venture capitalists it fused with Wolters Kluwer and eventually became part of Springer Nature in 2015. Springer has major offices in Berlin, Heidelberg, Dordrecht, and New York City.

History
Julius Springer founded Springer-Verlag in Berlin in 1842 and his son Ferdinand Springer grew it from a small firm of 4 employees into Germany's then second largest academic publisher with 65 staff in 1872. In 1964, Springer expanded its business internationally, o ...

0-1 Loss
In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event. An optimization problem seeks to minimize a loss function. An objective function is either a loss function or its opposite (in specific domains, variously called a reward function, a profit function, a utility function, a fitness function, etc.), in which case it is to be maximized. The loss function could include terms from several levels of the hierarchy. In statistics, typically a loss function is used for parameter estimation, and the event in question is some function of the difference between estimated and true values for an instance of data. The concept, as old as Laplace, was reintroduced in statistics by Abraham Wald in the middle of the 20th century. In the context of economics, for example, this is ...

Shattered Set
The concept of shattered sets plays an important role in Vapnik–Chervonenkis theory, also known as VC-theory. Shattering and VC-theory are used in the study of empirical processes as well as in statistical computational learning theory.

Definition
Suppose ''A'' is a set and ''C'' is a class of sets. The class ''C'' shatters the set ''A'' if for each subset ''a'' of ''A'', there is some element ''c'' of ''C'' such that
: a = c \cap A.
Equivalently, ''C'' shatters ''A'' when their intersection is equal to ''A'''s power set: ''P''(''A'') = \{ c \cap A \mid c \in C \}. We employ the letter ''C'' to refer to a "class" or "collection" of sets, as in a Vapnik–Chervonenkis class (VC-class). The set ''A'' is often assumed to be finite because, in empirical processes, we are interested in the shattering of finite sets of data points.

Example
We will show that the class of all discs in the plane ...
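For finite sets, the definition of shattering can be tested directly by comparing the traces c ∩ A against the power set of A. The following is a minimal sketch; the helper names `power_set` and `shatters`, and the class of integer intervals used as the example, are illustrative assumptions rather than part of the entry.

```python
from itertools import chain, combinations


def power_set(A):
    """Return the set of all subsets of A, each as a frozenset."""
    items = list(A)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def shatters(C, A):
    """True if the class C shatters A, i.e. {c & A : c in C} equals P(A)."""
    A = frozenset(A)
    traces = {frozenset(c) & A for c in C}
    return traces == power_set(A)


# Hypothetical class: every integer interval {lo, ..., hi} inside 0..4, plus the empty set.
intervals = [frozenset(range(lo, hi + 1)) for lo in range(5) for hi in range(lo, 5)]
intervals.append(frozenset())

two_points = shatters(intervals, {1, 3})       # every subset of {1, 3} is a trace
three_points = shatters(intervals, {0, 2, 4})  # {0, 4} cannot be cut out without 2
print(two_points, three_points)
```

In this example the intervals shatter the two-point set but not the three-point set, because no interval contains 0 and 4 while excluding 2.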

Machine Learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, agriculture, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks. A subset of machine learning is closely related to computational statistics, which focuses on making predicti ...

Hypograph (mathematics)
In mathematics, the hypograph or subgraph of a function f : \mathbb{R}^n \rightarrow \mathbb{R} is the set of points lying on or below its graph. A related definition is that of such a function's epigraph, which is the set of points on or above the function's graph. The domain (rather than the codomain) of the function is not particularly important for this definition; it can be an arbitrary set instead of \mathbb{R}^n.

Definition
The definition of the hypograph was inspired by that of the graph of a function, where the graph of f : X \to Y is defined to be the set
: \operatorname{graph} f := \left\{ (x, y) \in X \times Y : y = f(x) \right\}.
The hypograph or subgraph of a function f : X \to [-\infty, \infty] valued in the extended real numbers [-\infty, \infty] = \mathbb{R} \cup \{\pm\infty\} is the set
: \operatorname{hyp} f = \left\{ (x, r) \in X \times \mathbb{R} : r \leq f(x) \right\} = \left[ f^{-1}(\infty) \times \mathbb{R} \right] \cup \bigcup_{x \in f^{-1}(\mathbb{R})} \{x\} \times (-\infty, f(x)].
Similarly, the set of points on or above the function is its epigraph. The strict hypograph is the hypograph with the graph removed:
: \operatorname{hyp}_S f ...

Sauer–Shelah Lemma
In combinatorial mathematics and extremal set theory, the Sauer–Shelah lemma states that every family of sets with small VC dimension consists of a small number of sets. It is named after Norbert Sauer and Saharon Shelah, who published it independently of each other in 1972. The same result was also published slightly earlier, and again independently, by Vladimir Vapnik and Alexey Chervonenkis, after whom the VC dimension is named. In his paper containing the lemma, Shelah gives credit also to Micha Perles, and for this reason the lemma has also been called the Perles–Sauer–Shelah lemma. Buzaglo et al. call this lemma "one of the most fundamental results on VC-dimension", and it has applications in many areas. Sauer's motivation was in the combinatorics of set systems, while Shelah's was in model theory and that of Vapnik and Chervonenkis was in statistics. It has also been applied in discrete geometry and graph theory.

Definitions and statement
If \textstyle \mathcal=\ ...

Jensen's Inequality
In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906, building on an earlier proof of the same inequality for doubly-differentiable functions by Otto Hölder in 1889. Given its generality, the inequality appears in many forms depending on the context, some of which are presented below. In its simplest form the inequality states that the convex transformation of a mean is less than or equal to the mean applied after convex transformation; it is a simple corollary that the opposite is true of concave transformations. Jensen's inequality generalizes the statement that the secant line of a convex function lies ''above'' the graph of the function, which is Jensen's inequality for two points: the secant line consists of weighted means of the convex function (for ''t'' ∈ [0,1]),
: t f(x_1) + (1-t) f(x_2),
while t ...
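The simplest form of the inequality can be checked on a finite weighted mean. The following is a minimal sketch; the function name `jensen_gap` and the sample points and weights are illustrative assumptions. The gap E[φ(X)] − φ(E[X]) should be non-negative for a convex φ and non-positive for a concave φ.

```python
import math


def jensen_gap(phi, xs, weights):
    """Return sum_i w_i * phi(x_i) - phi(sum_i w_i * x_i) for weights summing to 1."""
    mean_x = sum(w * x for w, x in zip(weights, xs))
    mean_phi = sum(w * phi(x) for w, x in zip(weights, xs))
    return mean_phi - phi(mean_x)


weights = [0.2, 0.3, 0.5]

# Convex case: exp. The gap is non-negative.
gap_convex = jensen_gap(math.exp, [0.0, 1.0, 2.0], weights)

# Concave case: log. The gap is non-positive.
gap_concave = jensen_gap(math.log, [1.0, 2.0, 4.0], weights)

# Two-point (secant line) case with t = 0.25, f(x) = x**2, x_1 = -1, x_2 = 3.
t = 0.25
gap_secant = jensen_gap(lambda x: x * x, [-1.0, 3.0], [t, 1 - t])

print(gap_convex, gap_concave, gap_secant)
```

The two-point case is the secant-line statement: t f(x_1) + (1 − t) f(x_2) lies on or above f(t x_1 + (1 − t) x_2).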

Hoeffding's Inequality
In probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount. Hoeffding's inequality was proven by Wassily Hoeffding in 1963. Hoeffding's inequality is a special case of the Azuma–Hoeffding inequality and McDiarmid's inequality. It is similar to the Chernoff bound, but tends to be less sharp, in particular when the variance of the random variables is small. It is similar to, but incomparable with, one of Bernstein's inequalities.

Statement
Let X_1, \dots, X_n be independent random variables such that a_i \leq X_i \leq b_i almost surely. Consider the sum of these random variables,
: S_n = X_1 + \cdots + X_n.
Then Hoeffding's theorem states that, for all t > 0,
: \operatorname{P}\left( S_n - \mathrm{E}\left[ S_n \right] \geq t \right) \leq \exp\left( -\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2} \right)
: \operatorname{P}\left( \left| S_n - \mathrm{E}\left[ S_n \right] \right| \geq t \right) \leq 2 ...
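The one-sided bound can be compared with a simulated tail probability. The following is a minimal sketch, assuming the standard form of the bound, exp(−2t² / Σ(b_i − a_i)²), and uniform variables on [0, 1) (so a_i = 0, b_i = 1 and E[S_n] = n/2); the function names and parameter values are illustrative.

```python
import math
import random


def hoeffding_bound(t, ranges):
    """One-sided Hoeffding bound exp(-2 t^2 / sum (b_i - a_i)^2) for ranges [(a_i, b_i)]."""
    denom = sum((b - a) ** 2 for a, b in ranges)
    return math.exp(-2.0 * t * t / denom)


def empirical_tail(n, t, trials, seed=0):
    """Estimate P(S_n - E[S_n] >= t) for a sum of n uniform[0, 1) variables."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if s - n * 0.5 >= t:
            hits += 1
    return hits / trials


n, t = 100, 10.0
bound = hoeffding_bound(t, [(0.0, 1.0)] * n)   # exp(-2 * 100 / 100) = exp(-2)
observed = empirical_tail(n, t, trials=20000)
print(bound, observed)
```

The observed frequency is far below the bound here, which illustrates the remark above that Hoeffding's inequality tends to be loose when the variance is small relative to the range.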

Rademacher Complexity
In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of real-valued functions with respect to a probability distribution.

Definitions
Rademacher complexity of a set
Given a set A \subseteq \mathbb{R}^m, the Rademacher complexity of ''A'' is defined as follows:
: \operatorname{Rad}(A) := \frac{1}{m} \mathbb{E}_\sigma \left[ \sup_{a \in A} \sum_{i=1}^m \sigma_i a_i \right]
where \sigma_1, \sigma_2, \dots, \sigma_m are independent random variables drawn from the Rademacher distribution, i.e. \Pr(\sigma_i = +1) = \Pr(\sigma_i = -1) = 1/2 for i=1,2,\dots,m, and a=(a_1, \ldots, a_m). Some authors take the absolute value of the sum before taking the supremum, but if A is symmetric this makes no difference.

Rademacher complexity of a function class
Let S=(z_1, z_2, \dots, z_m) \in Z^m be a sample of points and consider a function class \mathcal{F} of real-valued functions over Z^m. Then, ...
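For a finite set A and small m, the Rademacher complexity of a set can be computed exactly by averaging over all 2^m sign vectors. The following is a minimal sketch, assuming the definition Rad(A) = (1/m) E_σ[sup_{a ∈ A} Σ σ_i a_i]; the function name `rademacher_complexity` and the example sets are illustrative.

```python
from itertools import product


def rademacher_complexity(A):
    """Exact Rad(A) for a finite, non-empty set A of equal-length real vectors."""
    vectors = [tuple(a) for a in A]
    m = len(vectors[0])
    total = 0.0
    # Each sign vector sigma in {-1, +1}^m has probability 2^-m.
    for sigma in product((-1, 1), repeat=m):
        total += max(sum(s * x for s, x in zip(sigma, a)) for a in vectors)
    return total / (2 ** m) / m


# A single vector cannot correlate with random signs on average.
rad_single = rademacher_complexity([(3.0, -2.0)])

# Two antipodal vectors: the supremum is 2 on two sign vectors and 0 on the other two.
rad_pair = rademacher_complexity([(1.0, 1.0), (-1.0, -1.0)])

# All of {-1, +1}^2: some vector always matches sigma exactly.
rad_full = rademacher_complexity(list(product((-1.0, 1.0), repeat=2)))

print(rad_single, rad_pair, rad_full)
```

The three values increase with the richness of the set, from 0 for a single vector to 1 for the full sign cube. The enumeration costs 2^m evaluations, so for larger m one would typically sample sign vectors at random instead.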