Backfitting Algorithm

	Backfitting Algorithm In statistics, the backfitting algorithm is a simple iterative procedure used to fit a generalized additive model. It was introduced in 1985 by Leo Breiman and Jerome Friedman along with generalized additive models. In most cases, the backfitting algorithm is equivalent to the Gauss–Seidel method, an algorithm used for solving a certain linear system of equations. Algorithm Additive models are a class of non-parametric regression models of the form: : Y_i = \alpha + \sum_^p f_j(X_) + \epsilon_i where each X_1, X_2, \ldots, X_p is a variable in our p-dimensional predictor X, and Y is our outcome variable. \epsilon represents our inherent error, which is assumed to have mean zero. The f_j represent unspecified smooth functions of a single X_j. Given the flexibility in the f_j, we typically do not have a unique solution: \alpha is left unidentifiable as one can add any constants to any of the f_j and subtract this value from \alpha. It is common to rectify this by constr ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Statistics Statistics (from German: '' Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.Dodge, Y. (2006) ''The Oxford Dictionary of Statistical Terms'', Oxford University Press. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An ex ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Generalized Additive Model In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear response variable depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models. They can be interpreted as the discriminative generalization of the naive Bayes generative model. The model relates a univariate response variable, ''Y'', to some predictor variables, ''x''''i''. An exponential family distribution is specified for Y (for example normal, binomial or Poisson distributions) along with a link function ''g'' (for example the identity or log functions) relating the expected value of ''Y'' to the predictor variables via a structure such as : g(\operatorname(Y))=\beta_0 + f_1(x_1) + f_2(x_2)+ \cdots + f_m(x_m).\,\! The functions ''f''''i'' may be functions w ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences. Breiman's work helped to bridge the gap between statistics and computer science, particularly in the field of machine learning. His most important contributions were his work on classification and regression trees and ensembles of trees fit to bootstrap samples. Bootstrap aggregation was given the name ''bagging'' by Breiman. Another of Breiman's ensemble approaches is the random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th .... See also * Shannon–McMillan–Breiman theorem Further reading * Leo Breimaobituary from t ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Jerome H Jerome (; la, Eusebius Sophronius Hieronymus; grc-gre, Εὐσέβιος Σωφρόνιος Ἱερώνυμος; – 30 September 420), also known as Jerome of Stridon, was a Christian priest, confessor, theologian, and historian; he is commonly known as Saint Jerome. Jerome was born at Stridon, a village near Emona on the border of Dalmatia and Pannonia. He is best known for his translation of the Bible into Latin (the translation that became known as the Vulgate) and his commentaries on the whole Bible. Jerome attempted to create a translation of the Old Testament based on a Hebrew version, rather than the Septuagint, as Latin Bible translations used to be performed before him. His list of writings is extensive, and beside his biblical works, he wrote polemical and historical essays, always from a theologian's perspective. Jerome was known for his teachings on Christian moral life, especially to those living in cosmopolitan centers such as Rome. In many cases, he ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Linear System Of Equations In mathematics, a system of linear equations (or linear system) is a collection of one or more linear equations involving the same variables. For example, :\begin 3x+2y-z=1\\ 2x-2y+4z=-2\\ -x+\fracy-z=0 \end is a system of three equations in the three variables . A solution to a linear system is an assignment of values to the variables such that all the equations are simultaneously satisfied. A solution to the system above is given by the ordered triple :(x,y,z)=(1,-2,-2), since it makes all three equations valid. The word "system" indicates that the equations are to be considered collectively, rather than individually. In mathematics, the theory of linear systems is the basis and a fundamental part of linear algebra, a subject which is used in most parts of modern mathematics. Computational algorithms for finding the solutions are an important part of numerical linear algebra, and play a prominent role in engineering, physics, chemistry, computer science, and economics. A sy ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Nonparametric Regression Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data. That is, no parametric form is assumed for the relationship between predictors and dependent variable. Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates. Definition In nonparametric regression, we have random variables X and Y and assume the following relationship: : \mathbb \mid X=x= m(x), where m(x) is some deterministic function. Linear regression is a restricted case of nonparametric regression where m(x) is assumed to be affine. Some authors use a slightly stronger assumption of additive noise: : Y = m(X) + U, where the random variable U is the `noise term', with mean 0. Without the assumption that m belongs to a specific parametric family of functions it is impos ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Smoothing Spline Smoothing splines are function estimates, \hat f(x), obtained from a set of noisy observations y_i of the target f(x_i), in order to balance a measure of goodness of fit of \hat f(x_i) to y_i with a derivative based measure of the smoothness of \hat f(x). They provide a means for smoothing noisy x_i, y_i data. The most familiar example is the cubic smoothing spline, but there are many other possibilities, including for the case where x is a vector quantity. Cubic spline definition Let \ be a set of observations, modeled by the relation Y_i = f(x_i) + \epsilon_i where the \epsilon_i are independent, zero mean random variables (usually assumed to have constant variance). The cubic smoothing spline estimate \hat f of the function f is defined to be the minimizer (over the class of twice differentiable functions) of : \sum_^n \^2 + \lambda \int \hat f''(x)^2 \,dx. Remarks: * \lambda \ge 0 is a smoothing parameter, controlling the trade-off between fidelity to the data and roughnes ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Polynomial Regression In statistics, polynomial regression is a form of regression analysis in which the relationship between the independent variable ''x'' and the dependent variable ''y'' is modelled as an ''n''th degree polynomial in ''x''. Polynomial regression fits a nonlinear relationship between the value of ''x'' and the corresponding conditional mean of ''y'', denoted E(''y'' , ''x''). Although ''polynomial regression'' fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function E(''y'' , ''x'') is linear in the unknown parameters that are estimated from the data. For this reason, polynomial regression is considered to be a special case of multiple linear regression. The explanatory (independent) variables resulting from the polynomial expansion of the "baseline" variables are known as higher-degree terms. Such variables are also used in classification settings. History Polynomial regression models are usu ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Kernel Smoothing A kernel smoother is a statistical technique to estimate a real valued function f: \mathbb^p \to \mathbb as the weighted average of neighboring observed data. The weight is defined by the ''kernel'', such that closer points are given higher weights. The estimated function is smooth, and the level of smoothness is set by a single parameter. Kernel smoothing is a type of weighted moving average. Definitions Let K_(X_0 ,X) be a kernel defined by :K_(X_0 ,X) = D\left( \frac \right) where: * X,X_0 \in \mathbb^p * \left\, \cdot \right\, is the Euclidean norm * h_\lambda (X_0) is a parameter (kernel radius) * ''D''(''t'') is typically a positive real valued function, whose value is decreasing (or not increasing) for the increasing distance between the ''X'' and ''X''0. Popular kernels used for smoothing include parabolic (Epanechnikov), Tricube, and Gaussian kernels. Let Y(X):\mathbb^p \to \mathbb be a continuous function of ''X''. For each X_0 \in \mathbb^p, the Nadaraya-Watson ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Trevor Hastie Trevor John Hastie (born 27 June 1953) is an American statistician and computer scientist. He is currently serving as the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for his contributions to applied statistics, especially in the field of machine learning, data mining, and bioinformatics. He has authored several popular books in statistical learning, including ''The Elements of Statistical Learning: Data Mining, Inference, and Prediction''. Hastie has been listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge. Education and career Hastie was born on 27 June 1953 in South Africa. He received his B.S. in statistics from the Rhodes University in 1976 and master's degree from University of Cape Town in 1979. Hastie joined the doctoral program at Stanford University in 1980 and received his Ph.D. in 1984 under the supervision of Werner Stuetzle. His dissertation was "Principal Curves ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Robert Tibshirani Robert Tibshirani (born July 10, 1956) is a professor in the Departments of Statistics and Biomedical Data Science at Stanford University. He was a professor at the University of Toronto from 1985 to 1998. In his work, he develops statistical tools for the analysis of complex datasets, most recently in genomics and proteomics. His most well-known contributions are the Lasso method, which proposed the use of L1 penalization in regression and related problems, and Significance Analysis of Microarrays. Education and early life Tibshirani was born on 10 July 1956 in Niagara Falls, Ontario, Canada. He received his B. Math. in statistics and computer science from the University of Waterloo in 1979 and a Master's degree in Statistics from University of Toronto in 1980. Tibshirani joined the doctoral program at Stanford University in 1981 and received his Ph.D. in 1984 under the supervision of Bradley Efron. His dissertation was entitled "Local likelihood estimation". Honors and ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]