Stan is a probabilistic programming language for statistical inference written in C++ (Stan Development Team, 2015, Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0). The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function. Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method. Stan was created by a development team consisting of 34 members that includes Andrew Gelman, Bob Carpenter, Matt Hoffman, and Daniel Lee.
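To illustrate what "an imperative program calculating the log probability density function" means, here is a minimal Python sketch. It is not Stan code and not Stan's implementation. The toy model (a normal prior on `mu` and a unit-variance normal likelihood), the function names `normal_lpdf` and `log_density`, and the data are all assumptions made for illustration.

```python
import math

def normal_lpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def log_density(mu, y):
    """Accumulate the joint log density of a toy model, statement by statement.

    Hypothetical model (an assumption for illustration):
        mu   ~ Normal(0, 10)   # prior
        y[i] ~ Normal(mu, 1)   # likelihood
    """
    target = 0.0
    target += normal_lpdf(mu, 0.0, 10.0)      # prior contribution
    for y_i in y:
        target += normal_lpdf(y_i, mu, 1.0)   # one term per observation
    return target

y = [1.2, 0.8, 1.9, 1.4]  # made-up observations
```

Evaluating `log_density(mu, y)` at different values of `mu` shows which parameter values are more plausible given the data. Inference algorithms need only this function and, for the gradient-based ones, its derivative.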


Interfaces

The Stan language itself can be accessed through several interfaces:
* CmdStan – a command-line executable for the shell,
* CmdStanR and rstan – R software libraries,
* CmdStanPy and PyStan – libraries for the Python programming language,
* MatlabStan – integration with the MATLAB numerical computing environment,
* Stan.jl – integration with the Julia programming language,
* StataStan – integration with Stata.

In addition, higher-level interfaces are provided with packages using Stan as backend, primarily in the R language:
* rstanarm provides a drop-in replacement for frequentist models provided by base R and lme4 using the R formula syntax;
* brms provides a wide array of linear and nonlinear models using the R formula syntax;
* prophet provides automated procedures for time series forecasting.


Algorithms

Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.
* MCMC algorithms:
** Hamiltonian Monte Carlo (HMC)
** No-U-Turn sampler (NUTS), a variant of HMC and Stan's default MCMC engine
* Variational inference algorithms:
** Automatic Differentiation Variational Inference
* Optimization algorithms:
** Limited-memory BFGS (Stan's default optimization algorithm)
** Broyden–Fletcher–Goldfarb–Shanno algorithm
** Laplace's method for classical standard error estimates and approximate Bayesian posteriors
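As a rough illustration of how the gradient enters Hamiltonian Monte Carlo, the following Python sketch implements plain HMC with a fixed step size and a fixed number of leapfrog steps on a one-dimensional standard normal target. It is not NUTS and not Stan's implementation. The target density, the function names, and the tuning values (`step_size=0.2`, `n_steps=10`) are assumptions chosen for illustration.

```python
import math
import random

def log_density(q):
    """Unnormalized log density of the target; here a standard normal."""
    return -0.5 * q * q

def grad_log_density(q):
    """Gradient of log_density with respect to q."""
    return -q

def leapfrog(q, p, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p = p + 0.5 * step_size * grad_log_density(q)    # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                         # full step for position
        if i < n_steps - 1:
            p = p + step_size * grad_log_density(q)   # full step for momentum
    p = p + 0.5 * step_size * grad_log_density(q)    # final half step for momentum
    return q, p

def hmc_sample(n_samples, step_size=0.2, n_steps=10, seed=0):
    """Draw samples with plain HMC (fixed trajectory length, no adaptation)."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                       # resample momentum
        h_current = -log_density(q) + 0.5 * p * p     # potential + kinetic energy
        q_new, p_new = leapfrog(q, p, step_size, n_steps)
        h_new = -log_density(q_new) + 0.5 * p_new * p_new
        # Metropolis step corrects for the integrator's discretization error
        if rng.random() < math.exp(min(0.0, h_current - h_new)):
            q = q_new
        samples.append(q)
    return samples

samples = hmc_sample(5000)
```

In this sketch the step size and trajectory length must be chosen by hand. NUTS removes the need to fix the number of leapfrog steps in advance.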


Automatic differentiation

Stan implements reverse-mode automatic differentiation to calculate gradients of the model, which are required by HMC, NUTS, L-BFGS, BFGS, and variational inference. The automatic differentiation within Stan can be used outside of the probabilistic programming language.
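The idea behind reverse-mode automatic differentiation can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Stan's C++ implementation: the `Var` class, the `sin` helper, and the `backward` function are hypothetical names, and only addition, multiplication, and sine are supported.

```python
import math

class Var:
    """A scalar node in a computation graph that records how to backpropagate."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_partial_derivative)
        self.grad = 0.0         # adjoint, filled in by backward()

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sin(x):
    """Sine of a Var, recording cos(x) as the local derivative."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))

def backward(output):
    """Propagate adjoints from the output to the inputs in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # parents are appended before their children

    visit(output)
    output.grad = 1.0
    for node in reversed(order):                  # children before parents
        for parent, local in node.parents:
            parent.grad += local * node.grad      # chain rule, accumulated

x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)  # f(x, y) = x*y + sin(x)
backward(f)
# x.grad now holds y + cos(x) and y.grad holds x, from a single reverse pass
```

A single backward pass yields the partial derivatives with respect to every input, which is why reverse mode suits models with many parameters and one scalar log density.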


Usage

Stan is used in fields including social science, pharmaceutical statistics, market research, and medical imaging.


Further reading

* Gelman, Andrew, Daniel Lee, and Jiqiang Guo (2015). Stan: A probabilistic programming language for Bayesian inference and optimization, Journal of Educational and Behavioral Statistics.
* Hoffman, Matthew D., Bob Carpenter, and Andrew Gelman (2012). Stan, scalable software for Bayesian modeling, Proceedings of the NIPS Workshop on Probabilistic Programming.


External links


Stan web site

Stan source, a Git repository hosted on GitHub