Sigmoid Function

A sigmoid function is a mathematical function having a characteristic "S"-shaped curve or sigmoid curve. A common example of a sigmoid function is the logistic function shown in the first figure and defined by the formula:

:S(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{e^x + 1} = 1 - S(-x).

Other standard sigmoid functions are given in the Examples section. In some fields, most notably in the context of artificial neural networks, the term "sigmoid function" is used as an alias for the logistic function.

Special cases of the sigmoid function include the Gompertz curve (used in modeling systems that saturate at large values of x) and the ogee curve (used in the spillway of some dams). Sigmoid functions have the domain of all real numbers, with a return (response) value that is commonly monotonically increasing but could be decreasing. Sigmoid functions most often show a return value (y axis) in the range 0 to 1. Another commonly used range is from −1 to 1.

A wide variety of sigmoid functions, including the logistic and hyperbolic tangent functions, have been used as the activation function of artificial neurons. Sigmoid curves are also common in statistics as cumulative distribution functions (which go from 0 to 1), such as the integrals of the logistic density, the normal density, and Student's ''t'' probability density functions. The logistic sigmoid function is invertible, and its inverse is the logit function.
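
The logistic formula, its symmetry S(x) = 1 − S(−x), and its inverse can be illustrated with a minimal Python sketch. The function names `logistic` and `logit` are chosen here for illustration, and the branch on the sign of x is one common way to avoid overflow in the exponential, not the only one.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid S(x) = 1 / (1 + e^(-x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the equivalent form e^x / (e^x + 1) so exp() stays small.
    z = math.exp(x)
    return z / (1.0 + z)


def logit(p: float) -> float:
    """Inverse of the logistic function, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


for x in (-3.0, -0.5, 0.0, 2.0):
    s = logistic(x)
    # Columns: x, S(x), 1 - S(-x) (should match S(x)), logit(S(x)) (should match x).
    print(x, s, 1.0 - logistic(-x), logit(s))
```

The two algebraic forms in the formula above are equal for every real x; the code merely picks whichever keeps the argument of `exp` non-positive.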


Definition

A sigmoid function is a bounded, differentiable, real function that is defined for all real input values and has a non-negative derivative at each point and exactly one inflection point. A sigmoid "function" and a sigmoid "curve" refer to the same object.
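
Two of the conditions in this definition, a non-negative derivative and exactly one inflection point, can be checked approximately on a grid. The sketch below is a heuristic, not a proof: the helper name `is_sigmoid_like`, the interval [−10, 10], the grid size, and the tolerance are all assumptions made for illustration, and boundedness is not tested because a finite grid cannot establish it.

```python
import math


def is_sigmoid_like(f, lo=-10.0, hi=10.0, n=2001, tol=1e-12):
    """Grid check: f is non-decreasing and its curvature changes sign exactly once."""
    h = (hi - lo) / (n - 1)
    ys = [f(lo + i * h) for i in range(n)]
    # Non-negative derivative, approximated by non-decreasing samples.
    nondecreasing = all(b >= a - tol for a, b in zip(ys, ys[1:]))
    # Central second differences approximate f''(x) * h^2 at interior points.
    second = [ys[i - 1] - 2.0 * ys[i] + ys[i + 1] for i in range(1, n - 1)]
    # Values within the tolerance are skipped so rounding noise is not counted.
    signs = [1 if s > 0 else -1 for s in second if abs(s) > tol]
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return nondecreasing and changes == 1


print(is_sigmoid_like(math.tanh))
print(is_sigmoid_like(math.atan))
print(is_sigmoid_like(math.sin))
```

A function that passes this check on one interval may still fail the definition elsewhere, so the result is only suggestive.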


Properties

In general, a sigmoid function is monotonic, and has a first derivative which is bell shaped. Conversely, the integral of any continuous, non-negative, bell-shaped function (with one local maximum and no local minimum, unless degenerate) will be sigmoidal. Thus the cumulative distribution functions for many common probability distributions are sigmoidal. One such example is the error function, which is related to the cumulative distribution function of a normal distribution; another is the arctan function, which is related to the cumulative distribution function of a Cauchy distribution.

A sigmoid function is constrained by a pair of horizontal asymptotes as x \rightarrow \pm \infty.

A sigmoid function is convex for values less than a particular point, and it is concave for values greater than that point: in many of the examples here, that point is 0.
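
The link between bell-shaped densities and sigmoidal cumulative distribution functions can be made concrete for the two cases named above. The sketch below assumes the standard (zero location, unit scale) normal and Cauchy distributions, for which the cumulative distribution functions are (1 + erf(x/√2))/2 and 1/2 + arctan(x)/π respectively. It compares these closed forms with a simple trapezoidal integral of the density. All function names are illustrative.

```python
import math


def normal_pdf(t):
    """Standard normal density, a bell-shaped function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cauchy_pdf(t):
    """Standard Cauchy density, also bell-shaped but with heavier tails."""
    return 1.0 / (math.pi * (1.0 + t * t))


def normal_cdf(x):
    """Standard normal CDF written through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cauchy_cdf(x):
    """Standard Cauchy CDF written through the arctangent."""
    return 0.5 + math.atan(x) / math.pi


def integrate(density, a, b, n=20000):
    """Composite trapezoidal rule for the integral of density over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (density(a) + density(b))
    total += sum(density(a + i * h) for i in range(1, n))
    return total * h


# Integrating each density should reproduce the matching sigmoidal CDF.
print(integrate(normal_pdf, -10.0, 1.0), normal_cdf(1.0))
print(integrate(cauchy_pdf, 0.0, 1.0), cauchy_cdf(1.0) - cauchy_cdf(0.0))
```

The lower limit −10 stands in for −∞ in the normal case, where the neglected tail is negligible at double precision; the Cauchy tails decay too slowly for that shortcut, so the comparison uses a finite interval instead.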


Examples

* Logistic function: f(x) = \frac{1}{1 + e^{-x}}
* Hyperbolic tangent (shifted and scaled version of the logistic function, above): f(x) = \tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}}
* Arctangent function: f(x) = \arctan x
* Gudermannian function: f(x) = \operatorname{gd}(x) = \int_0^x \frac{dt}{\cosh t} = 2\arctan\left(\tanh\left(\frac{x}{2}\right)\right)
* Error function: f(x) = \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt
* Generalised logistic function: f(x) = \left(1 + e^{-x} \right)^{-\alpha}, \quad \alpha > 0
* Smoothstep function: f(x) = \begin{cases} \left( \int_0^1 \left(1 - u^2\right)^N du \right)^{-1} \int_0^x \left(1 - u^2\right)^N du, & |x| \le 1 \\ \operatorname{sgn}(x) & |x| \ge 1 \end{cases} \quad N \in \mathbb{Z}, \ N \ge 1
* Some algebraic functions, for example f(x) = \frac{x}{\sqrt{1 + x^2}}
* and in a more general form f(x) = \frac{x}{\left(1 + |x|^k\right)^{1/k}}
* Up to shifts and scaling, many sigmoids are special cases of f(x) = \varphi(\varphi(x, \beta), \alpha), where \varphi(x, \lambda) = \begin{cases} (1 - \lambda x)^{1/\lambda} & \lambda \ne 0 \\ e^{-x} & \lambda = 0 \end{cases} is the inverse of the negative Box–Cox transformation, and \alpha < 1 and \beta < 1 are shape parameters.
* Smooth interpolation normalized to (-1,1), where n is the slope at zero: \begin{align} f(x) &= \begin{cases} \frac{2}{1 + e^{-2n \frac{x}{1 - x^2}}} - 1, \ n = 2 & |x| < 1 \\ \operatorname{sgn}(x) & |x| \ge 1 \end{cases} \\ &= \begin{cases} \tanh\left(n \frac{x}{1 - x^2}\right), \ n = 2 & |x| < 1 \\ \operatorname{sgn}(x) & |x| \ge 1 \end{cases} \end{align} using the hyperbolic tangent mentioned above.
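
Several of these examples are easy to evaluate directly, which also allows some of the stated relationships to be checked numerically. The sketch below is illustrative only. It assumes the standard forms of each function: tanh x = 2S(2x) − 1 for the logistic function S, gd(x) = 2 arctan(tanh(x/2)), the algebraic family x / (1 + |x|^k)^(1/k), and the tanh form of the smooth interpolation. The function names and default parameter values are chosen for the example.

```python
import math


def logistic(x):
    """Logistic function 1 / (1 + e^(-x)); fine for moderate |x|."""
    return 1.0 / (1.0 + math.exp(-x))


def gudermannian(x):
    """Gudermannian function in the form 2 * arctan(tanh(x / 2))."""
    return 2.0 * math.atan(math.tanh(x / 2.0))


def algebraic(x, k=2.0):
    """x / (1 + |x|^k)^(1/k); k = 2 gives x / sqrt(1 + x^2)."""
    return x / (1.0 + abs(x) ** k) ** (1.0 / k)


def smooth_interpolation(x, n=2.0):
    """tanh(n * x / (1 - x^2)) inside (-1, 1) and sign(x) outside."""
    if abs(x) >= 1.0:
        return math.copysign(1.0, x)
    return math.tanh(n * x / (1.0 - x * x))


for x in (-0.9, -0.3, 0.0, 0.3, 0.9):
    # tanh written directly and as a shifted, scaled logistic function.
    print(x, math.tanh(x), 2.0 * logistic(2.0 * x) - 1.0,
          gudermannian(x), algebraic(x), smooth_interpolation(x))
```

The functions have different ranges (for instance, the Gudermannian and arctangent functions approach ±π/2), so comparing their shapes usually requires rescaling each one to a common range first.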


Applications

Many natural processes, such as those of complex system learning curves, exhibit a progression from small beginnings that accelerates and approaches a climax over time. When a specific mathematical model is lacking, a sigmoid function is often used.

The van Genuchten–Gupta model is based on an inverted S-curve and applied to the response of crop yield to soil salinity. Examples of the application of the logistic S-curve to the response of crop yield (wheat) to both the soil salinity and depth to water table in the soil are shown in modeling crop response in agriculture.

In artificial neural networks, sometimes non-smooth functions are used instead for efficiency; these are known as hard sigmoids.

In audio signal processing, sigmoid functions are used as waveshaper transfer functions to emulate the sound of analog circuitry clipping.

In biochemistry and pharmacology, the Hill and Hill–Langmuir equations are sigmoid functions.

In computer graphics and real-time rendering, some of the sigmoid functions are used to blend colors or geometry between two values, smoothly and without visible seams or discontinuities.

Titration curves between strong acids and strong bases have a sigmoid shape due to the logarithmic nature of the pH scale.

The logistic function can be calculated efficiently by utilizing type III Unums.
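
As one worked illustration of the biochemistry case, the Hill equation can be rewritten as a logistic function of the logarithm of concentration. The sketch below assumes the usual form θ = L^n / (K^n + L^n), where L is the ligand concentration, K the concentration giving half occupation, and n the Hill coefficient. The parameter names `conc`, `k_half`, and `n` are labels chosen for this example.

```python
import math


def hill(conc, k_half, n):
    """Hill equation: fraction bound at ligand concentration conc (> 0)."""
    return conc ** n / (k_half ** n + conc ** n)


def hill_via_logistic(conc, k_half, n):
    """Same quantity as a logistic function of log-concentration.

    Dividing numerator and denominator of the Hill equation by conc**n gives
    1 / (1 + (k_half / conc)**n) = 1 / (1 + exp(-n * (ln conc - ln k_half))).
    """
    return 1.0 / (1.0 + math.exp(-n * (math.log(conc) - math.log(k_half))))


for conc in (0.1, 1.0, 2.0, 5.0, 50.0):
    # The two columns should agree up to floating-point rounding.
    print(conc, hill(conc, 2.0, 3.0), hill_via_logistic(conc, 2.0, 3.0))
```

Plotted against concentration on a logarithmic axis, the curve is therefore a shifted and horizontally scaled logistic function, with n controlling the steepness.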


See also

* Heaviside step function
* Logistic regression
* Logit
* Softplus function
* Soboleva modified hyperbolic tangent
* Softmax function
* Swish function
* Weibull distribution
* Fermi–Dirac statistics


Further reading

* . (NB. In particular see "Chapter 4: Artificial Neural Networks" (in particular pp. 96–97), where Mitchell uses the terms "logistic function" and "sigmoid function" synonymously, also calling this function the "squashing function"; the sigmoid (aka logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.)
* (NB. Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.)

