Smooth maximum

In mathematics, a smooth maximum of an indexed family ''x''1, ..., ''x''''n'' of numbers is a smooth approximation to the maximum function \max(x_1,\ldots,x_n), meaning a parametric family of functions m_\alpha(x_1,\ldots,x_n) such that for every \alpha, the function m_\alpha is smooth, and the family converges to the maximum function as \alpha\to\infty. The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, m_\alpha\to\max as \alpha\to\infty and m_\alpha\to\min as \alpha\to-\infty. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.


Examples

For large positive values of the parameter \alpha > 0, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.

: \mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}

\mathcal{S}_\alpha has the following properties:
# \mathcal{S}_\alpha\to \max as \alpha\to\infty
# \mathcal{S}_0 is the arithmetic mean of its inputs
# \mathcal{S}_\alpha\to \min as \alpha\to -\infty

The gradient of \mathcal{S}_\alpha is closely related to softmax and is given by

: \nabla_{x_i}\mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}}\left[1 + \alpha\left(x_i - \mathcal{S}_\alpha (x_1,\ldots,x_n)\right)\right]

This makes the softmax function useful for optimization techniques that use gradient descent.
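
The following is a minimal numerical sketch of this operator and its gradient in Python with NumPy; the names boltzmann_operator and boltzmann_gradient are illustrative labels, not a standard API.

    import numpy as np

    def boltzmann_operator(x, alpha):
        # S_alpha(x) = sum(x_i * exp(alpha*x_i)) / sum(exp(alpha*x_i)).
        # Subtracting the largest exponent before exponentiating prevents
        # overflow; the shift cancels in the ratio, leaving the value unchanged.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        w = np.exp(z - z.max())          # unnormalized softmax weights
        return np.sum(x * w) / np.sum(w)

    def boltzmann_gradient(x, alpha):
        # dS_alpha/dx_i = softmax(alpha*x)_i * (1 + alpha*(x_i - S_alpha(x)))
        x = np.asarray(x, dtype=float)
        z = alpha * x
        w = np.exp(z - z.max())
        p = w / np.sum(w)                # softmax(alpha * x)
        s = np.sum(x * p)                # S_alpha(x)
        return p * (1.0 + alpha * (x - s))

    x = [1.0, 2.0, 3.0]
    print(boltzmann_operator(x, 0.0))    # 2.0: arithmetic mean
    print(boltzmann_operator(x, 20.0))   # ~3.0: approaches the maximum
    print(boltzmann_operator(x, -20.0))  # ~1.0: approaches the minimum

The three printed values illustrate properties 1-3 above: a single parameter interpolates from the minimum through the mean to the maximum.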


LogSumExp

Another smooth maximum is LogSumExp:

: \mathrm{LSE}_\alpha(x_1, \ldots, x_n) = \frac{1}{\alpha}\log\left( \exp(\alpha x_1) + \cdots + \exp(\alpha x_n)\right)

This can also be normalized if the x_i are all non-negative, yielding a function with domain [0,\infty)^n and range [0, \infty):

: g(x_1, \ldots, x_n) = \log\left( \exp(x_1) + \cdots + \exp(x_n) - (n - 1) \right)

The (n - 1) term corrects for the fact that \exp(0) = 1 by canceling out all but one zero exponential, and \log 1 = 0 if all x_i are zero.
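
As a sketch of how these might be computed stably in practice (Python with NumPy; the names lse_max and lse_max_normalized are ours), using the standard shift-by-the-maximum trick to avoid overflow:

    import numpy as np

    def lse_max(x, alpha=1.0):
        # (1/alpha) * log(sum(exp(alpha * x_i))), computed via the shifted form
        # m + log(sum(exp(z - m))) with m = max(z), which cannot overflow.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        m = z.max()
        return (m + np.log(np.sum(np.exp(z - m)))) / alpha

    def lse_max_normalized(x):
        # Normalized variant for non-negative inputs:
        # log(exp(x_1) + ... + exp(x_n) - (n - 1)); equals 0 when all x_i are 0.
        x = np.asarray(x, dtype=float)
        return np.log(np.sum(np.exp(x)) - (len(x) - 1))

    x = [1.0, 2.0, 3.0]
    print(lse_max(x, alpha=1.0))    # ~3.41: overestimates max by at most log(n)/alpha
    print(lse_max(x, alpha=100.0))  # ~3.00
    print(lse_max_normalized([0.0, 0.0, 0.0]))  # 0.0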


p-Norm

Another smooth maximum is the p-norm:

: \|(x_1, \ldots, x_n)\|_p = \left( |x_1|^p + \cdots + |x_n|^p \right)^{1/p}

which converges to \|(x_1, \ldots, x_n)\|_\infty = \max_i |x_i| as p \to \infty.

An advantage of the p-norm is that it is a norm. As such it is "scale invariant" (homogeneous): \|(\lambda x_1, \ldots, \lambda x_n)\|_p = |\lambda| \cdot \|(x_1, \ldots, x_n)\|_p, and it satisfies the triangle inequality.
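
A small sketch of this in Python with NumPy (pnorm_max is an illustrative name); factoring out m = \max_i |x_i| keeps every ratio in [0, 1], so large p does not overflow:

    import numpy as np

    def pnorm_max(x, p):
        # ||x||_p = m * (sum(|x_i / m|^p))^(1/p)  with  m = max_i |x_i|;
        # tends to max_i |x_i| as p grows (note: it bounds it from above,
        # by at most a factor of n^(1/p)).
        x = np.abs(np.asarray(x, dtype=float))
        m = x.max()
        if m == 0.0:
            return 0.0
        return m * np.sum((x / m) ** p) ** (1.0 / p)

    x = [1.0, -2.0, 3.0]
    print(pnorm_max(x, 2))     # ~3.742: the Euclidean norm
    print(pnorm_max(x, 100))   # ~3.000: close to max |x_i|

Note that, unlike the previous examples, the p-norm approximates the maximum of the absolute values of the inputs rather than of the inputs themselves.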


Other choices of smoothing function

For two arguments, a smooth maximum can be obtained by smoothing the absolute value in the identity \max(a, b) = \tfrac{a + b + |a - b|}{2}:

: \begin{align} \max{}_\varepsilon(a, b) &= \frac{a + b + |a - b|_\varepsilon}{2} \\ &= \frac{a + b + \sqrt{(a - b)^2 + \varepsilon}}{2} \end{align}

where \varepsilon \geq 0 is a parameter and |x|_\varepsilon = \sqrt{x^2 + \varepsilon} is a smooth approximation to the absolute value. As \varepsilon \to 0, |\cdot|_\varepsilon \to |\cdot| and thus \max{}_\varepsilon \to \max.
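
A minimal sketch of this two-argument smoothing in plain Python (smooth_abs and smooth_max2 are our own labels):

    import math

    def smooth_abs(x, eps):
        # |x|_eps = sqrt(x^2 + eps): a smooth approximation of |x| for eps > 0.
        return math.sqrt(x * x + eps)

    def smooth_max2(a, b, eps):
        # max_eps(a, b) = (a + b + |a - b|_eps) / 2; exact max when eps = 0,
        # but then not differentiable where a == b.
        return 0.5 * (a + b + smooth_abs(a - b, eps))

    print(smooth_max2(1.0, 3.0, 0.0))    # 3.0: exact
    print(smooth_max2(1.0, 3.0, 0.01))   # ~3.001: smooth everywhere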


See also

* LogSumExp
* Softmax function
* Generalized mean

