HOME

TheInfoList



OR:

In mathematics, the symmetry of second derivatives (also called the equality of mixed partials) refers to the possibility of interchanging the order of taking
partial derivative In mathematics, a partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant (as opposed to the total derivative, in which all variables are allowed to vary). Pa ...
s of a function :f\left(x_1,\, x_2,\, \ldots,\, x_n\right) of ''n'' variables without changing the result under certain conditions (see below). The symmetry is the assertion that the second-order partial derivatives satisfy the identity :\frac \left( \frac \right) \ = \ \frac \left( \frac \right) so that they form an ''n'' × ''n''
symmetric matrix In linear algebra, a symmetric matrix is a square matrix that is equal to its transpose. Formally, Because equal matrices have equal dimensions, only square matrices can be symmetric. The entries of a symmetric matrix are symmetric with ...
, known as the function's
Hessian matrix In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed ...
. This is sometimes known as Schwarz's theorem, Clairaut's theorem, or Young's theorem. In the context of
partial differential equation In mathematics, a partial differential equation (PDE) is an equation which imposes relations between the various partial derivatives of a multivariable function. The function is often thought of as an "unknown" to be solved for, similarly to ...
s it is called the Schwarz integrability condition.


Formal expressions of symmetry

In symbols, the symmetry may be expressed as: :\frac \left( \frac \right) \ = \ \frac \left( \frac \right) \qquad\text\qquad \frac \ =\ \frac . Another notation is: :\partial_x\partial_y f = \partial_y\partial_x f \qquad\text\qquad f_ = f_. In terms of composition of the differential operator which takes the partial derivative with respect to : :D_i \circ D_j = D_j \circ D_i. From this relation it follows that the ring of differential operators with constant coefficients, generated by the , is
commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Most familiar as the name o ...
; but this is only true as operators over a domain of sufficiently differentiable functions. It is easy to check the symmetry as applied to
monomial In mathematics, a monomial is, roughly speaking, a polynomial which has only one term. Two definitions of a monomial may be encountered: # A monomial, also called power product, is a product of powers of variables with nonnegative integer expon ...
s, so that one can take
polynomial In mathematics, a polynomial is an expression consisting of indeterminates (also called variables) and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An ex ...
s in the as a domain. In fact
smooth function In mathematical analysis, the smoothness of a function is a property measured by the number of continuous derivatives it has over some domain, called ''differentiability class''. At the very minimum, a function could be considered smooth if ...
s are another valid domain.


History

The result on the equality of mixed partial derivatives under certain conditions has a long history. The list of unsuccessful proposed proofs started with
Euler Leonhard Euler ( , ; 15 April 170718 September 1783) was a Swiss mathematician, physicist, astronomer, geographer, logician and engineer who founded the studies of graph theory and topology and made pioneering and influential discoveries in ma ...
's, published in 1740, although already in 1721 Bernoulli had implicitly assumed the result with no formal justification. Clairaut also published a proposed proof in 1740, with no other attempts until the end of the 18th century. Starting then, for a period of 70 years, a number of incomplete proofs were proposed. The proof of
Lagrange Joseph-Louis Lagrange (born Giuseppe Luigi LagrangiaCauchy Baron Augustin-Louis Cauchy (, ; ; 21 August 178923 May 1857) was a French mathematician, engineer, and physicist who made pioneering contributions to several branches of mathematics, including mathematical analysis and continuum mechanics. He ...
(1823), but assumed the existence and continuity of the partial derivatives \tfrac and \tfrac. Other attempts were made by P. Blanchet (1841), Duhamel (1856),
Sturm Sturm (German for storm) may refer to: People * Sturm (surname), surname (includes a list) * Saint Sturm (died 779), 8th-century monk Food * Federweisser, known as ''Sturm'' in Austria, wine in the fermentation stage * Sturm Foods, an Ameri ...
(1857), Schlömilch (1862), and Bertrand (1864). Finally in 1867 Lindelöf systematically analyzed all the earlier flawed proofs and was able to exhibit a specific counterexample where mixed derivatives failed to be equal. Six years after that, Schwarz succeeded in giving the first rigorous proof. Dini later contributed by finding more general conditions than those of Schwarz. Eventually a clean and more general version was found by
Jordan Jordan ( ar, الأردن; tr. ' ), officially the Hashemite Kingdom of Jordan,; tr. ' is a country in Western Asia. It is situated at the crossroads of Asia, Africa, and Europe, within the Levant region, on the East Bank of the Jordan Ri ...
in 1883 that is still the proof found in most textbooks. Minor variants of earlier proofs were published by Laurent (1885), Peano (1889 and 1893), J. Edwards (1892), P. Haag (1893), J. K. Whittemore (1898), Vivanti (1899) and Pierpont (1905). Further progress was made in 1907-1909 when
E. W. Hobson Ernest William Hobson Fellow of the Royal Society, FRS (27 October 1856 – 19 April 1933) was an England, English mathematician, now remembered mostly for his books, some of which broke new ground in their coverage in English of topics fro ...
and
W. H. Young William Henry Young FRS ( London, 20 October 1863 – Lausanne, 7 July 1942) was an English mathematician. Young was educated at City of London School and Peterhouse, Cambridge. He worked on measure theory, Fourier series, differential ca ...
found proofs with weaker conditions than those of Schwarz and Dini. In 1918, Carathéodory gave a different proof based on the
Lebesgue integral In mathematics, the integral of a non-negative function of a single variable can be regarded, in the simplest case, as the area between the graph of that function and the -axis. The Lebesgue integral, named after French mathematician Henri Le ...
.


Theorem of Schwarz

In
mathematical analysis Analysis is the branch of mathematics dealing with continuous functions, limit (mathematics), limits, and related theories, such as Derivative, differentiation, Integral, integration, measure (mathematics), measure, infinite sequences, series (m ...
, Schwarz's theorem (or Clairaut's theorem on equality of mixed partials) named after Alexis Clairaut and Hermann Schwarz, states that for a function f \colon \Omega \to \mathbb defined on a set \Omega \subset \mathbb^n, if \mathbf\in \mathbb^n is a point such that some neighborhood of \mathbf is contained in \Omega and f has continuous second
partial derivatives In mathematics, a partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant (as opposed to the total derivative, in which all variables are allowed to vary). P ...
on that neighborhood of \mathbf, then for all and in \. : \fracf(\mathbf) = \fracf(\mathbf). The partial derivatives of this function commute at that point. One easy way to establish this theorem (in the case where n = 2, i = 1, and j = 2, which readily entails the result in general) is by applying
Green's theorem In vector calculus, Green's theorem relates a line integral around a simple closed curve to a double integral over the plane region bounded by . It is the two-dimensional special case of Stokes' theorem. Theorem Let be a positively ori ...
to the
gradient In vector calculus, the gradient of a scalar-valued differentiable function of several variables is the vector field (or vector-valued function) \nabla f whose value at a point p is the "direction and rate of fastest increase". If the gr ...
of f. An elementary proof for functions on open subsets of the plane is as follows (by a simple reduction, the general case for the theorem of Schwarz easily reduces to the planar case). Let f(x,y) be a
differentiable function In mathematics, a differentiable function of one real variable is a function whose derivative exists at each point in its domain. In other words, the graph of a differentiable function has a non- vertical tangent line at each interior point in ...
on an open rectangle containing a point (a,b) and suppose that df is continuous with continuous \partial_x \partial _y f and \partial_y\partial_x f over . Define :\begin u\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a + h,\, b\right), \\ v\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a,\, b + k\right), \\ w\left(h,\, k\right) &= f\left(a + h,\, b + k\right) - f\left(a + h,\, b\right) - f\left(a,\, b + k\right) + f\left(a,\, b\right). \end These functions are defined for \left, h\,\, \left, k\ < \varepsilon, where \varepsilon > 0 and \left - \varepsilon,\, a + \varepsilon\right\times \left - \varepsilon,\, b + \varepsilon\right/math> is contained in . By the
mean value theorem In mathematics, the mean value theorem (or Lagrange theorem) states, roughly, that for a given planar arc between two endpoints, there is at least one point at which the tangent to the arc is parallel to the secant through its endpoints. It ...
, for fixed and non-zero, \theta,\,\theta^\prime, \,\, \phi,\,\,\phi^\prime can be found in the open interval (0,1) with :\begin w\left(h,\, k\right) &= u\left(h,\, k\right) - u\left(0,\, k\right) = h\, \partial_x u\left(\theta h,\, k\right) \\ &= h\,\left partial_x f\left(a + \theta h,\, b + k\right) - \partial_x f\left(a + \theta h,\, b\right)\right\\ &= hk \, \partial_y \partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) \\ w\left(h,\, k\right) &= v\left(h,\, k\right) - v\left(h,\, 0\right) = k\,\partial_y v\left(h,\, \phi k\right) \\ &= k\left partial_y f\left(a + h,\, b + \phi k\right) - \partial_y f\left(a,\, b + \phi k\right)\right\\ &= hk\, \partial_x\partial_y f \left(a + \phi^\prime h,\, b + \phi k\right). \end Since h,\,k \neq 0, the first equality below can be divided by hk: :\begin hk\,\partial_y\partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) &= hk \, \partial_x\partial_y f\left(a + \phi^\prime h,\, b + \phi k\right), \\ \partial_y\partial_x f\left(a + \theta h,\, b + \theta^\prime k\right) &= \partial_x\partial_y f\left(a + \phi^\prime h,\, b + \phi k\right). \end Letting h,\,k tend to zero in the last equality, the continuity assumptions on \partial_y\partial_x f and \partial_x\partial_y f now imply that : \fracf\left(a,\, b\right) = \fracf\left(a,\, b\right). This account is a straightforward classical method found in many text books, for example in Burkill, Apostol and Rudin. Although the derivation above is elementary, the approach can also be viewed from a more conceptual perspective so that the result becomes more apparent. Indeed the difference operators \Delta^t_x,\,\,\Delta^t_y commute and \Delta^t_x f,\,\,\Delta^t_y f tend to \partial_x f,\,\, \partial_y f as t tends to 0, with a similar statement for second order operators. Here, for z a vector in the plane and u a directional vector, the difference operator is defined by :\Delta^t_u f(z)= . By the
fundamental theorem of calculus The fundamental theorem of calculus is a theorem that links the concept of differentiating a function (calculating its slopes, or rate of change at each time) with the concept of integrating a function (calculating the area under its graph, ...
for C^1 functions f on an open interval I with (a,b) \subset I :\int_a^b f^\prime (x) \, dx = f(b) - f(a). Hence :, f(b) - f(a), \le (b-a)\, \sup_ , f^\prime(c), . This is a generalized version of the
mean value theorem In mathematics, the mean value theorem (or Lagrange theorem) states, roughly, that for a given planar arc between two endpoints, there is at least one point at which the tangent to the arc is parallel to the secant through its endpoints. It ...
. Recall that the elementary discussion on maxima or minima for real-valued functions implies that if f is continuous on ,b/math> and differentiable on (a,b), then there is a point c in (a,b) such that : = f^\prime(c). For vector-valued functions with V a finite-dimensional normed space, there is no analogue of the equality above, indeed it fails. But since \inf f^\prime \le f^\prime(c) \le \sup f^\prime, the inequality above is a useful substitute. Moreover, using the pairing of the dual of V with its dual norm, yields the following inequality: :\, f(b) - f(a)\, \le (b-a)\, \sup_ \, f^\prime(c)\, . These versions of the mean valued theorem are discussed in Rudin, Hörmander and elsewhere. For f a C^2 function on an open set in the plane, define D_1 = \partial_x and D_2 = \partial_y. Furthermore for t \ne 0 set :\Delta_1^t f(x,y) =
(x+t,y)-f(x,y) X, or x, is the twenty-fourth and third-to-last letter in the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its name in English is ''"ex"'' (pronounced ) ...
t,\,\,\,\,\,\,\Delta^t_2f(x,y)= (x,y+t) -f(x,y)t. Then for (x_0,y_0) in the open set, the generalized mean value theorem can be applied twice: : \left, \Delta_1^t\Delta_2^t f(x_0,y_0) - D_1 D_2f(x_0,y_0)\\le \sup_ \left, \Delta_1^t D_2 f(x_0,y_0 + ts) -D_1D_2 f(x_0,y_0)\\le \sup_ \left, D_1D_2f(x_0+tr,y_0+ts) - D_1D_2f(x_0,y_0)\. Thus \Delta_1^t\Delta_2^t f(x_0,y_0) tends to D_1 D_2f(x_0,y_0) as t tends to 0. The same argument shows that \Delta_2^t\Delta_1^t f(x_0,y_0) tends to D_2 D_1f(x_0,y_0). Hence, since the difference operators commute, so do the partial differential operators D_1 and D_2, as claimed. Remark. By two applications of the classical mean value theorem, :\Delta_1^t\Delta_2^t f(x_0,y_0)= D_1 D_2 f(x_0+t\theta,y_0 +t\theta^\prime) for some \theta and \theta^\prime in (0,1). Thus the first elementary proof can be reinterpreted using difference operators. Conversely, instead of using the generalized mean value theorem in the second proof, the classical mean valued theorem could be used.


Proof of Clairaut's theorem using iterated integrals

The properties of repeated Riemann integrals of a continuous function on a compact rectangle are easily established. The
uniform continuity In mathematics, a real function f of real numbers is said to be uniformly continuous if there is a positive real number \delta such that function values over any function domain interval of the size \delta are as close to each other as we want. I ...
of implies immediately that the functions g(x)=\int_c^d F(x,y)\, dy and h(y)=\int_a^b F(x,y)\, dx are continuous. It follows that :\int_a^b \int_c^d F(x,y) \, dy\, dx = \int_c^d \int_a^b F(x,y) \, dx \, dy; moreover it is immediate that the iterated integral is positive if is positive. The equality above is a simple case of Fubini's theorem, involving no measure theory. proves it in a straightforward way using Riemann approximating sums corresponding to subdivisions of a rectangle into smaller rectangles. To prove Clairaut's theorem, assume is a differentiable function on an open set , for which the mixed second partial derivatives and exist and are continuous. Using the
fundamental theorem of calculus The fundamental theorem of calculus is a theorem that links the concept of differentiating a function (calculating its slopes, or rate of change at each time) with the concept of integrating a function (calculating the area under its graph, ...
twice, :\int_c^d \int_a^b f_(x,y) \, dx \, dy = \int_c^d f_y(b,y) - f_y(a,y) \, dy = f(b,d)-f(a,d)-f(b,c)+f(a,c). Similarly :\int_a^b \int_c^d f_(x,y) \, dy \, dx = \int_a^b f_x(x,d) - f_x(x,c) \, dx = f(b,d)-f(a,d)-f(b,c)+f(a,c). The two iterated integrals are therefore equal. On the other hand, since is continuous, the second iterated integral can be performed by first integrating over and then afterwards over . But then the iterated integral of on must vanish. However, if the iterated integral of a continuous function function vanishes for all rectangles, then must be identically zero; for otherwise or would be strictly positive at some point and therefore by continuity on a rectangle, which is not possible. Hence must vanish identically, so that everywhere.


Sufficiency of twice-differentiability

A weaker condition than the continuity of second partial derivatives (which is implied by the latter) which suffices to ensure symmetry is that all partial derivatives are themselves differentiable. Another strengthening of the theorem, in which ''existence'' of the permuted mixed partial is asserted, was provided by Peano in a short 1890 note on Mathesis: : ''If f:E \to \mathbb is defined on an open set E \subset \R^2; \partial_1 f(x,\, y) and \partial_f(x,\, y) exist everywhere on E; \partial_f is continuous at \left(x_0,\, y_0\right) \in E, and if \partial_f(x,\, y_0) exists in a neighborhood of x = x_0, then \partial_f exists at \left(x_0,\, y_0\right) and \partial_f\left(x_0,\, y_0\right) = \partial_f\left(x_0,\, y_0\right).''


Distribution theory formulation

The theory of distributions (generalized functions) eliminates analytic problems with the symmetry. The derivative of an integrable function can always be defined as a distribution, and symmetry of mixed partial derivatives always holds as an equality of distributions. The use of formal
integration by parts In calculus, and more generally in mathematical analysis, integration by parts or partial integration is a process that finds the integral of a product of functions in terms of the integral of the product of their derivative and antiderivative. ...
to define differentiation of distributions puts the symmetry question back onto the test functions, which are smooth and certainly satisfy this symmetry. In more detail (where ''f'' is a distribution, written as an operator on test functions, and ''φ'' is a test function), : \left(D_1 D_2 f\right) phi= -\left(D_2f\right)\left _1\phi\right= f\left _2 D_1\phi\right= f\left _1 D_2\phi\right= -\left(D_1 f\right)\left _2\phi\right= \left(D_2 D_1 f\right) phi Another approach, which defines the
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
of a function, is to note that on such transforms partial derivatives become multiplication operators that commute much more obviously.


Requirement of continuity

The symmetry may be broken if the function fails to have differentiable partial derivatives, which is possible if Clairaut's theorem is not satisfied (the second partial derivatives are not continuous). An example of non-symmetry is the function (due to Peano) This can be visualized by the polar form f(r \cos(\theta), r\sin(\theta)) = \frac; it is everywhere continuous, but its derivatives at cannot be computed algebraically. Rather, the limit of difference quotients shows that f_x(0,0) = f_y(0,0) = 0, so the graph z = f(x, y) has a horizontal tangent plane at , and the partial derivatives f_x, f_y exist and are everywhere continuous. However, the second partial derivatives are not continuous at , and the symmetry fails. In fact, along the ''x''-axis the ''y''-derivative is f_y(x,0) = x, and so: : f_(0,0) = \lim_ \frac = 1. In contrast, along the ''y''-axis the ''x''-derivative f_x(0,y) = -y, and so f_(0,0) = -1. That is, f_ \ne f_ at , although the mixed partial derivatives do exist, and at every other point the symmetry does hold. The above function, written in a cylindrical coordinate system, can be expressed as :f(r,\, \theta) = \frac, showing that the function oscillates four times when traveling once around an arbitrarily small loop containing the origin. Intuitively, therefore, the local behavior of the function at (0, 0) cannot be described as a quadratic form, and the Hessian matrix thus fails to be symmetric. In general, the
interchange of limiting operations In mathematics, the study of interchange of limiting operations is one of the major concerns of mathematical analysis, in that two given limiting operations, say ''L'' and ''M'', cannot be ''assumed'' to give the same result when applied in either ...
need not commute. Given two variables near and two limiting processes on :f(h,\, k) - f(h,\, 0) - f(0,\, k) + f(0,\, 0) corresponding to making ''h'' → 0 first, and to making ''k'' → 0 first. It can matter, looking at the first-order terms, which is applied first. This leads to the construction of pathological examples in which second derivatives are non-symmetric. This kind of example belongs to the theory of real analysis where the pointwise value of functions matters. When viewed as a distribution the second partial derivative's values can be changed at an arbitrary set of points as long as this has
Lebesgue measure In measure theory, a branch of mathematics, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of ''n''-dimensional Euclidean space. For ''n'' = 1, 2, or 3, it coincides ...
0. Since in the example the Hessian is symmetric everywhere except , there is no contradiction with the fact that the Hessian, viewed as a Schwartz distribution, is symmetric.


In Lie theory

Consider the first-order differential operators ''D''''i'' to be
infinitesimal operator In mathematics, an infinitesimal transformation is a limiting form of ''small'' transformation. For example one may talk about an infinitesimal rotation of a rigid body, in three-dimensional space. This is conventionally represented by a 3×3 ...
s on
Euclidean space Euclidean space is the fundamental space of geometry, intended to represent physical space. Originally, that is, in Euclid's ''Elements'', it was the three-dimensional space of Euclidean geometry, but in modern mathematics there are Euclidean sp ...
. That is, ''D''''i'' in a sense generates the
one-parameter group In mathematics, a one-parameter group or one-parameter subgroup usually means a continuous group homomorphism :\varphi : \mathbb \rightarrow G from the real line \mathbb (as an additive group) to some other topological group G. If \varphi is in ...
of
translations Translation is the communication of the Meaning (linguistic), meaning of a #Source and target languages, source-language text by means of an Dynamic and formal equivalence, equivalent #Source and target languages, target-language text. The ...
parallel to the ''x''''i''-axis. These groups commute with each other, and therefore the infinitesimal generators do also; the Lie bracket : 'D''''i'', ''D''''j''= 0 is this property's reflection. In other words, the Lie derivative of one coordinate with respect to another is zero.


Application to differential forms

The Clairaut-Schwarz theorem is the key fact needed to prove that for every C^\infty (or at least twice differentiable)
differential form In mathematics, differential forms provide a unified approach to define integrands over curves, surfaces, solids, and higher-dimensional manifolds. The modern notion of differential forms was pioneered by Élie Cartan. It has many application ...
\omega\in\Omega^k(M), the second exterior derivative vanishes: d^2\omega := d(d\omega) = 0. This implies that every differentiable exact form (i.e., a form \alpha such that \alpha = d\omega for some form \omega) is closed (i.e., d\alpha = 0), since d\alpha = d(d\omega) = 0. In the middle of the 18th century, the theory of differential forms was first studied in the simplest case of 1-forms in the plane, i.e. A\,dx + B\,dy, where A and B are functions in the plane. The study of 1-forms and the differentials of functions began with Clairaut's papers in 1739 and 1740. At that stage his investigations were interpreted as ways of solving
ordinary differential equation In mathematics, an ordinary differential equation (ODE) is a differential equation whose unknown(s) consists of one (or more) function(s) of one variable and involves the derivatives of those functions. The term ''ordinary'' is used in contras ...
s. Formally Clairaut showed that a 1-form \omega = A \, dx + B \, dy on an open rectangle is closed, i.e. d\omega=0, if and only \omega has the form df for some function f in the disk. The solution for f can be written by Cauchy's integral formula :f(x,y)=\int_^x A(x,y)\, dx + \int_ ^y B(x,y)\, dy; while if \omega= df, the closed property d\omega=0 is the identity \partial_x\partial_y f = \partial_y\partial_x f. (In modern language this is one version of the
Poincaré lemma In mathematics, especially vector calculus and differential topology, a closed form is a differential form ''α'' whose exterior derivative is zero (), and an exact form is a differential form, ''α'', that is the exterior derivative of another ...
.)


Notes


References

* * * * * * * (reprinted 1978) * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *


Further reading

* {{Springer, id=P/p071620, title=Partial derivative Multivariable calculus Generalized functions Symmetry Theorems in analysis