Trace Inequalities

In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices. Standard references include E. Carlen, Trace Inequalities and Quantum Entropy: An Introductory Course, Contemp. Math. 529 (2010) 73–140, and B. Simon, Trace Ideals and their Applications, Cambridge Univ. Press (1979); second edition, Amer. Math. Soc., Providence, RI (2005).


Basic definitions

Let \mathbf{H}_n denote the space of Hermitian n \times n matrices, \mathbf{H}_n^+ the set of positive semi-definite n \times n Hermitian matrices, and \mathbf{H}_n^{++} the set of positive definite Hermitian matrices. For operators on an infinite-dimensional Hilbert space we require that they be trace class and self-adjoint, in which case similar definitions apply; for simplicity, we discuss only matrices.

For any real-valued function f on an interval I \subset \mathbb{R}, one may define a matrix function f(A) for any operator A \in \mathbf{H}_n with eigenvalues \lambda_j in I by defining it on the eigenvalues and corresponding projectors P_j as :f(A)\equiv \sum_j f(\lambda_j)P_j ~, given the spectral decomposition A=\sum_j\lambda_j P_j.
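The spectral definition above translates directly into code. Below is a minimal numerical sketch (not part of the original article), assuming NumPy and SciPy are available; `matrix_function` is an illustrative helper name.

```python
import numpy as np
from scipy.linalg import expm

def matrix_function(A, f):
    """Apply a scalar function f to a Hermitian matrix A through its
    spectral decomposition A = sum_j lambda_j P_j."""
    w, V = np.linalg.eigh(A)           # eigenvalues w, orthonormal eigenvectors V
    return (V * f(w)) @ V.conj().T     # sum_j f(lambda_j) v_j v_j^*

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (X + X.conj().T) / 2               # a random 4x4 Hermitian matrix

# Sanity check: for f = exp this reproduces the matrix exponential.
assert np.allclose(matrix_function(A, np.exp), expm(A))
```

Since A is Hermitian, `eigh` returns real eigenvalues, so any real-valued f on an interval containing them is applicable.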


Operator monotone

A function f: I \rightarrow \mathbb{R} defined on an interval I \subset \mathbb{R} is said to be operator monotone if for all n, and all A, B \in \mathbf{H}_n with eigenvalues in I, the following holds, :A \geq B \Rightarrow f(A) \geq f(B), where the inequality A \geq B means that the operator A - B is positive semi-definite. One may check that f(t)=t^2 is, in fact, ''not'' operator monotone!
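The failure of f(t) = t^2 can be exhibited on an explicit 2 × 2 pair. A small numerical check (an added illustration, assuming NumPy is available):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])

# A >= B: A - B = [[1, 1], [1, 1]] is positive semi-definite (eigenvalues 0 and 2).
assert np.linalg.eigvalsh(A - B).min() >= -1e-12

# Yet A^2 - B^2 = [[4, 3], [3, 2]] has determinant -1, hence a negative
# eigenvalue, so f(A) >= f(B) fails for f(t) = t^2.
assert np.linalg.eigvalsh(A @ A - B @ B).min() < 0
```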


Operator convex

A function f: I \rightarrow \mathbb{R} is said to be operator convex if for all n and all A, B \in \mathbf{H}_n with eigenvalues in I, and 0 < \lambda < 1, the following holds : f(\lambda A + (1-\lambda)B) \leq \lambda f(A) + (1 -\lambda)f(B) . Note that the operator \lambda A + (1-\lambda)B has eigenvalues in I, since A and B have eigenvalues in I. A function f is operator concave if -f is operator convex, i.e. the inequality above for f is reversed.


Joint convexity

A function g: I\times J \rightarrow \mathbb{R}, defined on intervals I,J\subset \mathbb{R}, is said to be jointly convex if for all n and all A_1, A_2\in \mathbf{H}_n with eigenvalues in I and all B_1,B_2\in \mathbf{H}_n with eigenvalues in J, and any 0\leq \lambda\leq 1 the following holds : g(\lambda A_1 + (1-\lambda)A_2,\lambda B_1 + (1-\lambda)B_2 ) \leq \lambda g(A_1, B_1) + (1 -\lambda)g(A_2, B_2). A function g is jointly concave if -g is jointly convex, i.e. the inequality above for g is reversed.


Trace function

Given a function f: ℝ → ℝ, the associated trace function on \mathbf{H}_n is given by : A\mapsto \operatorname{Tr} f(A)=\sum_j f(\lambda_j), where A has eigenvalues \lambda_j and \operatorname{Tr} stands for the trace of the operator.


Convexity and monotonicity of the trace function

Let f: ℝ → ℝ be continuous, and let n be any integer. Then, if t\mapsto f(t) is monotone increasing, so is A \mapsto \operatorname{Tr} f(A) on \mathbf{H}_n. Likewise, if t \mapsto f(t) is convex, so is A \mapsto \operatorname{Tr} f(A) on \mathbf{H}_n, and it is strictly convex if f is strictly convex. A proof and discussion can be found, for example, in Carlen's lecture notes cited above.
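A numerical illustration of the convexity statement (an added sketch, assuming NumPy): f(t) = t^4 is convex on ℝ, though outside the operator-convex range 1 ≤ p ≤ 2 of the Löwner–Heinz theorem below; its trace function is nevertheless convex, as the theorem guarantees.

```python
import numpy as np

def tr_f(A, f):
    """Trace function A |-> Tr f(A) = sum_j f(lambda_j) for Hermitian A."""
    return f(np.linalg.eigvalsh(A)).sum()

rng = np.random.default_rng(1)
f = lambda t: t**4                     # convex on R, though not operator convex
for _ in range(200):
    X, Y = rng.standard_normal((2, 5, 5))
    A, B = (X + X.T) / 2, (Y + Y.T) / 2        # random symmetric matrices
    lam = rng.uniform()
    mid = tr_f(lam * A + (1 - lam) * B, f)     # Tr f at the convex combination
    chord = lam * tr_f(A, f) + (1 - lam) * tr_f(B, f)
    assert mid <= chord + 1e-9                 # convexity of Tr f
```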


Löwner–Heinz theorem

For -1\leq p \leq 0, the function f(t) = -t^p is operator monotone and operator concave on (0, \infty). For 0 \leq p \leq 1, the function f(t) = t^p is operator monotone and operator concave. For 1 \leq p \leq 2, the function f(t) = t^p is operator convex. Furthermore, :f(t) = \log(t) is operator concave and operator monotone, while :f(t) = t \log(t) is operator convex. The original proof of this theorem is due to K. Löwner, who gave a necessary and sufficient condition for f to be operator monotone. An elementary proof of the theorem, and a more general version of it, can be found in the references cited above.
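The p = 1/2 case can be probed numerically (an added sketch, assuming NumPy; `psd_power` is an illustrative helper): for randomly generated A ≥ B ≥ 0, the square root preserves the operator order.

```python
import numpy as np

def psd_power(A, p):
    """A^p for a positive semi-definite Hermitian A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.clip(w, 0.0, None)**p) @ V.conj().T

rng = np.random.default_rng(2)
for _ in range(200):
    X, Y = rng.standard_normal((2, 4, 4))
    B = X @ X.T                        # positive semi-definite
    A = B + Y @ Y.T                    # A >= B by construction
    # t -> t^(1/2) is operator monotone (p = 1/2 in [0, 1]), so A^(1/2) >= B^(1/2):
    diff = psd_power(A, 0.5) - psd_power(B, 0.5)
    assert np.linalg.eigvalsh(diff).min() >= -1e-8
```

Replacing the exponent 0.5 by 2 makes the assertion fail, consistent with the remark above that t^2 is not operator monotone.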


Klein's inequality

For all Hermitian n \times n matrices A and B and all differentiable convex functions f: ℝ → ℝ with derivative f', or for all positive-definite Hermitian n \times n matrices A and B, and all differentiable convex functions f: (0,∞) → ℝ, the following inequality holds, :\operatorname{Tr}[f(A)-f(B)-(A-B)f'(B)]\geq 0~. In either case, if f is strictly convex, equality holds if and only if A = B. A popular choice in applications is f(t) = t \log t, see below.


Proof

Let C=A-B so that, for t\in (0,1), :B + tC = (1 -t)B + tA varies from B to A. Define :F(t) = \operatorname{Tr}[f(B + tC)]. By convexity and monotonicity of trace functions, F(t) is convex, and so for all t\in (0,1), : F(0) + t(F(1)-F(0))\geq F(t) , which is, : F(1) - F(0) \geq \frac{F(t)-F(0)}{t} , and, in fact, the right hand side is monotone decreasing in t. Taking the limit t\to 0 yields, : F(1) - F(0) \geq F'(0) , which with rearrangement and substitution is Klein's inequality: : \operatorname{Tr}[f(A)-f(B)-(A-B)f'(B)]\geq 0 . Note that if f(t) is strictly convex and C\neq 0 , then F(t) is strictly convex. The final assertion follows from this and the fact that \tfrac{F(t)-F(0)}{t} is monotone decreasing in t.
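Klein's inequality with the popular choice f(t) = t log t can be checked numerically on random positive-definite matrices (an added sketch, assuming NumPy; `herm_fun` is an illustrative helper):

```python
import numpy as np

def herm_fun(A, f):
    """f(A) for Hermitian A via the spectral decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

rng = np.random.default_rng(3)
for _ in range(200):
    X, Y = rng.standard_normal((2, 4, 4))
    A = X @ X.T + 0.1 * np.eye(4)      # positive definite
    B = Y @ Y.T + 0.1 * np.eye(4)
    # f(t) = t log t is strictly convex on (0, inf), with f'(t) = log t + 1.
    gap = np.trace(herm_fun(A, lambda t: t * np.log(t))
                   - herm_fun(B, lambda t: t * np.log(t))
                   - (A - B) @ herm_fun(B, lambda t: np.log(t) + 1))
    assert gap >= -1e-9                # Tr[f(A) - f(B) - (A-B) f'(B)] >= 0
```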


Golden–Thompson inequality

In 1965, S. Golden and C.J. Thompson independently discovered that for any Hermitian matrices A, B\in\mathbf{H}_n, :\operatorname{Tr}\, e^{A+B}\leq \operatorname{Tr}\, e^A e^B. This inequality can be generalized to three operators: for non-negative operators A, B, C\in\mathbf{H}_n^+, :\operatorname{Tr}\, e^{\ln A - \ln B + \ln C}\leq \int_0^\infty \operatorname{Tr}\, A(B+t)^{-1}C(B+t)^{-1}\,\operatorname{d}t.
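The two-operator inequality is easy to test numerically (an added sketch, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
for _ in range(100):
    X, Y = rng.standard_normal((2, 4, 4))
    A, B = (X + X.T) / 2, (Y + Y.T) / 2    # random Hermitian matrices
    lhs = np.trace(expm(A + B))
    rhs = np.trace(expm(A) @ expm(B))      # real and positive for Hermitian A, B
    assert lhs <= rhs + 1e-8               # Golden-Thompson
```

Equality holds when A and B commute; for generic random matrices the gap is strictly positive.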


Peierls–Bogoliubov inequality

Let R, F\in \mathbf{H}_n be such that \operatorname{Tr}\, e^R = 1. Defining g = \operatorname{Tr}\, Fe^R, we have :\operatorname{Tr}\, e^F e^R \geq \operatorname{Tr}\, e^{F+R}\geq e^g. The proof of this inequality follows from the above combined with Klein's inequality: take f(x)=e^x, A=F+R and B=R. [D. Ruelle, Statistical Mechanics: Rigorous Results, World Scientific (1969).]
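Both links of the chain can be verified numerically; the normalization Tr e^R = 1 is imposed by shifting a random Hermitian matrix by a multiple of the identity (an added sketch, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
for _ in range(100):
    X, Y = rng.standard_normal((2, 4, 4))
    F = (Y + Y.T) / 2
    R0 = (X + X.T) / 2
    R = R0 - np.log(np.trace(expm(R0))) * np.eye(4)   # shift so that Tr e^R = 1
    g = np.trace(F @ expm(R))
    # Tr e^F e^R >= Tr e^{F+R} >= e^g
    assert np.trace(expm(F) @ expm(R)) >= np.trace(expm(F + R)) - 1e-9
    assert np.trace(expm(F + R)) >= np.exp(g) - 1e-9
```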


Gibbs variational principle

Let H be a self-adjoint operator such that e^{-H} is trace class. Then for any \gamma\geq 0 with \operatorname{Tr}\,\gamma=1, :\operatorname{Tr}\,\gamma H+\operatorname{Tr}\,\gamma\ln\gamma\geq -\ln \operatorname{Tr}\, e^{-H}, with equality if and only if \gamma=\exp(-H)/\operatorname{Tr} \exp(-H).
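In the matrix case the principle says the Gibbs state minimizes the free-energy functional; both the inequality and the equality case can be checked numerically (an added sketch, assuming NumPy and SciPy; `gibbs_functional` is an illustrative name):

```python
import numpy as np
from scipy.linalg import expm

def herm_fun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

rng = np.random.default_rng(6)
X = rng.standard_normal((4, 4))
H = (X + X.T) / 2
free_energy = -np.log(np.trace(expm(-H)))

def gibbs_functional(gamma):
    # Tr(gamma H) + Tr(gamma ln gamma)
    return np.trace(gamma @ H) + np.trace(herm_fun(gamma, lambda t: t * np.log(t)))

# Any density matrix gives a value above -ln Tr e^{-H} ...
for _ in range(100):
    Y = rng.standard_normal((4, 4))
    G = Y @ Y.T + 0.01 * np.eye(4)
    gamma = G / np.trace(G)            # positive definite, trace one
    assert gibbs_functional(gamma) >= free_energy - 1e-9

# ... and the Gibbs state attains it.
gibbs = expm(-H) / np.trace(expm(-H))
assert abs(gibbs_functional(gibbs) - free_energy) < 1e-9
```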


Lieb's concavity theorem

The following theorem was proved by E. H. Lieb. It proves and generalizes a conjecture of E. P. Wigner, M. M. Yanase and F. J. Dyson. Six years later other proofs were given by T. Ando and B. Simon, and several more have been given since then. For all m\times n matrices K, and all q and r such that 0 \leq q\leq 1 and 0\leq r \leq 1, with q + r \leq 1, the real valued map on \mathbf{H}^+_m \times \mathbf{H}^+_n given by : F(A,B,K) = \operatorname{Tr}(K^*A^qKB^r) * is jointly concave in (A,B), * is convex in K. Here K^* stands for the adjoint of K.


Lieb's theorem

For a fixed Hermitian matrix L\in\mathbf{H}_n, the function : f(A)=\operatorname{Tr} \exp\{L+\ln A\} is concave on \mathbf{H}_n^{++}. The theorem and proof are due to E. H. Lieb, who obtained it as a corollary of Lieb's concavity theorem. The most direct proof is due to H. Epstein; see M. B. Ruskai's papers for a review of this argument.


Ando's convexity theorem

T. Ando's proof of Lieb's concavity theorem led to the following significant complement to it: For all m \times n matrices K, and all 1 \leq q \leq 2 and 0 \leq r \leq 1 with q-r \geq 1, the real valued map on \mathbf{H}^{++}_m \times \mathbf{H}^{++}_n given by : (A,B) \mapsto \operatorname{Tr}(K^*A^qKB^{-r}) is convex.


Joint convexity of relative entropy

For two operators A, B\in\mathbf{H}^{++}_n define the following map : R(A\parallel B):= \operatorname{Tr}(A\log A) - \operatorname{Tr}(A\log B). For density matrices \rho and \sigma, the map R(\rho\parallel\sigma)=S(\rho\parallel\sigma) is Umegaki's quantum relative entropy. Note that the non-negativity of R(A\parallel B) follows from Klein's inequality with f(t)=t\log t.


Statement

The map R(A\parallel B): \mathbf{H}^{++}_n \times \mathbf{H}^{++}_n \rightarrow \mathbb{R} is jointly convex.


Proof

For all 0 < p < 1, (A,B) \mapsto \operatorname{Tr}(B^{1-p}A^p) is jointly concave, by Lieb's concavity theorem, and thus :(A,B)\mapsto \frac{1}{p-1}\left(\operatorname{Tr}(B^{1-p}A^p)-\operatorname{Tr}\,A\right) is convex. But :\lim_{p\to 1}\frac{1}{p-1}\left(\operatorname{Tr}(B^{1-p}A^p)-\operatorname{Tr}\,A\right)=R(A\parallel B), and convexity is preserved in the limit. The proof is due to G. Lindblad.
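Joint convexity of R can be checked directly on random positive-definite pairs (an added sketch, assuming NumPy; `rel_ent` and `herm_fun` are illustrative helper names):

```python
import numpy as np

def herm_fun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def rel_ent(A, B):
    """R(A||B) = Tr(A log A) - Tr(A log B) for positive definite A, B."""
    return (np.trace(herm_fun(A, lambda t: t * np.log(t)))
            - np.trace(A @ herm_fun(B, np.log)))

rng = np.random.default_rng(7)
for _ in range(100):
    A1, A2, B1, B2 = [m @ m.T + 0.05 * np.eye(3)
                      for m in rng.standard_normal((4, 3, 3))]
    lam = rng.uniform()
    lhs = rel_ent(lam * A1 + (1 - lam) * A2, lam * B1 + (1 - lam) * B2)
    rhs = lam * rel_ent(A1, B1) + (1 - lam) * rel_ent(A2, B2)
    assert lhs <= rhs + 1e-9           # joint convexity
```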


Jensen's operator and trace inequalities

The operator version of Jensen's inequality is due to C. Davis. [C. Davis, A Schwarz inequality for convex operator functions, Proc. Amer. Math. Soc. 8, 42–44 (1957).] A continuous, real function f on an interval I satisfies Jensen's operator inequality if the following holds : f\left(\sum_kA_k^*X_kA_k\right)\leq\sum_k A_k^*f(X_k)A_k, for operators \{A_k\}_k with \sum_k A^*_kA_k=1 and for self-adjoint operators \{X_k\}_k with spectrum in I. See the references cited above for proofs of the following two theorems.


Jensen's trace inequality

Let f be a continuous function defined on an interval I and let m and n be natural numbers. If f is convex, we then have the inequality : \operatorname{Tr}\Bigl(f\Bigl(\sum_{k=1}^nA_k^*X_kA_k\Bigr)\Bigr)\leq \operatorname{Tr}\Bigl(\sum_{k=1}^n A_k^*f(X_k)A_k\Bigr), for all (X_1, \ldots , X_n) self-adjoint m \times m matrices with spectra contained in I and all (A_1, \ldots , A_n) of m \times m matrices with :\sum_{k=1}^nA_k^*A_k=1. Conversely, if the above inequality is satisfied for some n and m, where n > 1, then f is convex.
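Note that ordinary convexity of f suffices here, in contrast to the operator version below. The following sketch (an added illustration, assuming NumPy) builds a pair A_1, A_2 with A_1^*A_1 + A_2^*A_2 = 1 and tests the inequality with the convex, but not operator convex, function f(t) = t^4:

```python
import numpy as np

def herm_fun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

rng = np.random.default_rng(8)
f = lambda t: t**4                      # convex, though not operator convex
m = 4
for _ in range(100):
    M = rng.standard_normal((m, m))
    A1 = 0.4 * M / np.linalg.norm(M, 2)              # spectral norm < 1
    A2 = herm_fun(np.eye(m) - A1.T @ A1, np.sqrt)    # so A1*A1 + A2*A2 = 1
    X1, X2 = rng.standard_normal((2, m, m))
    X1, X2 = (X1 + X1.T) / 2, (X2 + X2.T) / 2        # self-adjoint
    S = A1.T @ X1 @ A1 + A2.T @ X2 @ A2
    lhs = np.trace(herm_fun(S, f))
    rhs = np.trace(A1.T @ herm_fun(X1, f) @ A1 + A2.T @ herm_fun(X2, f) @ A2)
    assert lhs <= rhs + 1e-9            # Jensen's trace inequality
```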


Jensen's operator inequality

For a continuous function f defined on an interval I the following conditions are equivalent:
* f is operator convex.
* For each natural number n we have the inequality : f\Bigl(\sum_{k=1}^nA_k^*X_kA_k\Bigr)\leq\sum_{k=1}^n A_k^*f(X_k)A_k, for all (X_1, \ldots , X_n) bounded, self-adjoint operators on an arbitrary Hilbert space \mathcal{H} with spectra contained in I and all (A_1, \ldots , A_n) bounded operators on \mathcal{H} with \sum_{k=1}^n A^*_kA_k=1.
* f(V^*XV) \leq V^*f(X)V for each isometry V on an infinite-dimensional Hilbert space \mathcal{H} and every self-adjoint operator X with spectrum in I.
* Pf(PXP + \lambda(1 -P))P \leq Pf(X)P for each projection P on an infinite-dimensional Hilbert space \mathcal{H}, every self-adjoint operator X with spectrum in I and every \lambda in I.


Araki–Lieb–Thirring inequality

E. H. Lieb and W. E. Thirring proved the following inequality in 1976: For any A\geq 0, B\geq 0 and r\geq 1, :\operatorname{Tr}\,((BAB)^r)\leq \operatorname{Tr}\,(B^rA^rB^r). In 1990 H. Araki generalized the above inequality to the following one: For any A\geq 0, B\geq 0 and q\geq 0, :\operatorname{Tr}((BAB)^{rq})\leq \operatorname{Tr}((B^rA^rB^r)^q), for r\geq 1, and :\operatorname{Tr}((B^rA^rB^r)^q)\leq \operatorname{Tr}((BAB)^{rq}), for 0\leq r\leq 1. There are several other inequalities close to the Lieb–Thirring inequality, such as the following: for any A\geq 0, B\geq 0 and \alpha \in [0,1], :\operatorname{Tr}\,(B A^\alpha B\, B A^{1-\alpha} B) \leq \operatorname{Tr}\,(B^2AB^2), and even more generally: for any A\geq 0, B\geq 0, r\geq 1/2 and c \geq 0, :\operatorname{Tr}((B A B^{2c} A B)^r) \leq \operatorname{Tr}((B^{c+1}A^2B^{c+1})^r). The latter inequality generalizes the previous one, as can be seen by exchanging A by B^2 and B by A^{1/(2c+2)} with \alpha = 2c/(2c+2) and using the cyclicity of the trace, leading to :\operatorname{Tr}((B A^{\alpha} B\, B A^{1-\alpha} B)^r) \leq \operatorname{Tr}((B^2 A B^2)^r).
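The original 1976 inequality can be tested numerically on random positive semi-definite matrices (an added sketch, assuming NumPy; `psd_power` is an illustrative helper):

```python
import numpy as np

def psd_power(A, p):
    """A^p for positive semi-definite Hermitian A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.clip(w, 0.0, None)**p) @ V.conj().T

rng = np.random.default_rng(9)
for _ in range(100):
    X, Y = rng.standard_normal((2, 4, 4))
    A, B = X @ X.T, Y @ Y.T            # A >= 0, B >= 0; note BAB is also PSD
    r = rng.uniform(1.0, 3.0)          # any r >= 1
    lhs = np.trace(psd_power(B @ A @ B, r))
    rhs = np.trace(psd_power(B, r) @ psd_power(A, r) @ psd_power(B, r))
    # Lieb-Thirring: Tr((BAB)^r) <= Tr(B^r A^r B^r), up to rounding
    assert lhs <= rhs * (1 + 1e-9) + 1e-6
```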


Effros's theorem and its extension

E. Effros proved the following theorem. If f(x) is an operator convex function, and L and R are commuting bounded linear operators, i.e. the commutator [L,R]=LR-RL=0, then the ''perspective'' :g(L, R):=f(LR^{-1})R is jointly convex, i.e. if L=\lambda L_1+(1-\lambda)L_2 and R=\lambda R_1+(1-\lambda)R_2 with [L_i, R_i]=0 (i=1,2) and 0\leq\lambda\leq 1, then :g(L,R)\leq \lambda g(L_1,R_1)+(1-\lambda)g(L_2,R_2). Ebadian et al. later extended the inequality to the case where L and R do not commute.


Von Neumann's trace inequality and related results

Von Neumann's trace inequality, named after its originator John von Neumann, states that for any n \times n complex matrices A, B with singular values \alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n and \beta_1 \ge \beta_2 \ge \cdots \ge \beta_n respectively, :\left| \operatorname{Tr} (AB) \right| \le \sum_{i=1}^n \alpha_i \beta_i\,, with equality if and only if A and B share singular vectors. A simple corollary to this is the following result: For Hermitian n \times n positive semidefinite complex matrices A, B, where now the eigenvalues are sorted decreasingly (a_1 \ge a_2 \ge \cdots \ge a_n and b_1 \ge b_2 \ge \cdots \ge b_n, respectively), : \sum_{i=1}^n a_i b_{n-i+1}\leq \operatorname{Tr}(AB)\leq \sum_{i=1}^n a_i b_i\,.
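Both the inequality and its corollary are straightforward to verify numerically (an added sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(10)
for _ in range(200):
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    alpha = np.linalg.svd(A, compute_uv=False)   # singular values, decreasing
    beta = np.linalg.svd(B, compute_uv=False)
    assert abs(np.trace(A @ B)) <= alpha @ beta + 1e-9

    # Corollary for positive semidefinite P, Q with decreasing eigenvalues a, b:
    # sum a_i b_{n-i+1} <= Tr(PQ) <= sum a_i b_i
    P, Q = A @ A.conj().T, B @ B.conj().T
    a = np.linalg.eigvalsh(P)[::-1]              # eigvalsh is ascending; reverse
    b = np.linalg.eigvalsh(Q)[::-1]
    t = np.trace(P @ Q).real
    assert a @ b[::-1] - 1e-8 <= t <= a @ b + 1e-8
```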




References

* E. Carlen, Trace Inequalities and Quantum Entropy: An Introductory Course, Contemp. Math. 529 (2010) 73–140.
* B. Simon, Trace Ideals and their Applications, Cambridge Univ. Press (1979); second edition, Amer. Math. Soc., Providence, RI (2005).
* D. Ruelle, Statistical Mechanics: Rigorous Results, World Scientific (1969).
* C. Davis, A Schwarz inequality for convex operator functions, Proc. Amer. Math. Soc. 8, 42–44 (1957).