Derivation Of The Conjugate Gradient Method
   HOME
*





Derivation Of The Conjugate Gradient Method
In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system :\boldsymbol=\boldsymbol where \boldsymbol is symmetric positive-definite. The conjugate gradient method can be derived from several different perspectives, including specialization of the conjugate direction methodConjugate Direction Methods http://user.it.uu.se/~matsh/opt/f8/node5.html for optimization, and variation of the Arnoldi/ Lanczos iteration for eigenvalue problems. The intent of this article is to document the important steps in these derivations. Derivation from the conjugate direction method The conjugate gradient method can be seen as a special case of the conjugate direction method applied to minimization of the quadratic function :f(\boldsymbol)=\boldsymbol^\mathrm\boldsymbol\boldsymbol-2\boldsymbol^\mathrm\boldsymbol\text The conjugate direction method In the conjugate direction method for minimizing :f(\boldsymbol)=\boldsymbol^ ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Numerical Linear Algebra
Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematics. It is a subfield of numerical analysis, and a type of linear algebra. Computers use floating-point arithmetic and cannot exactly represent irrational data, so when a computer algorithm is applied to a matrix of data, it can sometimes increase the difference between a number stored in the computer and the true number that it is an approximation of. Numerical linear algebra uses properties of vectors and matrices to develop computer algorithms that minimize the error introduced by the computer, and is also concerned with ensuring that the algorithm is as efficient as possible. Numerical linear algebra aims to solve problems of continuous mathematics using finite precision computers, so its applications to the natural and social sciences ar ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Gaussian Elimination
In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of operations performed on the corresponding matrix of coefficients. This method can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix. The method is named after Carl Friedrich Gauss (1777–1855) although some special cases of the method—albeit presented without proof—were known to Chinese mathematicians as early as circa 179 AD. To perform row reduction on a matrix, one uses a sequence of elementary row operations to modify the matrix until the lower left-hand corner of the matrix is filled with zeros, as much as possible. There are three types of elementary row operations: * Swapping two rows, * Multiplying a row by a nonzero number, * Adding a multiple of one row to another row. (subtraction can be achieved by multiplying one row with -1 and adding ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Optimization Algorithms And Methods
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries. In the more general approach, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations constitutes a large area of applied mathematics. More generally, optimization includes finding "best available" values of some objective function given a define ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Numerical Linear Algebra
Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematics. It is a subfield of numerical analysis, and a type of linear algebra. Computers use floating-point arithmetic and cannot exactly represent irrational data, so when a computer algorithm is applied to a matrix of data, it can sometimes increase the difference between a number stored in the computer and the true number that it is an approximation of. Numerical linear algebra uses properties of vectors and matrices to develop computer algorithms that minimize the error introduced by the computer, and is also concerned with ensuring that the algorithm is as efficient as possible. Numerical linear algebra aims to solve problems of continuous mathematics using finite precision computers, so its applications to the natural and social sciences ar ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Partial Pivoting
The pivot or pivot element is the element of a matrix, or an array, which is selected first by an algorithm (e.g. Gaussian elimination, simplex algorithm, etc.), to do certain calculations. In the case of matrix algorithms, a pivot entry is usually required to be at least distinct from zero, and often distant from it; in this case finding this element is called pivoting. Pivoting may be followed by an interchange of rows or columns to bring the pivot to a fixed position and allow the algorithm to proceed successfully, and possibly to reduce round-off error. It is often used for verifying row echelon form. Pivoting might be thought of as swapping or sorting rows or columns in a matrix, and thus it can be represented as multiplication by permutation matrices. However, algorithms rarely move the matrix elements because this would cost too much time; instead, they just keep track of the permutations. Overall, pivoting adds more operations to the computational cost of an algorithm. T ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

LU Factorization
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form of Gaussian elimination. Computers usually solve square systems of linear equations using LU decomposition, and it is also a key step when inverting a matrix or computing the determinant of a matrix. The LU decomposition was introduced by the Polish mathematician Tadeusz Banachiewicz in 1938. Definitions Let ''A'' be a square matrix. An LU factorization refers to the factorization of ''A'', with proper row and/or column orderings or permutations, into two factors – a lower triangular matrix ''L'' and an upper triangular matrix ''U'': : A = LU. In the lower triangular matrix all elements above the diagonal are zero, in the upper triangular matrix, all the el ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Upper Hessenberg Matrix
In linear algebra, a Hessenberg matrix is a special kind of square matrix, one that is "almost" triangular. To be exact, an upper Hessenberg matrix has zero entries below the first subdiagonal, and a lower Hessenberg matrix has zero entries above the first superdiagonal. They are named after Karl Hessenberg. Definitions Upper Hessenberg matrix A square n \times n matrix A is said to be in upper Hessenberg form or to be an upper Hessenberg matrix if a_=0 for all i,j with i > j+1. An upper Hessenberg matrix is called unreduced if all subdiagonal entries are nonzero, i.e. if a_ \neq 0 for all i \in \. Lower Hessenberg matrix A square n \times n matrix A is said to be in lower Hessenberg form or to be a lower Hessenberg matrix if its transpose is an upper Hessenberg matrix or equivalently if a_=0 for all i,j with j > i+1. A lower Hessenberg matrix is called unreduced if all superdiagonal entries are nonzero, i.e. if a_ \neq 0 for all i \in \. Examples Consider the following matri ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Krylov Subspace
In linear algebra, the order-''r'' Krylov subspace generated by an ''n''-by-''n'' matrix ''A'' and a vector ''b'' of dimension ''n'' is the linear subspace spanned by the images of ''b'' under the first ''r'' powers of ''A'' (starting from A^0=I), that is, :\mathcal_r(A,b) = \operatorname \, \. Background The concept is named after Russian applied mathematician and naval engineer Alexei Krylov, who published a paper about it in 1931. Properties * \mathcal_r(A,b),A\mathcal_r(A,b)\subset \mathcal_(A,b). * Vectors \ are linearly independent until r, where p(A) is the minimal polynomial of A. Furthermore, there exists a b such that r_0 = \deg (A)/math>. * \mathcal_r(A,b) is a cyclic submodule generated by b of the torsion k /math>-module (k^n)^A, where k^n is the linear space on k. * k^n can be decomposed as the direct sum of Krylov subspaces. Use Krylov subspaces are used in algorithms for finding approximate solutions to high-dimensional linear algebra problems. Many linear dyn ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Orthonormal
In linear algebra, two vectors in an inner product space are orthonormal if they are orthogonal (or perpendicular along a line) unit vectors. A set of vectors form an orthonormal set if all vectors in the set are mutually orthogonal and all of unit length. An orthonormal set which forms a basis is called an orthonormal basis. Intuitive overview The construction of orthogonality of vectors is motivated by a desire to extend the intuitive notion of perpendicular vectors to higher-dimensional spaces. In the Cartesian plane, two vectors are said to be ''perpendicular'' if the angle between them is 90° (i.e. if they form a right angle). This definition can be formalized in Cartesian space by defining the dot product and specifying that two vectors in the plane are orthogonal if their dot product is zero. Similarly, the construction of the norm of a vector is motivated by a desire to extend the intuitive notion of the length of a vector to higher-dimensional spaces. In Cartesian ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Eigenvalue
In linear algebra, an eigenvector () or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted by \lambda, is the factor by which the eigenvector is scaled. Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched. If the eigenvalue is negative, the direction is reversed. Loosely speaking, in a multidimensional vector space, the eigenvector is not rotated. Formal definition If is a linear transformation from a vector space over a field into itself and is a nonzero vector in , then is an eigenvector of if is a scalar multiple of . This can be written as T(\mathbf) = \lambda \mathbf, where is a scalar in , known as the eigenvalue, characteristic value, or characteristic root ass ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Conjugate Gradient Method
In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-definite. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems. The conjugate gradient method can also be used to solve unconstrained optimization problems such as energy minimization. It is commonly attributed to Magnus Hestenes and Eduard Stiefel, who programmed it on the Z4, and extensively researched it. The biconjugate gradient method provides a generalization to non-symmetric matrices. Various nonlinear conjugate gradient methods seek minima of nonlinear optimization problems. Description of the problem addressed by co ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]