HOME

TheInfoList



OR:

The ensemble Kalman filter (EnKF) is a
recursive filter In signal processing, a recursive filter is a type of filter which re-uses one or more of its outputs as an input. This feedback typically results in an unending impulse response (commonly referred to as ''infinite impulse response'' (IIR)), chara ...
suitable for problems with a large number of variables, such as
discretization In applied mathematics, discretization is the process of transferring continuous functions, models, variables, and equations into discrete counterparts. This process is usually carried out as a first step toward making them suitable for numerical ...
s of
partial differential equation In mathematics, a partial differential equation (PDE) is an equation which imposes relations between the various partial derivatives of a Multivariable calculus, multivariable function. The function is often thought of as an "unknown" to be sol ...
s in geophysical models. The EnKF originated as a version of the
Kalman filter For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimat ...
for large problems (essentially, the
covariance matrix In probability theory and statistics, a covariance matrix (also known as auto-covariance matrix, dispersion matrix, variance matrix, or variance–covariance matrix) is a square matrix giving the covariance between each pair of elements of ...
is replaced by the sample covariance), and it is now an important
data assimilation Data assimilation is a mathematical discipline that seeks to optimally combine theory (usually in the form of a numerical model) with observations. There may be a number of different goals sought – for example, to determine the optimal state es ...
component of
ensemble forecasting Ensemble forecasting is a method used in or within numerical weather prediction. Instead of making a single forecast of the most likely weather, a set (or ensemble) of forecasts is produced. This set of forecasts aims to give an indication of the ...
. EnKF is related to the
particle filter Particle filters, or sequential Monte Carlo methods, are a set of Monte Carlo algorithms used to solve filtering problems arising in signal processing and Bayesian statistical inference. The filtering problem consists of estimating the i ...
(in this context, a particle is the same thing as an ensemble member) but the EnKF makes the assumption that all probability distributions involved are
Gaussian Carl Friedrich Gauss (1777–1855) is the eponym of all of the topics listed below. There are over 100 topics all named after this German mathematician and scientist, all in the fields of mathematics, physics, and astronomy. The English eponymo ...
; when it is applicable, it is much more efficient than the
particle filter Particle filters, or sequential Monte Carlo methods, are a set of Monte Carlo algorithms used to solve filtering problems arising in signal processing and Bayesian statistical inference. The filtering problem consists of estimating the i ...
.


Introduction

The ensemble Kalman filter (EnKF) is a
Monte Carlo Monte Carlo (; ; french: Monte-Carlo , or colloquially ''Monte-Carl'' ; lij, Munte Carlu ; ) is officially an administrative area of the Principality of Monaco, specifically the ward of Monte Carlo/Spélugues, where the Monte Carlo Casino is ...
implementation of the Bayesian update problem: given a
probability density function In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can ...
(PDF) of the state of the modeled system (the ''
prior Prior (or prioress) is an ecclesiastical title for a superior in some religious orders. The word is derived from the Latin for "earlier" or "first". Its earlier generic usage referred to any monastic superior. In abbeys, a prior would be l ...
'', called often the forecast in geosciences) and the data likelihood, Bayes' theorem is used to obtain the PDF after the data likelihood has been taken into account (the '' posterior'', often called the analysis). This is called a Bayesian update. The Bayesian update is combined with advancing the model in time, incorporating new data from time to time. The original
Kalman filter For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimat ...
, introduced in 1960, assumes that all PDFs are
Gaussian Carl Friedrich Gauss (1777–1855) is the eponym of all of the topics listed below. There are over 100 topics all named after this German mathematician and scientist, all in the fields of mathematics, physics, and astronomy. The English eponymo ...
(the Gaussian assumption) and provides algebraic formulas for the change of the
mean There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value (magnitude and sign) of a given data set. For a data set, the ''arithme ...
and the
covariance matrix In probability theory and statistics, a covariance matrix (also known as auto-covariance matrix, dispersion matrix, variance matrix, or variance–covariance matrix) is a square matrix giving the covariance between each pair of elements of ...
by the Bayesian update, as well as a formula for advancing the mean and covariance in time provided the system is linear. However, maintaining the covariance matrix is not feasible computationally for high-dimensional systems. For this reason, EnKFs were developed. EnKFs represent the distribution of the system state using a collection of state vectors, called an
ensemble Ensemble may refer to: Art * Architectural ensemble * ''Ensemble'' (album), Kendji Girac 2015 album * Ensemble (band), a project of Olivier Alary * Ensemble cast (drama, comedy) * Ensemble (musical theatre), also known as the chorus * ''En ...
, and replace the covariance matrix by the sample covariance computed from the ensemble. The ensemble is operated with as if it were a
random sample In statistics, quality assurance, and survey methodology, sampling is the selection of a subset (a statistical sample) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians atte ...
, but the ensemble members are really not
independent Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s * Independ ...
, as they all share the EnKF. One advantage of EnKFs is that advancing the PDF in time is achieved by simply advancing each member of the ensemble.For a survey of EnKF and related data assimilation techniques, see


Derivation


Kalman filter

Let \mathbf denote the n-dimensional state vector of a model, and assume that it has Gaussian probability distribution with mean \mathbf and covariance Q, i.e., its PDF is : p(\mathbf)\propto\exp\left( -\frac(\mathbf-\mathbf)^Q^(\mathbf-\mathbf)\right) . Here and below, \propto means proportional; a PDF is always scaled so that its integral over the whole space is one. This p(\mathbf), called the ''
prior Prior (or prioress) is an ecclesiastical title for a superior in some religious orders. The word is derived from the Latin for "earlier" or "first". Its earlier generic usage referred to any monastic superior. In abbeys, a prior would be l ...
'', was evolved in time by running the model and now is to be updated to account for new data. It is natural to assume that the error distribution of the data is known; data have to come with an error estimate, otherwise they are meaningless. Here, the data \mathbf is assumed to have Gaussian PDF with covariance R and mean H\mathbf, where H is the so-called observation matrix. The covariance matrix R describes the estimate of the error of the data; if the random errors in the entries of the data vector \mathbf are independent, R is diagonal and its diagonal entries are the squares of the
standard deviation In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while ...
(“error size”) of the error of the corresponding entries of the data vector \mathbf. The value H\mathbf is what the value of the data would be for the state \mathbf in the absence of data errors. Then the probability density p(\mathbf, \mathbf) of the data \mathbf conditional of the system state \mathbf, called the data likelihood, is : p\left( \mathbf, \mathbf\right) \propto\exp\left( -\frac(\mathbf-H\mathbf)^R^(\mathbf-H\mathbf)\right) . The PDF of the state and the data likelihood are combined to give the new probability density of the system state \mathbf conditional on the value of the data \mathbf (the '' posterior'') by the
Bayes theorem In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For examp ...
, : p\left( \mathbf, \mathbf\right) \propto p\left( \mathbf, \mathbf\right) p(\mathbf). The data \mathbf is fixed once it is received, so denote the posterior state by \mathbf instead of \mathbf, \mathbf and the posterior PDF by p\left( \mathbf\right) . It can be shown by algebraic manipulations that the posterior PDF is also Gaussian, : p\left( \mathbf\right) \propto\exp\left( -\frac(\mathbf-\mathbf)^\hat^(\mathbf-\mathbf)\right) , with the posterior mean \mathbf and covariance \hat given by the Kalman update formulas : \mathbf=\mathbf+K\left( \mathbf-H\mathbf\right) ,\quad\hat=\left( I-KH\right) Q, where : K=QH^\left( HQH^+R\right) ^ is the so-called Kalman gain matrix.


Ensemble Kalman Filter

The EnKF is a Monte Carlo approximation of the Kalman filter, which avoids evolving the covariance matrix of the PDF of the state vector \mathbf. Instead, the PDF is represented by an ensemble : X=\left \mathbf_,\ldots,\mathbf_\right =\left \mathbf_\right X is an n\times N matrix whose columns are the ensemble members, and it is called the ''prior ensemble''. Ideally, ensemble members would form a
sample Sample or samples may refer to: Base meaning * Sample (statistics), a subset of a population – complete data set * Sample (signal), a digital discrete sample of a continuous analog signal * Sample (material), a specimen or small quantity of s ...
from the prior distribution. However, the ensemble members are not in general
independent Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s * Independ ...
except in the initial ensemble, since every EnKF step ties them together. They are deemed to be approximately independent, and all calculations proceed as if they actually were independent. Replicate the data \mathbf into an m\times N matrix : D=\left \mathbf_,\ldots,\mathbf_\right =\left \mathbf_\right \quad \mathbf_=\mathbf+\mathbf, \quad \mathbf \sim N(0,R), so that each column \mathbf_ consists of the data vector \mathbf plus a random vector from the m-dimensional normal distribution N(0,R). If, in addition, the columns of X are a sample from the
prior probability In Bayesian statistical inference, a prior probability distribution, often simply called the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into ...
distribution, then the columns of : \hat=X+K(D-HX) form a sample from the
posterior probability The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posterior p ...
distribution. To see this in the scalar case with H=1: Let x_i = \mu + \xi_i, \; \xi_i \sim N(0, \sigma_x^2), and d_i = d + \epsilon_i, \; \epsilon_i \sim N(0, \sigma_d^2). Then :\hat_i = \left(\frac \mu + \frac d \right)+ \left(\frac \xi_i + \frac \epsilon_i \right) . The first sum is the posterior mean, and the second sum, in view of the independence, has a variance :\left(\frac\right)^2 \sigma_x^2 + \left(\frac\right)^2 \sigma_d^2 = \frac, which is the posterior variance. The EnKF is now obtained simply by replacing the state covariance Q in Kalman gain matrix K by the sample covariance C computed from the ensemble members (called the ''ensemble covariance''), that is: K=CH^\left( HCH^+R\right) ^


Implementation


Basic formulation

Here we follow. Suppose the ensemble matrix X and the data matrix D are as above. The ensemble mean and the covariance are : E\left( X\right) =\frac\sum_^\mathbf_,\quad C=\frac, where : A=X-E\left( X\right) \mathbf_ =X-\frac\left( X\mathbf_\right) \mathbf_, and \mathbf denotes the matrix of all ones of the indicated size. The posterior ensemble X^ is then given by : X^=X+CH^\left( HCH^+R\right) ^(D-HX), where the perturbed data matrix D is as above. Note that since R is a covariance matrix, it is always positive semidefinite and usually
positive definite In mathematics, positive definiteness is a property of any object to which a bilinear form or a sesquilinear form may be naturally associated, which is positive-definite. See, in particular: * Positive-definite bilinear form * Positive-definite f ...
, so the inverse above exists and the formula can be implemented by the
Cholesky decomposition In linear algebra, the Cholesky decomposition or Cholesky factorization (pronounced ) is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for effici ...
. In, R is replaced by the sample covariance \tilde \tilde^/\left( N-1\right) where \tilde = D - \frac d \, \mathbf_and the inverse is replaced by a
pseudoinverse In mathematics, and in particular, algebra, a generalized inverse (or, g-inverse) of an element ''x'' is an element ''y'' that has some properties of an inverse element but not necessarily all of them. The purpose of constructing a generalized inv ...
, computed using the
singular-value decomposition In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix. It generalizes the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any \ m \times n\ matrix. It is relate ...
(SVD) . Since these formulas are matrix operations with dominant Level 3 operations, they are suitable for efficient implementation using software packages such as
LAPACK LAPACK ("Linear Algebra Package") is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It als ...
(on serial and
shared memory In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between progr ...
computers) and
ScaLAPACK The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interproces ...
(on
distributed memory In computer science, distributed memory refers to a multiprocessor computer system in which each processor has its own private memory. Computational tasks can only operate on local data, and if remote data are required, the computational task mu ...
computers). Instead of computing the inverse of a matrix and multiplying by it, it is much better (several times cheaper and also more accurate) to compute the
Cholesky decomposition In linear algebra, the Cholesky decomposition or Cholesky factorization (pronounced ) is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for effici ...
of the matrix and treat the multiplication by the inverse as solution of a linear system with many simultaneous right-hand sides.


Observation matrix-free implementation

Since we have replaced the covariance matrix with ensemble covariance, this leads to a simpler formula where ensemble observations are directly used without explicitly specifying the matrix H. More specifically, define a function h(\mathbf) of the form : h(\mathbf)=H\mathbf. The function h is called the '' observation function'' or, in the
inverse problem An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the ...
s context, the '' forward operator''. The value of h(\mathbf) is what the value of the data would be for the state \mathbf assuming the measurement is exact. Then the posterior ensemble can be rewritten as : X^=X+\fracA\left( HA\right) ^P^(D-HX) where : HA=HX-\frac\left( \left( HX\right) \mathbf_\right) \mathbf_, and : P=\fracHA\left( HA\right) ^+R, with :\left HA\right _ =H\mathbf_-H\frac\sum_^\mathbf_\ =h\left( \mathbf_\right) -\frac\sum_^h\left( \mathbf_\right) . Consequently, the ensemble update can be computed by evaluating the observation function h on each ensemble member once and the matrix H does not need to be known explicitly. This formula holds also for an observation function h(\mathbf)=H\mathbf with a fixed offset \mathbf, which also does not need to be known explicitly. The above formula has been commonly used for a nonlinear observation function h, such as the position of a
hurricane A tropical cyclone is a rapidly rotating storm system characterized by a low-pressure center, a closed low-level atmospheric circulation, strong winds, and a spiral arrangement of thunderstorms that produce heavy rain and squalls. Depend ...
vortex In fluid dynamics, a vortex ( : vortices or vortexes) is a region in a fluid in which the flow revolves around an axis line, which may be straight or curved. Vortices form in stirred fluids, and may be observed in smoke rings, whirlpools in th ...
. In that case, the observation function is essentially approximated by a linear function from its values at ensemble members.


Implementation for a large number of data points

For a large number m of data points, the multiplication by P^ becomes a bottleneck. The following alternative formula is advantageous when the number of data points m is large (such as when assimilating gridded or pixel data) and the data error
covariance matrix In probability theory and statistics, a covariance matrix (also known as auto-covariance matrix, dispersion matrix, variance matrix, or variance–covariance matrix) is a square matrix giving the covariance between each pair of elements of ...
R is diagonal (which is the case when the data errors are uncorrelated), or cheap to decompose (such as banded due to limited covariance distance). Using the Sherman–Morrison–Woodbury formula : (R+UV^)^=R^-R^U(I+V^R^U)^V^R^, with : U=\fracHA,\quad V=HA, gives :\begin P^ & =\left( R+\fracHA\left( HA\right) ^\right) ^\ = \\ & =R^\left I-\frac\left( HA\right) \left( I+\left( HA\right) ^R^\frac\left( HA\right) \right) ^\left( HA\right) ^R^\right , \end which requires only the solution of systems with the matrix R (assumed to be cheap) and of a system of size N with m right-hand sides. See for operation counts.


Further extensions

The EnKF version described here involves randomization of data. For filters without randomization of data, see. Since the ensemble covariance is
rank deficient In linear algebra, the rank of a matrix is the dimension of the vector space generated (or spanned) by its columns. p. 48, § 1.16 This corresponds to the maximal number of linearly independent columns of . This, in turn, is identical to the dime ...
(there are many more state variables, typically millions, than the ensemble members, typically less than a hundred), it has large terms for pairs of points that are spatially distant. Since in reality the values of physical fields at distant locations are not that much
correlated In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistics ...
, the covariance matrix is tapered off artificially based on the distance, which gives rise to localized EnKF algorithms. These methods modify the covariance matrix used in the computations and, consequently, the posterior ensemble is no longer made only of linear combinations of the prior ensemble. For nonlinear problems, EnKF can create posterior ensemble with non-physical states. This can be alleviated by
regularization Regularization may refer to: * Regularization (linguistics) * Regularization (mathematics) * Regularization (physics) In physics, especially quantum field theory, regularization is a method of modifying observables which have singularities in ...
, such as penalization of states with large spatial
gradient In vector calculus, the gradient of a scalar-valued differentiable function of several variables is the vector field (or vector-valued function) \nabla f whose value at a point p is the "direction and rate of fastest increase". If the gradi ...
s. For problems with coherent features, such as
hurricane A tropical cyclone is a rapidly rotating storm system characterized by a low-pressure center, a closed low-level atmospheric circulation, strong winds, and a spiral arrangement of thunderstorms that produce heavy rain and squalls. Depend ...
s,
thunderstorm A thunderstorm, also known as an electrical storm or a lightning storm, is a storm characterized by the presence of lightning and its acoustic effect on the Earth's atmosphere, known as thunder. Relatively weak thunderstorms are someti ...
s,
fireline A firebreak or double track (also called a fire line, fuel break, fireroad and firetrail in Australia) is a gap in vegetation or other combustible material that acts as a barrier to slow or stop the progress of a bushfire or wildfire. A firebre ...
s,
squall line A squall line, or more accurately a quasi-linear convective system (QLCS), is a line of thunderstorms, often forming along or ahead of a cold front. In the early 20th century, the term was used as a synonym for cold front (which often are accompa ...
s, and rain fronts, there is a need to adjust the numerical model state by deforming the state in space (its grid) as well as by correcting the state amplitudes additively. In 2007, Ravela et al. introduce the joint position-amplitude adjustment model using ensembles, and systematically derive a sequential approximation which can be applied to both EnKF and other formulations. Their method does not make the assumption that amplitudes and position errors are independent or jointly Gaussian, as others do. The morphing EnKF employs intermediate states, obtained by techniques borrowed from
image registration Image registration is the process of transforming different sets of data into one coordinate system. Data may be multiple photographs, data from different sensors, times, depths, or viewpoints. It is used in computer vision, medical imaging, milit ...
and
morphing Morphing is a special effect in motion pictures and animations that changes (or morphs) one image or shape into another through a seamless transition. Traditionally such a depiction would be achieved through dissolving techniques on film. Sinc ...
, instead of linear combinations of states. Formally, EnKFs rely on the Gaussian assumption. In practice they can also be used for nonlinear problems, where the Gaussian assumption may not be satisfied. Related filters attempting to relax the Gaussian assumption in EnKF while preserving its advantages include filters that fit the state PDF with multiple Gaussian kernels, filters that approximate the state PDF by Gaussian mixtures, a variant of the
particle filter Particle filters, or sequential Monte Carlo methods, are a set of Monte Carlo algorithms used to solve filtering problems arising in signal processing and Bayesian statistical inference. The filtering problem consists of estimating the i ...
with computation of particle weights by
density estimation In statistics, probability density estimation or simply density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. The unobservable density function is thought of ...
, and a variant of the particle filter with thick tailed data PDF to alleviate particle filter degeneracy.


See also

*
Data assimilation Data assimilation is a mathematical discipline that seeks to optimally combine theory (usually in the form of a numerical model) with observations. There may be a number of different goals sought – for example, to determine the optimal state es ...
* Numerical weather prediction#Ensembles *
Particle filter Particle filters, or sequential Monte Carlo methods, are a set of Monte Carlo algorithms used to solve filtering problems arising in signal processing and Bayesian statistical inference. The filtering problem consists of estimating the i ...
*
Recursive Bayesian estimation In probability theory, statistics, and machine learning, recursive Bayesian estimation, also known as a Bayes filter, is a general probabilistic approach for estimating an unknown probability density function (PDF) recursively over time using inco ...


References


External links


EnKF webpage

TOPAZ, real-time forecasting of the North Atlantic ocean and Arctic sea-ice with the EnKF

EnKF-C, a compact framework for data assimilation into large-scale layered geophysical models with the EnKF

PDAF
Parallel Data Assimilation Framework – an open-source software for data assimilation providing different variants of the EnKF {{DEFAULTSORT:Ensemble Kalman Filter Linear filters Nonlinear filters Bayesian statistics Signal estimation Monte Carlo methods