In probability theory and statistics, a copula is a multivariate cumulative distribution function for which the marginal probability distribution of each variable is uniform on the interval [0, 1]. Copulas are used to describe or model the dependence (inter-correlation) between random variables. Their name, introduced by applied mathematician Abe Sklar in 1959, comes from the Latin for "link" or "tie", similar but only metaphorically related to grammatical copulas in linguistics. Copulas have been used widely in quantitative finance to model and minimize tail risk and in portfolio-optimization applications.

Sklar's theorem states that any multivariate joint distribution can be written in terms of univariate marginal distribution functions and a copula which describes the dependence structure between the variables.

Copulas are popular in high-dimensional statistical applications as they allow one to easily model and estimate the distribution of random vectors by estimating marginals and copulas separately. There are many parametric copula families available, which usually have parameters that control the strength of dependence. Some popular parametric copula models are outlined below.

Two-dimensional copulas are known in some other areas of mathematics under the names ''permutons'' and ''doubly-stochastic measures''.


Mathematical motivation

Consider a random vector (X_1, X_2, \dots, X_d). Suppose its marginals are continuous, i.e. the marginal CDFs F_i(x) = \Pr[X_i \leq x] are continuous functions. By applying the probability integral transform to each component, the random vector
: (U_1, U_2, \dots, U_d) = \bigl(F_1(X_1), F_2(X_2), \dots, F_d(X_d)\bigr)
has marginals that are uniformly distributed on the interval [0, 1].

The copula of (X_1, X_2, \dots, X_d) is defined as the joint cumulative distribution function of (U_1, U_2, \dots, U_d):
: C(u_1, u_2, \dots, u_d) = \Pr[U_1 \leq u_1, U_2 \leq u_2, \dots, U_d \leq u_d].
The copula contains all information on the dependence structure between the components of (X_1, X_2, \dots, X_d), whereas the marginal cumulative distribution functions F_i contain all information on the marginal distributions of X_i.

The reverse of these steps can be used to generate pseudo-random samples from general classes of multivariate probability distributions. That is, given a procedure to generate a sample (U_1, U_2, \dots, U_d) from the copula function, the required sample can be constructed as
: (X_1, X_2, \dots, X_d) = \bigl(F_1^{-1}(U_1), F_2^{-1}(U_2), \dots, F_d^{-1}(U_d)\bigr).
The generalized inverses F_i^{-1} are unproblematic almost surely, since the F_i were assumed to be continuous. Furthermore, the above formula for the copula function can be rewritten as:
: C(u_1, u_2, \dots, u_d) = \Pr[X_1 \leq F_1^{-1}(u_1), X_2 \leq F_2^{-1}(u_2), \dots, X_d \leq F_d^{-1}(u_d)].
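The reverse construction can be illustrated with a minimal Python sketch that uses only the standard library. It assumes, purely for illustration, a bivariate Gaussian copula with correlation 0.7 as the source of (U_1, U_2), an exponential marginal for X_1 and a standard logistic marginal for X_2; none of these choices, nor the function names, come from the text above.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal; its cdf plays the role of Phi
_EPS = 1e-12                # keeps uniforms strictly inside (0, 1)


def _clip(u):
    """Clamp u into (0, 1) so that the inverse CDFs below stay finite."""
    return min(max(u, _EPS), 1.0 - _EPS)


def sample_copula_pair(rho, rng):
    """Draw (U1, U2) from a bivariate Gaussian copula (illustrative choice)."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Probability integral transform: Phi(Z) is uniform on [0, 1].
    return _clip(_STD_NORMAL.cdf(z1)), _clip(_STD_NORMAL.cdf(z2))


def exp_inv_cdf(u, rate):
    """Inverse CDF of the exponential distribution with the given rate."""
    return -math.log1p(-u) / rate


def logistic_inv_cdf(u):
    """Inverse CDF of the standard logistic distribution."""
    return math.log(u / (1.0 - u))


def sample_x(n, rho=0.7, seed=0):
    """Return n pairs (X1, X2) = (F1^{-1}(U1), F2^{-1}(U2))."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u1, u2 = sample_copula_pair(rho, rng)
        pairs.append((exp_inv_cdf(u1, rate=2.0), logistic_inv_cdf(u2)))
    return pairs
```

The dependence between the two components is inherited entirely from the copula sample, while each marginal is fixed by the inverse CDF applied to it.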


Definition

In probabilistic terms, C:[0,1]^d\rightarrow [0,1] is a ''d''-dimensional copula if ''C'' is a joint cumulative distribution function of a ''d''-dimensional random vector on the unit cube [0,1]^d with uniform marginals.

In analytic terms, C:[0,1]^d\rightarrow [0,1] is a ''d''-dimensional copula if
:* C(u_1,\dots,u_{i-1},0,u_{i+1},\dots,u_d)=0, the copula is zero if any one of the arguments is zero,
:* C(1,\dots,1,u,1,\dots,1)=u, the copula is equal to ''u'' if one argument is ''u'' and all others 1,
:* ''C'' is ''d''-non-decreasing, i.e., for each hyperrectangle B=\prod_{i=1}^{d}[x_i,y_i]\subseteq [0,1]^d the ''C''-volume of ''B'' is non-negative:
:*: \int_B \mathrm{d}C(u) =\sum_{\mathbf z\in \prod_{i=1}^{d}\{x_i,y_i\}} (-1)^{N(\mathbf z)} C(\mathbf z)\ge 0,
::where N(\mathbf z)=\#\{k : z_k=x_k\}.

For instance, in the bivariate case, C:[0,1]\times[0,1]\rightarrow [0,1] is a bivariate copula if C(0,u) = C(u,0) = 0, C(1,u) = C(u,1) = u and C(u_2,v_2)-C(u_2,v_1)-C(u_1,v_2)+C(u_1,v_1) \geq 0 for all 0 \leq u_1 \leq u_2 \leq 1 and 0 \leq v_1 \leq v_2 \leq 1.
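The three analytic conditions can be checked numerically on a grid, as in the following Python sketch. The helper names `c_volume` and `looks_like_copula` are hypothetical, and a grid check is only a necessary test: passing it does not prove that a function is a copula, although a failure does show that it is not one.

```python
import itertools


def c_volume(C, lower, upper):
    """C-volume of the box prod_i [lower_i, upper_i] via the signed corner sum."""
    d = len(lower)
    total = 0.0
    for picks in itertools.product((0, 1), repeat=d):
        z = [upper[i] if p else lower[i] for i, p in enumerate(picks)]
        n_lower = d - sum(picks)  # N(z): number of coordinates taken from the lower corner
        total += (-1) ** n_lower * C(*z)
    return total


def looks_like_copula(C, d=2, steps=10, tol=1e-12):
    """Check the three defining conditions on a regular grid of [0, 1]^d."""
    grid = [k / steps for k in range(steps + 1)]
    # 1. C is zero whenever at least one argument is zero.
    for u in itertools.product(grid, repeat=d):
        if 0.0 in u and abs(C(*u)) > tol:
            return False
    # 2. C equals u when one argument is u and all others are 1.
    for i in range(d):
        for t in grid:
            u = [1.0] * d
            u[i] = t
            if abs(C(*u) - t) > tol:
                return False
    # 3. Every grid cell has non-negative C-volume.
    for idx in itertools.product(range(steps), repeat=d):
        lower = [grid[k] for k in idx]
        upper = [grid[k + 1] for k in idx]
        if c_volume(C, lower, upper) < -tol:
            return False
    return True
```

For example, `looks_like_copula(lambda u, v: u * v)` accepts the independence copula, whereas the function u v (1 + 2 (1 - u)(1 - v)) satisfies both boundary conditions but is rejected because some cells have negative volume.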


Sklar's theorem

Sklar's theorem, named after Abe Sklar, provides the theoretical foundation for the application of copulas. Sklar's theorem states that every multivariate cumulative distribution function
: H(x_1,\dots,x_d)=\Pr[X_1\leq x_1,\dots,X_d\leq x_d]
of a random vector (X_1,X_2,\dots,X_d) can be expressed in terms of its marginals F_i(x_i) = \Pr[X_i\leq x_i] and a copula C. Indeed:
: H(x_1,\dots,x_d) = C\left(F_1(x_1),\dots,F_d(x_d) \right).
If the multivariate distribution has a density h, and if this density is available, it also holds that
: h(x_1,\dots,x_d)= c(F_1(x_1),\dots,F_d(x_d))\cdot f_1(x_1)\cdot\dots\cdot f_d(x_d),
where c is the density of the copula.

The theorem also states that, given H, the copula is unique on \operatorname{Ran}(F_1)\times\cdots\times \operatorname{Ran}(F_d), which is the Cartesian product of the ranges of the marginal CDFs. This implies that the copula is unique if the marginals F_i are continuous.

The converse is also true: given a copula C:[0,1]^d\rightarrow [0,1] and marginals F_i(x), then C\left(F_1(x_1),\dots,F_d(x_d) \right) defines a ''d''-dimensional cumulative distribution function with marginal distributions F_i(x).
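The converse direction of the theorem can be sketched in Python by composing a copula with two marginal CDFs. The Clayton copula with θ = 2, the exponential marginal and the standard normal marginal are assumptions chosen only for illustration, as are the function names.

```python
import math
from statistics import NormalDist


def clayton(u, v, theta=2.0):
    """Bivariate Clayton copula for theta > 0 (illustrative choice of copula)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def exp_cdf(x, rate=1.0):
    """Marginal CDF F_1: exponential distribution."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


normal_cdf = NormalDist(mu=0.0, sigma=1.0).cdf  # marginal CDF F_2


def joint_cdf(x1, x2):
    """H(x1, x2) = C(F_1(x1), F_2(x2))."""
    return clayton(exp_cdf(x1), normal_cdf(x2))
```

Letting one argument grow large recovers the other marginal, for instance `joint_cdf(1.0, 40.0)` agrees with `exp_cdf(1.0)` up to rounding, which is the statement that H has marginal distributions F_i.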


Stationarity condition

Copulas mainly work when time series are stationary and continuous. Thus, a very important pre-processing step is to check for autocorrelation, trend and seasonality within the time series. When time series are autocorrelated, they may generate a spurious (non-existent) dependence between sets of variables and result in an incorrect copula dependence structure.
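A simple way to screen a series for autocorrelation before fitting a copula is to compute its sample autocorrelation at a few lags. The Python sketch below is one possible check, not a prescribed procedure: the function names are hypothetical, and the 2/\sqrt{n} threshold is a common rule of thumb for white noise rather than a formal test.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        raise ValueError("series is constant")
    num = sum(dev[t] * dev[t + lag] for t in range(n - lag))
    return num / denom


def flag_autocorrelated_lags(series, max_lag=5):
    """Lags whose sample autocorrelation exceeds the rough 2/sqrt(n) band."""
    threshold = 2.0 / len(series) ** 0.5
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(series, k)) > threshold]
```

A non-empty result suggests filtering the series (for example by fitting a time-series model and working with its residuals) before estimating the dependence structure.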


Fréchet–Hoeffding copula bounds

The Fréchet–Hoeffding theorem (after Maurice René Fréchet and Wassily Hoeffding) states that for any copula C:[0,1]^d\rightarrow [0,1] and any (u_1,\dots,u_d)\in[0,1]^d the following bounds hold:
: W(u_1,\dots,u_d) \leq C(u_1,\dots,u_d) \leq M(u_1,\dots,u_d).
The function W is called the lower Fréchet–Hoeffding bound and is defined as
: W(u_1,\ldots,u_d) = \max\left\{1-d+\sum\limits_{i=1}^d u_i,\, 0 \right\}.
The function M is called the upper Fréchet–Hoeffding bound and is defined as
: M(u_1,\ldots,u_d) = \min \{u_1,\dots,u_d\}.
The upper bound is sharp: M is always a copula, and it corresponds to comonotone random variables. The lower bound is point-wise sharp, in the sense that for fixed u, there is a copula \tilde{C} such that \tilde{C}(u) = W(u). However, W is a copula only in two dimensions, in which case it corresponds to countermonotonic random variables.

In two dimensions, i.e. the bivariate case, the Fréchet–Hoeffding theorem states
: \max\{u+v-1,\,0\} \leq C(u,v) \leq \min\{u,v\}.
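The bounds W and M are straightforward to evaluate, and the inequality can be spot-checked for a given copula on a grid. The Python sketch below does this for the independence copula in three dimensions; the function names and the choice of test copula are illustrative assumptions.

```python
import itertools
import math


def lower_bound(*u):
    """Lower Frechet-Hoeffding bound W(u_1, ..., u_d)."""
    return max(1.0 - len(u) + sum(u), 0.0)


def upper_bound(*u):
    """Upper Frechet-Hoeffding bound M(u_1, ..., u_d)."""
    return min(u)


def independence(*u):
    """Independence copula: the product of its arguments."""
    return math.prod(u)


def bounds_hold(C, d, steps=10, tol=1e-12):
    """Check W <= C <= M at every point of a regular grid in [0, 1]^d."""
    grid = [k / steps for k in range(steps + 1)]
    return all(
        lower_bound(*u) - tol <= C(*u) <= upper_bound(*u) + tol
        for u in itertools.product(grid, repeat=d)
    )
```

Here `bounds_hold(independence, d=3)` evaluates to `True`, whereas a function such as the maximum of its arguments violates the upper bound and is rejected.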


Families of copulas

Several families of copulas have been described.


Gaussian copula

The Gaussian copula is a distribution over the unit hypercube [0,1]^d. It is constructed from a multivariate normal distribution over \mathbb{R}^d by using the probability integral transform.

For a given correlation matrix R\in[-1,1]^{d\times d}, the Gaussian copula with parameter matrix R can be written as
: C_R^{\text{Gauss}}(u) = \Phi_R\left(\Phi^{-1}(u_1),\dots, \Phi^{-1}(u_d) \right),
where \Phi^{-1} is the inverse cumulative distribution function of a standard normal and \Phi_R is the joint cumulative distribution function of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix R. While there is no simple analytical formula for the copula function C_R^{\text{Gauss}}(u), it can be upper or lower bounded, and approximated using numerical integration. The density can be written as
: c_R^{\text{Gauss}}(u) = \frac{1}{\sqrt{\det R}} \exp\left(-\frac{1}{2} \begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots \\ \Phi^{-1}(u_d)\end{pmatrix}^\mathsf{T} \left(R^{-1}-I\right) \begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots \\ \Phi^{-1}(u_d)\end{pmatrix} \right),
where I is the identity matrix.
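In the bivariate case with correlation ρ, the matrix R^{-1}-I has diagonal entries ρ²/(1-ρ²) and off-diagonal entries -ρ/(1-ρ²), so the density formula reduces to a short closed form. The following Python sketch evaluates it with the standard library; the function name is hypothetical and the arguments must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist

_PHI_INV = NormalDist().inv_cdf  # inverse CDF of the standard normal


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho, |rho| < 1."""
    x, y = _PHI_INV(u), _PHI_INV(v)
    det = 1.0 - rho * rho  # determinant of R = [[1, rho], [rho, 1]]
    # Quadratic form z^T (R^{-1} - I) z written out for d = 2.
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / det
    return math.exp(-0.5 * quad) / math.sqrt(det)
```

With ρ = 0 the density is identically 1, which is the independence copula, and at the centre point (0.5, 0.5) it equals 1/\sqrt{1-ρ²}.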


Archimedean copulas

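Archimedean copulas are an associative class of copulas. Most common Archimedean copulas admit an explicit formula, something not possible for instance for the Gaussian copula. In practice, Archimedean copulas are popular because they allow modeling dependence in arbitrarily high dimensions with only one parameter, governing the strength of dependence.

A copula C is called Archimedean if it admits the representation
: C(u_1, \dots, u_d; \theta) = \psi^{[-1]}\!\bigl(\psi(u_1;\theta) + \cdots + \psi(u_d;\theta); \theta\bigr)
where \psi\!:[0,1]\times\Theta \rightarrow [0,\infty) is a continuous, strictly decreasing and convex function such that \psi(1;\theta)=0, \theta is a parameter within some parameter space \Theta, and \psi is the so-called generator function. Its pseudo-inverse \psi^{[-1]} is defined by
: \psi^{[-1]}(t;\theta) = \begin{cases} \psi^{-1}(t;\theta) & \text{if } 0 \leq t \leq \psi(0;\theta), \\ 0 & \text{if } \psi(0;\theta) \leq t \leq \infty. \end{cases}
Moreover, the above formula for C yields a copula if and only if \psi^{-1} is ''d''-monotone on [0,\infty). That is, if it is differentiable d-2 times, and those derivatives satisfy
: (-1)^k\psi^{-1,(k)}(t;\theta) \geq 0
for all t\geq 0 and k=0, 1, \dots, d-2, and (-1)^{d-2}\psi^{-1,(d-2)}(t;\theta) is a nonincreasing and convex function.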


Most important Archimedean copulas

The following table highlights the most prominent bivariate Archimedean copulas, with their corresponding generators. Not all of them are completely monotone, i.e. ''d''-monotone for all d \in \mathbb{N}, or ''d''-monotone for certain \theta \in \Theta only.

{| class="wikitable"
|+ Table with the most important Archimedean copulas
|-
! Name of copula !! Bivariate copula C_\theta(u,v) !! Parameter \theta !! Generator \psi_{\theta}(t) !! Generator inverse \psi_{\theta}^{-1}(t)
|-
| Ali–Mikhail–Haq || \frac{uv}{1-\theta (1-u)(1-v)} || \theta\in[-1,1] || \log\!\left[\frac{1-\theta (1-t)}{t}\right] || \frac{1-\theta}{\exp(t)-\theta}
|-
| Clayton || \left[ \max\left\{ u^{-\theta} + v^{-\theta} -1 ; 0 \right\} \right]^{-1/\theta} || \theta\in[-1,\infty)\backslash\{0\} || \frac{1}{\theta}\,(t^{-\theta}-1) || \left(1+\theta t\right)^{-1/\theta}
|-
| Frank || -\frac{1}{\theta} \log\!\left[ 1+\frac{(\exp(-\theta u)-1)(\exp(-\theta v)-1)}{\exp(-\theta)-1} \right] || \theta\in \mathbb{R}\backslash\{0\} || -\log\!\left(\frac{\exp(-\theta t)-1}{\exp(-\theta)-1}\right) || -\frac{1}{\theta}\,\log(1+\exp(-t)(\exp(-\theta)-1))
|-
| Gumbel || \exp\!\left[ -\left( (-\log(u))^\theta + (-\log(v))^\theta \right)^{1/\theta} \right] || \theta\in[1,\infty) || \left(-\log(t)\right)^\theta || \exp\!\left(-t^{1/\theta}\right)
|-
| Independence || uv || || -\log(t) || \exp(-t)
|-
| Joe || 1-\left[ (1-u)^\theta + (1-v)^\theta - (1-u)^\theta(1-v)^\theta \right]^{1/\theta} || \theta\in[1,\infty) || -\log\!\left(1-(1-t)^\theta\right) || 1-\left(1-\exp(-t)\right)^{1/\theta}
|}
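The generator representation can be turned into code directly: apply the generator to each argument, add the results, and apply the generator inverse. The Python sketch below does this for three of the families listed here, under the simplifying assumptions that all arguments lie in (0, 1) and that the Clayton parameter is positive, so that no truncation by the pseudo-inverse is needed. The dictionary `GENERATORS` and the function `archimedean` are hypothetical names.

```python
import math

# Each entry maps a family name to (generator, generator inverse).
GENERATORS = {
    "clayton": (
        lambda t, th: (t ** -th - 1.0) / th,
        lambda s, th: (1.0 + th * s) ** (-1.0 / th),
    ),
    "gumbel": (
        lambda t, th: (-math.log(t)) ** th,
        lambda s, th: math.exp(-(s ** (1.0 / th))),
    ),
    "frank": (
        lambda t, th: -math.log((math.exp(-th * t) - 1.0) / (math.exp(-th) - 1.0)),
        lambda s, th: -math.log(1.0 + math.exp(-s) * (math.exp(-th) - 1.0)) / th,
    ),
}


def archimedean(family, theta, *u):
    """Evaluate C(u_1, ..., u_d; theta) from the generator of the given family.

    Assumes 0 < u_i < 1 and, for Clayton, theta > 0.
    """
    psi, psi_inv = GENERATORS[family]
    return psi_inv(sum(psi(t, theta) for t in u), theta)
```

For two arguments the result agrees with the closed-form expressions in the table, and the same function evaluates the ''d''-dimensional extension when more arguments are passed, provided the generator inverse is ''d''-monotone for the chosen parameter.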


Expectation for copula models and Monte Carlo integration

In statistical applications, many problems can be formulated in the following way. One is interested in the expectation of a response function g:\mathbb{R}^d\rightarrow\mathbb{R} applied to some random vector (X_1,\dots,X_d). If we denote the CDF of this random vector with H, the quantity of interest can thus be written as
: \operatorname{E}\left[g(X_1,\dots,X_d)\right] = \int_{\mathbb{R}^d} g(x_1,\dots,x_d) \, \mathrm{d}H(x_1,\dots,x_d).
If H is given by a copula model, i.e.,
: H(x_1,\dots,x_d)=C(F_1(x_1),\dots,F_d(x_d)),
this expectation can be rewritten as
: \operatorname{E}\left[g(X_1,\dots,X_d)\right] = \int_{[0,1]^d} g(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)) \, \mathrm{d}C(u_1,\dots,u_d).
In case the copula C is absolutely continuous, i.e. C has a density c, this equation can be written as
: \operatorname{E}\left[g(X_1,\dots,X_d)\right] = \int_{[0,1]^d} g(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d))\cdot c(u_1,\dots,u_d) \, \mathrm{d}u_1\cdots \mathrm{d}u_d,
and if each marginal distribution has the density f_i it holds further that
: \operatorname{E}\left[g(X_1,\dots,X_d)\right] = \int_{\mathbb{R}^d} g(x_1,\dots,x_d)\cdot c(F_1(x_1),\dots,F_d(x_d))\cdot f_1(x_1)\cdots f_d(x_d) \, \mathrm{d}x_1\cdots \mathrm{d}x_d.
If copula and marginals are known (or if they have been estimated), this expectation can be approximated through the following Monte Carlo algorithm:
# Draw a sample (U_1^k,\dots,U_d^k)\sim C\;\;(k=1,\dots,n) of size n from the copula C
# By applying the inverse marginal CDFs, produce a sample of (X_1,\dots,X_d) by setting (X_1^k,\dots,X_d^k)=(F_1^{-1}(U_1^k),\dots,F_d^{-1}(U_d^k))\sim H\;\;(k=1,\dots,n)
# Approximate \operatorname{E}\left[g(X_1,\dots,X_d)\right] by its empirical value:
::: \operatorname{E}\left[g(X_1,\dots,X_d)\right] \approx \frac{1}{n}\sum_{k=1}^n g(X_1^k,\dots,X_d^k)
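The three steps of the algorithm can be sketched in Python as follows. The example assumes a bivariate Clayton copula with θ = 2, sampled by inverting its conditional distribution, and exponential marginals with rates 1 and 2; these modelling choices and all function names are illustrative assumptions rather than part of the algorithm itself.

```python
import math
import random

_EPS = 1e-12


def _clip(u):
    """Keep u strictly inside (0, 1) so powers and logarithms stay finite."""
    return min(max(u, _EPS), 1.0 - _EPS)


def sample_clayton(theta, rng):
    """Step 1: draw (U1, U2) from a Clayton copula (theta > 0) by conditional inversion."""
    u1 = _clip(rng.random())
    w = _clip(rng.random())
    # Solve C_{2|1}(u2 | u1) = w for u2.
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u1, _clip(u2)


def exp_inv_cdf(rate):
    """Return the inverse CDF of an exponential distribution with the given rate."""
    return lambda u: -math.log1p(-u) / rate


def monte_carlo_expectation(g, inverse_cdfs, theta=2.0, n=20000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = sample_clayton(theta, rng)
        x = [inv(ui) for inv, ui in zip(inverse_cdfs, u)]  # Step 2: X_i = F_i^{-1}(U_i)
        total += g(*x)
    return total / n  # Step 3: empirical mean of g


estimate = monte_carlo_expectation(lambda x1, x2: x1 + x2,
                                   [exp_inv_cdf(1.0), exp_inv_cdf(2.0)])
```

For this particular g the exact value is 1 + 1/2 = 1.5 whatever the copula, which makes it a convenient sanity check; response functions such as max(x1, x2) do depend on the dependence structure.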


Empirical copulas

When studying multivariate data, one might want to investigate the underlying copula. Suppose we have observations
: (X_1^i,X_2^i,\dots,X_d^i), \, i=1,\dots,n
from a random vector (X_1,X_2,\dots,X_d) with continuous marginals. The corresponding "true" copula observations would be
: (U_1^i,U_2^i,\dots,U_d^i)=\left(F_1(X_1^i),F_2(X_2^i),\dots,F_d(X_d^i)\right), \, i=1,\dots,n.
However, the marginal distribution functions F_i are usually not known. Therefore, one can construct pseudo copula observations by using the empirical distribution functions
: F_k^n(x)=\frac{1}{n} \sum_{i=1}^n \mathbf{1}(X_k^i\leq x)
instead. Then, the pseudo copula observations are defined as
: (\tilde{U}_1^i,\tilde{U}_2^i,\dots,\tilde{U}_d^i)=\left(F_1^n(X_1^i),F_2^n(X_2^i),\dots,F_d^n(X_d^i)\right), \, i=1,\dots,n.
The corresponding empirical copula is then defined as
: C^n(u_1,\dots,u_d) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\left(\tilde{U}_1^i\leq u_1,\dots,\tilde{U}_d^i\leq u_d\right).
The components of the pseudo copula samples can also be written as \tilde{U}_k^i=R_k^i/n, where R_k^i is the rank of the observation X_k^i:
: R_k^i=\sum_{j=1}^n \mathbf{1}(X_k^j\leq X_k^i)
Therefore, the empirical copula can be seen as the empirical distribution of the rank transformed data.

The sample version of Spearman's rho:
: r=\frac{12}{n^2-1}\sum_{i=1}^n\sum_{j=1}^n \left[C^n \left(\frac{i}{n},\frac{j}{n}\right)-\frac{i}{n}\cdot\frac{j}{n}\right]
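The rank-based construction translates almost literally into code. The Python sketch below computes ranks, pseudo copula observations, the empirical copula and the sample version of Spearman's rho exactly as written in the formulas; the function names are hypothetical, ties are not treated specially, and the cubic running time of `spearman_rho` makes it suitable only for small samples.

```python
def ranks(values):
    """R^i = number of observations less than or equal to the i-th one."""
    return [sum(1 for other in values if other <= v) for v in values]


def pseudo_observations(columns):
    """Map each column of data to ranks divided by n; returns a list of n tuples."""
    n = len(columns[0])
    rank_columns = [ranks(col) for col in columns]
    return [tuple(rc[i] / n for rc in rank_columns) for i in range(n)]


def empirical_copula(pseudo, u):
    """C^n(u): fraction of pseudo observations that are componentwise <= u."""
    hits = sum(1 for obs in pseudo if all(o <= t for o, t in zip(obs, u)))
    return hits / len(pseudo)


def spearman_rho(x, y):
    """Sample Spearman's rho computed from the empirical copula on the rank grid."""
    n = len(x)
    pseudo = pseudo_observations([x, y])
    total = sum(
        empirical_copula(pseudo, (i / n, j / n)) - (i / n) * (j / n)
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    )
    return 12.0 / (n * n - 1) * total
```

For perfectly increasing data such as x = [1, 2, 3, 4] and y = [10, 20, 30, 40] the function returns 1 up to rounding, and reversing y gives -1.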


Applications


Quantitative finance

Typical finance applications:
* Analyzing systemic risk in financial markets
* Analyzing and pricing spread options, in particular in fixed income constant maturity swap spread options
* Analyzing and pricing volatility smile/skew of exotic baskets, e.g. best/worst of
* Analyzing and pricing volatility smile/skew of less liquid FX cross, which is effectively a basket: C = S_1/S_2 or C = S_1 \cdot S_2
* Value-at-Risk forecasting and portfolio optimization to minimize tail risk for US and international equities
* Forecasting equities returns for higher-moment portfolio optimization/full-scale optimization
* Improving the estimates of a portfolio's expected return and variance-covariance matrix for input into sophisticated mean-variance optimization strategies
* Statistical arbitrage strategies including pairs trading

In quantitative finance copulas are applied to risk management, to portfolio management and optimization, and to derivatives pricing.

In risk management, copulas are used to perform stress-tests and robustness checks that are especially important during "downside/crisis/panic regimes" where extreme downside events may occur (e.g., the 2008 financial crisis). The copula formula was also adapted for financial markets and was used to estimate the probability distribution of losses on pools of loans or bonds.

During a downside regime, a large number of investors who have held positions in riskier assets such as equities or real estate may seek refuge in 'safer' investments such as cash or bonds. This is also known as a flight-to-quality effect, and investors tend to exit their positions in riskier assets in large numbers in a short period of time. As a result, during downside regimes, correlations across equities are greater on the downside as opposed to the upside and this may have disastrous effects on the economy. For example, anecdotally, we often read financial news headlines reporting the loss of hundreds of millions of dollars on the stock exchange in a single day; however, we rarely read reports of positive stock market gains of the same magnitude and in the same short time frame.

Copulas aid in analyzing the effects of downside regimes by allowing the modelling of the marginals and dependence structure of a multivariate probability model separately. For example, consider the stock exchange as a market consisting of a large number of traders each operating with their own strategies to maximize profits. The individualistic behaviour of each trader can be described by modelling the marginals. However, as all traders operate on the same exchange, each trader's actions have an interaction effect with other traders'. This interaction effect can be described by modelling the dependence structure. Therefore, copulas allow us to analyse the interaction effects which are of particular interest during downside regimes as investors tend to herd their trading behaviour and decisions. (See also agent-based computational economics, where price is treated as an emergent phenomenon, resulting from the interaction of the various market participants, or agents.)

The users of the formula have been criticized for creating "evaluation cultures" that continued to use simple copulas despite the simple versions being acknowledged as inadequate for that purpose. Thus, previously, scalable copula models for large dimensions only allowed the modelling of elliptical dependence structures (i.e., Gaussian and Student-t copulas) that do not allow for correlation asymmetries where correlations differ on the upside or downside regimes. However, the development of vine copulas (also known as ''pair copulas'') enables the flexible modelling of the dependence structure for portfolios of large dimensions. The Clayton canonical vine copula allows for the occurrence of extreme downside events and has been successfully applied in portfolio optimization and risk management applications. The model is able to reduce the effects of extreme downside correlations and produces improved statistical and economic performance compared to scalable elliptical dependence copulas such as the Gaussian and Student-t copula.

Other models developed for risk management applications are panic copulas that are glued with market estimates of the marginal distributions to analyze the effects of panic regimes on the portfolio profit and loss distribution. Panic copulas are created by Monte Carlo simulation, mixed with a re-weighting of the probability of each scenario.

As regards derivatives pricing, dependence modelling with copula functions is widely used in applications of financial risk assessment and actuarial analysis, for example in the pricing of collateralized debt obligations (CDOs). Some believe the methodology of applying the Gaussian copula to credit derivatives to be one of the causes of the 2008 financial crisis. Despite this perception, there are documented attempts within the financial industry, occurring before the crisis, to address the limitations of the Gaussian copula and of copula functions more generally, specifically the lack of dependence dynamics. The Gaussian copula is lacking as it only allows for an elliptical dependence structure, as dependence is only modeled using the variance-covariance matrix. This methodology is limited in that it does not allow for dependence to evolve, whereas financial markets exhibit asymmetric dependence, whereby correlations across assets significantly increase during downturns compared to upturns. Therefore, modeling approaches using the Gaussian copula exhibit a poor representation of extreme events. There have been attempts to propose models rectifying some of the copula limitations.

In addition to CDOs, copulas have been applied to other asset classes as a flexible tool in analyzing multi-asset derivative products. The first such application outside credit was to use a copula to construct a basket implied volatility surface, taking into account the volatility smile of basket components. Copulas have since gained popularity in pricing and risk management of options on multi-assets in the presence of a volatility smile, in equity, foreign exchange and fixed income derivatives.


Civil engineering

Recently, copula functions have been successfully applied to the database formulation for the reliability analysis of highway bridges, and to various multivariate simulation studies in civil engineering, reliability of wind and earthquake engineering, and mechanical and offshore engineering. Researchers are also trying these functions in the field of transportation to understand the interaction between behaviors of individual drivers which, in totality, shapes traffic flow.


Reliability engineering

Copulas are being used for reliability analysis of complex systems of machine components with competing failure modes.


Warranty data analysis

Copulas are being used for warranty data analysis in which the tail dependence is analysed.


Turbulent combustion

Copulas are used in modelling turbulent partially premixed combustion, which is common in practical combustors.


Medicine

Copulas have many applications in the area of medicine, for example:
# Copulas have been used in the field of magnetic resonance imaging (MRI), for example, to segment images, to fill a gap in graphical models for imaging genetics in a study on schizophrenia, and to distinguish between normal and Alzheimer's patients.
# Copulas have been used in the area of brain research based on EEG signals, for example, to detect drowsiness during daytime naps, to track changes in instantaneous equivalent bandwidths (IEBWs), to derive synchrony for early diagnosis of Alzheimer's disease, to characterize dependence in oscillatory activity between EEG channels, and to assess the reliability of methods that capture dependence between pairs of EEG channels using their time-varying envelopes. Copula functions have been successfully applied to the analysis of neuronal dependencies and spike counts in neuroscience.
# A copula model has been developed in the field of oncology, for example, to jointly model genotypes, phenotypes, and pathways to reconstruct a cellular network and identify interactions between a specific phenotype and multiple molecular features (e.g. mutations and gene expression change). Bao et al. used NCI60 cancer cell line data to identify several subsets of molecular features that jointly perform as predictors of clinical phenotypes. The proposed copula may have an impact on biomedical research, ranging from cancer treatment to disease prevention. Copulas have also been used to predict the histological diagnosis of colorectal lesions from colonoscopy images, and to classify cancer subtypes.
# A copula-based analysis model has been developed in the field of heart and cardiovascular disease, for example, to predict heart rate (HR) variation. HR is one of the most critical health indicators for monitoring exercise intensity and load degree. An accurate short-term HR prediction technique can therefore deliver efficient early warning for human health and decrease harmful events. Namazi (2022) used a novel hybrid algorithm to predict HR.


Geodesy

The combination of singular spectrum analysis (SSA) and copula-based methods has been applied for the first time as a novel stochastic tool for Earth Orientation Parameters prediction.


Hydrology research

Copulas have been used in both theoretical and applied analyses of hydroclimatic data. Theoretical studies adopted the copula-based methodology, for instance, to gain a better understanding of the dependence structures of temperature and precipitation in different parts of the world. Applied studies adopted the copula-based methodology to examine, e.g., agricultural droughts or the joint effects of temperature and precipitation extremes on vegetation growth.


Climate and weather research

Copulas have been extensively used in climate- and weather-related research.


Solar irradiance variability

Copulas have been used to estimate the solar irradiance variability in spatial networks and temporally for single locations.


Random vector generation

Large synthetic traces of vectors and stationary time series can be generated using the empirical copula while preserving the entire dependence structure of small datasets. Such empirical traces are useful in various simulation-based performance studies.
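As a rough illustration of the idea, the sketch below draws synthetic vectors from a small dataset by resampling its rank structure. It is only a sketch under stated assumptions: it uses a "checkerboard" smoothing of the empirical copula (uniform jitter inside each rank cell), handles independent vectors rather than time series, assumes no ties in the data, and the function names `pseudo_ranks`, `quantile`, and `sample_empirical_copula` are hypothetical. It is not necessarily the procedure used in the work referred to above.

```python
import random

def pseudo_ranks(data):
    """Return, for every observation, its rank (1..n) in each column (assumes no ties)."""
    n, dims = len(data), len(data[0])
    ranks = [[0] * dims for _ in range(n)]
    for j in range(dims):
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order, start=1):
            ranks[i][j] = rank
    return ranks

def quantile(sorted_values, u):
    """Linearly interpolated empirical quantile for u in (0, 1]."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def sample_empirical_copula(data, size, rng):
    """Draw `size` synthetic vectors that reuse the rank dependence of `data`."""
    n, dims = len(data), len(data[0])
    ranks = pseudo_ranks(data)
    sorted_cols = [sorted(row[j] for row in data) for j in range(dims)]
    synthetic = []
    for _ in range(size):
        r = ranks[rng.randrange(n)]  # pick one observed rank vector
        # Jitter uniformly inside the rank cell ((r-1)/n, r/n] in every dimension.
        u = [(r[j] - rng.random()) / n for j in range(dims)]
        # Map the uniforms back to the data scale through the empirical marginals.
        synthetic.append([quantile(sorted_cols[j], u[j]) for j in range(dims)])
    return synthetic

if __name__ == "__main__":
    rng = random.Random(0)
    small = [(float(i), 10.0 * i) for i in range(1, 21)]
    print(sample_empirical_copula(small, 3, rng))
```

Because each synthetic vector starts from an observed rank vector, the joint rank pattern of the small dataset carries over to the large trace, while the marginal values are interpolated from the observed ones.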


Ranking of electrical motors

Copulas have been used for quality ranking in the manufacturing of electronically commutated motors.


Signal processing

Copulas are important because they represent a dependence structure without using marginal distributions. Copulas have been widely used in the field of finance, but their use in signal processing is relatively new. Copulas have been employed in the field of wireless communication for classifying radar signals, change detection in remote sensing applications, and EEG signal processing in medicine. In this section, a short mathematical derivation of the copula density function is presented, followed by a table listing copula density functions with their relevant signal processing applications.


Astronomy

Copulas have been used for determining the core radio luminosity function of active galactic nuclei (AGNs), which cannot be done with traditional methods because of difficulties with sample completeness.


Mathematical derivation of copula density function

For any two random variables ''X'' and ''Y'', the continuous joint probability distribution function can be written as
: F_{XY}(x,y) = \Pr \begin{Bmatrix} X \leq{x},Y\leq{y} \end{Bmatrix},
where F_X(x) = \Pr \begin{Bmatrix} X \leq{x} \end{Bmatrix} and F_Y(y) = \Pr \begin{Bmatrix} Y \leq{y} \end{Bmatrix} are the marginal cumulative distribution functions of the random variables ''X'' and ''Y'', respectively.
Then the copula distribution function C(u, v) can be defined using Sklar's theorem as
: F_{XY}(x,y) = C( F_X (x) , F_Y (y) ) \triangleq C( u, v ),
where u = F_X(x) and v = F_Y(y) are the marginal distribution functions, F_{XY}(x,y) is the joint distribution function, and u, v \in (0,1) .
Assuming F_{XY}(\cdot,\cdot) is a.e. twice differentiable, we start from the relationship between the joint probability density function (PDF) and the joint cumulative distribution function (CDF) via its partial derivatives:
: \begin{alignat}{6}
f_{XY}(x,y) = {} & {\partial^2 F_{XY}(x,y) \over\partial x\,\partial y } \\
\vdots \\
f_{XY}(x,y) = {} & {\partial^2 C(F_X(x),F_Y(y)) \over\partial x\,\partial y} \\
\vdots \\
f_{XY}(x,y) = {} & {\partial^2 C(u,v) \over\partial u\,\partial v} \cdot {\partial F_X(x) \over\partial x} \cdot {\partial F_Y(y) \over\partial y} \\
\vdots \\
f_{XY}(x,y) = {} & c(u,v) f_X(x) f_Y(y) \\
\vdots \\
\frac{f_{XY}(x,y)}{f_X(x) f_Y(y) } = {} & c(u,v)
\end{alignat}
where c(u,v) is the copula density function, and f_X(x) and f_Y(y) are the marginal probability density functions of ''X'' and ''Y'', respectively. There are four elements in this equation, and if any three are known, the fourth can be calculated. For example:
* when the joint probability density function of the two random variables, the copula density function, and one of the two marginal functions are known, the other marginal function can be calculated, or
* when the two marginal functions and the copula density function are known, the joint probability density function of the two random variables can be calculated, or
* when the two marginal functions and the joint probability density function of the two random variables are known, the copula density function can be calculated.
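The last case can be checked numerically. The sketch below, which assumes standard normal marginals and a bivariate normal joint density with correlation `rho` (an illustrative choice, not something the derivation requires), computes the copula density as the ratio of the joint density to the product of the marginal densities and compares it with the closed-form Gaussian copula density. The function names are hypothetical, and only the Python standard library is used.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal marginal, used for both X and Y

def bivariate_normal_pdf(x, y, rho):
    """Joint density f_XY of a standard bivariate normal with correlation rho."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - rho ** 2))
    quad = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho ** 2))
    return norm * math.exp(-quad)

def copula_density_from_ratio(u, v, rho):
    """c(u, v) = f_XY(x, y) / (f_X(x) f_Y(y)) with x = F_X^{-1}(u), y = F_Y^{-1}(v)."""
    x = STD_NORMAL.inv_cdf(u)
    y = STD_NORMAL.inv_cdf(v)
    return bivariate_normal_pdf(x, y, rho) / (STD_NORMAL.pdf(x) * STD_NORMAL.pdf(y))

def gaussian_copula_density(u, v, rho):
    """Closed-form Gaussian copula density with a, b the normal quantiles of u, v."""
    a = STD_NORMAL.inv_cdf(u)
    b = STD_NORMAL.inv_cdf(v)
    expo = -((a * a + b * b) * rho ** 2 - 2.0 * a * b * rho) / (2.0 * (1.0 - rho ** 2))
    return math.exp(expo) / math.sqrt(1.0 - rho ** 2)

if __name__ == "__main__":
    print(copula_density_from_ratio(0.3, 0.7, 0.5))
    print(gaussian_copula_density(0.3, 0.7, 0.5))
```

The two printed values should agree up to floating-point error, and for `rho = 0` both reduce to 1, the density of the independence copula.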


List of copula density functions and applications

Various bivariate copula density functions are important in the area of signal processing. Here u=F_X(x) and v=F_Y(y) are the marginal distribution functions, and f_X(x) and f_Y(y) are the marginal density functions. Extension and generalization of copulas for statistical signal processing have been shown to construct new bivariate copulas for exponential, Weibull, and Rician distributions. Zeng et al. presented algorithms, simulation, optimal selection, and practical applications of these copulas in signal processing.
{| class="wikitable"
!
! scope="col" style="width: 750px;" | Copula density: ''c''(''u'', ''v'')
! Use
|-
| Gaussian
| \begin{align} = {} & \frac{1}{\sqrt{1-\rho^2}} \exp\left (-\frac{(a^2+b^2)\rho^2-2 ab\rho}{ 2(1-\rho^2) } \right ) \\ & \text{where } \rho\in (-1,1)\\ & \text{where } a=\sqrt{2} \operatorname{erf}^{-1}(2u-1) \\ & \text{where } b =\sqrt{2}\operatorname{erf}^{-1}(2v-1) \\ & \text{where } \operatorname{erf}(z) = \frac{2}{\sqrt\pi} \int\limits_0^z \exp (-t^2) \, dt \end{align}
| supervised classification of synthetic aperture radar (SAR) images, validating biometric authentication, modeling stochastic dependence in large-scale integration of wind power, unsupervised classification of radar signals
|-
| Exponential
| \begin{align} = {} & \frac{1}{1-\rho} \exp\left ( \frac{\rho(\ln(1-u)+\ln(1-v))}{1-\rho} \right ) \cdot I_0\left ( \frac{2\sqrt{\rho \ln(1-u)\ln(1-v)}}{1-\rho} \right )\\ & \text{where } x=F_X^{-1}(u)=-\ln(1-u)/\lambda \\ & \text{where } y=F_Y^{-1}(v)=-\ln(1-v)/\mu \end{align}
| queuing system with infinitely many servers
|-
| Rayleigh
| bivariate exponential, Rayleigh, and Weibull copulas have been proved to be equivalent
| change detection from SAR images
|-
| Weibull
| bivariate exponential, Rayleigh, and Weibull copulas have been proved to be equivalent
| digital communication over fading channels
|-
| Log-normal
| bivariate log-normal copula and Gaussian copula are equivalent
| shadow fading along with multipath effect in wireless channel
|-
| Farlie–Gumbel–Morgenstern (FGM)
| \begin{align} = {} & 1+\theta(1-2u)(1-2v) \\ & \text{where } \theta \in [-1,1] \end{align}
| information processing of uncertainty in knowledge-based systems
|-
| Clayton
| \begin{align} = {} & (1+\theta)(uv)^{(-1-\theta)}(-1 +u^{-\theta} + v^{-\theta})^{(-2-1/\theta)} \\ & \text{where } \theta \in(-1,\infty), \theta\neq0 \end{align}
| location estimation of random signal source and hypothesis testing using heterogeneous data
|-
| Frank
| \begin{align} = {} & \frac {-\theta e^{-\theta(u+v)}(e^{-\theta}-1)} {(e^{-\theta}-e^{-\theta u}-e^{-\theta v}+e^{-\theta(u+v)})^2}\\ & \text{where } \theta \in(-\infty,+\infty), \theta\neq0 \end{align}
| quantitative risk assessment of geo-hazards
|-
| Student's t
| \begin{align} = {} & \frac{\Gamma(0.5\nu)\Gamma(0.5\nu+1)\left( 1+(t_\nu^{-2}(u)+t_\nu^{-2}(v) -2 \rho t_\nu^{-1}(u) t_\nu^{-1}(v))/(\nu(1-\rho^2))\right)^{-0.5(\nu+2)}} {\sqrt{1-\rho^2} \cdot \Gamma(\frac{\nu+1}{2})^2 (1+ t_\nu^{-2}(u)/\nu)^{-\frac{\nu+1}{2}} (1+ t_\nu^{-2}(v)/\nu)^{-\frac{\nu+1}{2}}} \\ & \text{where } \rho\in (-1,1)\\ & \text{where } \phi(z)= \frac{1}{\sqrt{2\pi}} \int\limits_{-\infty}^z \exp \left(\frac{-t^2}{2}\right) \, dt \\ & \text{where } t_\nu(x\mid \nu)= \int\limits_{-\infty}^x \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})(1+\frac{t^2}{\nu})^{\frac{\nu+1}{2}}} \, dt\\ & \text{where } \nu \text{ is the number of degrees of freedom} \\ & \text{where } \Gamma \text{ is the Gamma function} \end{align}
| supervised SAR image classification, fusion of correlated sensor decisions
|-
| Nakagami-m
|
|
|-
| Rician
|
|
|}
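To make the closed-form entries of the table concrete, the following sketch evaluates the FGM, Clayton, and Frank densities and checks numerically that a density integrates to approximately one over the unit square. It is an illustrative sketch only: the function names are hypothetical, the Clayton function is written with positive `theta` in mind, and the midpoint rule is a simple choice that suits the bounded FGM and Frank densities better than the Clayton density, which is unbounded near the origin for positive `theta`.

```python
import math

def fgm_density(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula density, theta in [-1, 1]."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

def clayton_density(u, v, theta):
    """Clayton copula density; this sketch assumes theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-1.0 - theta) * s ** (-2.0 - 1.0 / theta)

def frank_density(u, v, theta):
    """Frank copula density, theta != 0."""
    num = -theta * math.exp(-theta * (u + v)) * (math.exp(-theta) - 1.0)
    den = (math.exp(-theta) - math.exp(-theta * u)
           - math.exp(-theta * v) + math.exp(-theta * (u + v))) ** 2
    return num / den

def integrate_unit_square(density, n=200):
    """Midpoint-rule integral of density(u, v) over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += density((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h

if __name__ == "__main__":
    print(integrate_unit_square(lambda u, v: fgm_density(u, v, 0.5)))
    print(integrate_unit_square(lambda u, v: frank_density(u, v, 3.0)))
    print(clayton_density(0.4, 0.6, 2.0))
```

A copula density must integrate to one because the copula itself is a distribution function on the unit square, so this kind of check is a quick way to catch a transcription error in a formula.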


See also

* Coupling (probability)



