In spatial statistics, the theoretical variogram, denoted $2\gamma(\mathbf{s}_1,\mathbf{s}_2)$, is a function describing the degree of spatial dependence of a spatial random field or stochastic process $Z(\mathbf{s})$. The semivariogram $\gamma(\mathbf{s}_1,\mathbf{s}_2)$ is half the variogram.

[Figure: Schematic variogram]

For example, in gold mining, a variogram will give a measure of how much two samples taken from the mining area will vary in gold percentage depending on the distance between those samples. Samples taken far apart will vary more than samples taken close to each other.

Definition

The semivariogram $\gamma(h)$ was first defined by Matheron (1963) as half the average squared difference between a function and a translated copy of the function separated at distance $h$. Formally:

$$\gamma(h)=\frac{1}{2V}\iiint_V \left[f(M+h) - f(M)\right]^2 dV,$$

where $M$ is a point in the geometric field $V$, and $f(M)$ is the value at that point. The triple integral is over 3 dimensions.
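As a rough one-dimensional illustration of this definition, the sketch below assumes the function is known at every node of a regular grid, so the integral reduces to a sum over the pairs of nodes that are `lag` steps apart. The names `matheron_semivariogram` and `profile`, and the sample values, are hypothetical and are not taken from Matheron (1963).

```python
def matheron_semivariogram(values, lag):
    """Half the mean squared difference between `values` and a copy shifted by `lag` samples."""
    if not 0 <= lag < len(values):
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    n_pairs = len(values) - lag  # only pairs with both points inside the domain
    total = sum((values[i + lag] - values[i]) ** 2 for i in range(n_pairs))
    return total / (2 * n_pairs)


# Hypothetical transect of f sampled at unit spacing (made-up numbers).
profile = [1.0, 2.0, 4.0, 7.0, 11.0]
gamma_1 = matheron_semivariogram(profile, 1)  # (1 + 4 + 9 + 16) / (2 * 4)
gamma_2 = matheron_semivariogram(profile, 2)  # (9 + 25 + 49) / (2 * 3)
print(gamma_1, gamma_2)
```

Averaging over the number of available pairs, rather than over the whole domain, is one possible way to handle the boundary; other conventions exist.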

$h$ is the separation distance (e.g., in meters or km) of interest. For example, the value $f(M)$ could represent the iron content in soil at some location $M$ (with geographic coordinates of latitude, longitude, and elevation) over some region $V$ with element of volume $dV$. To obtain the semivariogram for a given $h$, all pairs of points at that exact distance would have to be sampled. In practice it is impossible to sample everywhere, so the empirical variogram is used instead.

The variogram is twice the semivariogram and can be defined, alternatively, as the variance of the difference between field values at two locations ($\mathbf{s}_1$ and $\mathbf{s}_2$; note the change of notation from $M$ to $\mathbf{s}$ and from $f$ to $Z$) across realizations of the field (Cressie 1993):

$$2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\operatorname{var}\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)=E\left[\left(\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)-E\left[Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right]\right)^2\right].$$
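The phrase "across realizations" can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the toy field (a shared Gaussian component plus independent unit-variance noise at two fixed sites, with different means) and the names `simulate_pairs` and `variogram_from_realizations` are hypothetical. The estimator itself is the plain variance of the difference, as in the definition above.

```python
import random


def variogram_from_realizations(pairs):
    """Estimate 2*gamma(s1, s2) as the variance of Z(s1) - Z(s2) over realizations."""
    diffs = [z1 - z2 for z1, z2 in pairs]
    mean_diff = sum(diffs) / len(diffs)
    return sum((d - mean_diff) ** 2 for d in diffs) / len(diffs)


def simulate_pairs(n, rng):
    """Toy field at two fixed sites: shared component plus independent unit-variance noise."""
    pairs = []
    for _ in range(n):
        shared = rng.gauss(0.0, 2.0)  # common to both sites, cancels in the difference
        z1 = 10.0 + shared + rng.gauss(0.0, 1.0)
        z2 = 12.0 + shared + rng.gauss(0.0, 1.0)
        pairs.append((z1, z2))
    return pairs


rng = random.Random(0)
two_gamma = variogram_from_realizations(simulate_pairs(20000, rng))
# Theoretical value for this toy field: 1.0 + 1.0 = 2.0 (sum of the two noise variances).
print(round(two_gamma, 2))
```

Because the mean difference is subtracted before squaring, the different site means (10 and 12 in this toy setup) do not inflate the estimate.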

If the spatial random field has constant mean $\mu$, this is equivalent to the expectation of the squared increment of the values between locations $\mathbf{s}_1$ and $\mathbf{s}_2$ (Wackernagel 2003), where $\mathbf{s}_1$ and $\mathbf{s}_2$ are points in space and possibly time:

$$2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right].$$
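The step from the variance form to the squared-increment form can be spelled out as follows. It uses only the constant-mean assumption stated above and the identity $\operatorname{var}(X)=E[X^2]-(E[X])^2$.

```latex
\begin{aligned}
E\left[Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right] &= \mu-\mu = 0,\\
2\gamma(\mathbf{s}_1,\mathbf{s}_2)
 &= \operatorname{var}\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)\\
 &= E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right]
    -\left(E\left[Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right]\right)^2\\
 &= E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right].
\end{aligned}
```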

In the case of a stationary process, the variogram and semivariogram can be represented as a function $\gamma_s(h)=\gamma(0,0+h)$ of the difference $h=\mathbf{s}_2-\mathbf{s}_1$ between locations only, by the following relation (Cressie 1993):

$$\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_s(\mathbf{s}_2-\mathbf{s}_1).$$

If the process is furthermore isotropic, then the variogram and semivariogram can be represented by a function $\gamma_i(h):=\gamma_s(h e_1)$ of the distance $h=\|\mathbf{s}_2-\mathbf{s}_1\|$ only (Cressie 1993):

$$\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_i(h).$$
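The three forms (two-point, stationary, isotropic) can be written as three small functions. The sketch below assumes a particular stationary, isotropic model (an exponential semivariogram with made-up parameters `SILL` and `RANGE`); the model and all names are hypothetical choices for illustration, not something the text above prescribes.

```python
import math

SILL = 2.0   # hypothetical variance of the field
RANGE = 5.0  # hypothetical length scale


def gamma(s1, s2):
    """Two-point semivariogram of an assumed stationary, isotropic exponential model."""
    return SILL * (1.0 - math.exp(-math.dist(s1, s2) / RANGE))


def gamma_s(h):
    """Stationary form: function of the lag vector h, defined as gamma(0, 0 + h)."""
    origin = (0.0,) * len(h)
    return gamma(origin, h)


def gamma_i(distance):
    """Isotropic form: gamma_s evaluated at distance * e_1 (first unit vector, 2-D here)."""
    return gamma_s((distance, 0.0))


s1, s2 = (1.0, 2.0), (4.0, 6.0)       # separation vector (3, 4), distance 5
lag = (s2[0] - s1[0], s2[1] - s1[1])
print(gamma(s1, s2), gamma_s(lag), gamma_i(math.dist(s1, s2)))
```

For this model all three calls return the same value, since the semivariogram depends on the two locations only through their distance.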

The indexes $i$ or $s$ are typically not written. The terms "variogram" and "semivariogram" are used for all three forms of the function. Moreover, the term "variogram" is sometimes used to denote the semivariogram, and the symbol $\gamma$ is sometimes used for the variogram, which causes some confusion.

Properties

According to Cressie (1993), Chiles and Delfiner (1999), and Wackernagel (2003), the theoretical variogram has the following properties:

* The semivariogram is nonnegative, $\gamma(\mathbf{s}_1,\mathbf{s}_2)\geq 0$, since it is the expectation of a square.
* The semivariogram at distance 0 is always 0, $\gamma(\mathbf{s}_1,\mathbf{s}_1)=\gamma_i(0)=E\left((Z(\mathbf{s}_1)-Z(\mathbf{s}_1))^2\right)=0$, since $Z(\mathbf{s}_1)-Z(\mathbf{s}_1)=0$.
* A function is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights $w_1,\ldots,w_N$ subject to $\sum_{i=1}^N w_i=0$ and locations $\mathbf{s}_1,\ldots,\mathbf{s}_N$ it holds that
  $$\sum_{i=1}^N\sum_{j=1}^N w_i\gamma(\mathbf{s}_i,\mathbf{s}_j)w_j \leq 0,$$
  which corresponds to the fact that the variance $\operatorname{var}(X)$ of $X=\sum_{i=1}^N w_i Z(\mathbf{s}_i)$ is given by the negative of this double sum and must be nonnegative.
* If the covariance function $C$ of a stationary process exists, it is related to the variogram by
  $$2\gamma(\mathbf{s}_1,\mathbf{s}_2)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2).$$
* If the variance $V$ and correlation function $c$ of a stationary process exist, they are related to the semivariogram by
  $$\gamma(\mathbf{s}_1,\mathbf{s}_2)=V(1 - c(\mathbf{s}_1,\mathbf{s}_2)).$$
* Conversely, the covariance function $C$ of a stationary process can be obtained from the semivariogram and variance as
  $$C(\mathbf{s}_1,\mathbf{s}_2)=V-\gamma(\mathbf{s}_1,\mathbf{s}_2).$$
* If a stationary random field has no spatial dependence (i.e. $C(h)=0$ if $h\neq 0$), the semivariogram is the constant $\operatorname{var}(Z(\mathbf{s}))$ everywhere except at the origin, where it is zero.
* The semivariogram is a symmetric function, $\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma(\mathbf{s}_2,\mathbf{s}_1)$. Consequently, the stationary semivariogram is an even function, $\gamma_s(h)=\gamma_s(-h)$.
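Several of the properties listed above can be checked numerically for a concrete model. The sketch below assumes an exponential covariance with made-up parameters `V` and `A`; that model, the site coordinates, the weights, and all function names are hypothetical and serve only as an illustration.

```python
import math

V = 2.0  # hypothetical variance of the field
A = 5.0  # hypothetical correlation length


def cov(s1, s2):
    """Assumed stationary exponential covariance C(s1, s2)."""
    return V * math.exp(-math.dist(s1, s2) / A)


def semivariogram(s1, s2):
    """gamma obtained from 2*gamma = C(s1,s1) + C(s2,s2) - 2*C(s1,s2)."""
    return 0.5 * (cov(s1, s1) + cov(s2, s2)) - cov(s1, s2)


def quadratic_form(kernel, weights, sites):
    """Double sum of w_i * kernel(s_i, s_j) * w_j over all i, j."""
    return sum(wi * kernel(si, sj) * wj
               for wi, si in zip(weights, sites)
               for wj, sj in zip(weights, sites))


sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (4.0, 4.0)]
weights = [1.0, -2.0, 0.5, 0.5]  # weights sum to zero

gamma_sum = quadratic_form(semivariogram, weights, sites)  # should be <= 0
var_x = quadratic_form(cov, weights, sites)                # variance of X = sum_i w_i Z(s_i)
print(gamma_sum, var_x)
```

With zero-sum weights, the double sum over the semivariogram is the negative of the variance of the weighted combination, which is why it cannot be positive.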
