Exponential family random graph models (ERGMs) are a family of statistical models for analyzing data from social and other networks. Examples of networks examined using ERGMs include knowledge networks, organizational networks, colleague networks, social media networks, networks of scientific development, and others.
Background
Many metrics exist to describe the structural features of an observed network, such as density, centrality, or assortativity. However, these metrics describe the observed network, which is only one instance of a large number of possible alternative networks. This set of alternative networks may have similar or dissimilar structural features. To support statistical inference on the processes influencing the formation of network structure, a statistical model should consider the set of all possible alternative networks, weighted by their similarity to the observed network. However, because network data is inherently relational, it violates the assumptions of independence and identical distribution made by standard statistical models like linear regression.
Alternative statistical models should reflect the uncertainty associated with a given observation, permit inference about the relative frequency of network substructures of theoretical interest, disambiguate the influence of confounding processes, efficiently represent complex structures, and link local-level processes to global-level properties.
Degree-preserving randomization, for example, is a specific way in which an observed network could be considered in terms of multiple alternative networks.
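As an illustration of generating one such alternative network, the following is a minimal sketch of degree-preserving randomization by repeated double edge swaps. The function names, the example edge list, and the number of swaps are hypothetical choices made for illustration; the sketch assumes an undirected simple graph given as a list of node pairs with no duplicates or self-loops.

```python
import random

def degree_preserving_randomization(edges, n_swaps, seed=0):
    """Rewire an undirected simple graph by double edge swaps.

    Each accepted swap replaces edges (a, b) and (c, d) with (a, d) and (c, b),
    so every node keeps its original degree.
    Assumes `edges` contains no duplicates and no self-loops.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Randomly flip one edge's orientation so both swap patterns are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return sorted(edge_set)

def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

# Hypothetical observed network: a 6-cycle with one chord.
observed = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (0, 3)]
randomized = degree_preserving_randomization(observed, n_swaps=100)
```

Comparing `degrees(observed)` with `degrees(randomized)` confirms that the degree sequence is unchanged, while other structural features of the rewired network may differ.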
Definition
The exponential family is a broad family of models covering many types of data, not just networks. An ERGM is a model from this family that describes networks.
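Formally, a random graph $Y \in \mathcal{Y}$ consists of a set of $n$ nodes and $m$ dyads (edges) $\{ Y_{ij} : i = 1, \dots, n;\ j = 1, \dots, n \}$, where $Y_{ij} = 1$ if the nodes $(i, j)$ are connected and $Y_{ij} = 0$ otherwise.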
The basic assumption of these models is that the structure in an observed graph $y$ can be explained by a given vector of sufficient statistics $s(y)$, which are a function of the observed network and, in some cases, nodal attributes. This way, it is possible to describe any kind of dependence between the dyadic variables:

$$P(Y = y \mid \theta) = \frac{\exp\{\theta^{T} s(y)\}}{c(\theta)}$$

where $\theta$ is a vector of model parameters associated with $s(y)$ and $c(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^{T} s(y')\}$ is a normalising constant, with the sum running over all possible networks $y'$.
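For very small networks the distribution can be computed exactly by enumerating every possible graph. The following minimal sketch assumes the standard ERGM form, in which the probability of a network $y$ is $\exp\{\theta^{T} s(y)\}$ divided by a normalising constant obtained by summing that quantity over all networks. The choice of statistics (edge count and triangle count), the parameter values, and all function names are assumptions made purely for illustration.

```python
import itertools
import math

def all_graphs(n):
    """Yield every undirected simple graph on n nodes as a frozenset of dyads (i, j), i < j."""
    dyads = list(itertools.combinations(range(n), 2))
    for mask in itertools.product([0, 1], repeat=len(dyads)):
        yield frozenset(d for d, y in zip(dyads, mask) if y)

def stats(graph, n):
    """Illustrative sufficient statistics s(y): (number of edges, number of triangles)."""
    edges = len(graph)
    triangles = sum(
        1
        for i, j, k in itertools.combinations(range(n), 3)
        if (i, j) in graph and (j, k) in graph and (i, k) in graph
    )
    return (edges, triangles)

def ergm_distribution(n, theta):
    """Return a dict mapping each graph y to exp(theta . s(y)) / c(theta)."""
    weights = {}
    for g in all_graphs(n):
        s = stats(g, n)
        weights[g] = math.exp(sum(t * x for t, x in zip(theta, s)))
    c = sum(weights.values())  # normalising constant, by brute-force enumeration
    return {g: w / c for g, w in weights.items()}

n = 4
theta = (-1.0, 0.5)  # hypothetical parameters: edges penalised, triangles favoured
dist = ergm_distribution(n, theta)
```

Enumeration is only feasible for a handful of nodes, because the number of graphs doubles with every additional dyad; practical estimation methods avoid computing the normalising constant directly.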
These models represent a probability distribution on each possible network on $n$ nodes. However, the size of the set of possible networks for an undirected network (simple graph) of size $n$ is $2^{n(n-1)/2}$. Because the number of possible networks in the set vastly outnumbers the number of parameters which can constrain the model, the ideal probability distribution is the one which maximizes the Gibbs entropy.
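To give a sense of how quickly the set of possible networks grows, the short sketch below evaluates the count of undirected simple graphs on $n$ labelled nodes, taken here to be $2^{n(n-1)/2}$ (one binary choice per dyad). The function name and the node counts shown are illustrative assumptions.

```python
def n_undirected_graphs(n):
    """Number of undirected simple graphs on n labelled nodes: 2 ** (n choose 2)."""
    return 2 ** (n * (n - 1) // 2)

for n in (3, 5, 10, 20):
    count = n_undirected_graphs(n)
    # Report the number of decimal digits, since the counts grow very quickly.
    print(f"n={n}: {len(str(count))} digits")
```

With only 10 nodes there are already $2^{45}$ possible networks, which is why the model is specified through a small number of parameters rather than one probability per network.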
Further reading
*Harris, Jenine K (2014). An introduction to exponential random graph modeling. Sage.
*van Duijn, M. A. J.; Gile, K. J.; Handcock, M. S. (2009). A framework for the comparison of maximum pseudo-likelihood and maximum likelihood estimation of exponential family random graph models. Social Networks 31 (1): 52–62. doi:10.1016/j.socnet.2008.10.003. PMID 23170041. PMC 3500576.