
Exponential family random graph models (ERGMs) are a family of statistical models for analyzing data from social and other networks. Examples of networks examined using ERGMs include knowledge networks, organizational networks, colleague networks, social media networks, networks of scientific development, and others.


Background

Many metrics exist to describe the structural features of an observed network, such as density, centrality, or assortativity. However, these metrics describe the observed network, which is only one instance of a large number of possible alternative networks. This set of alternative networks may have similar or dissimilar structural features. To support statistical inference on the processes influencing the formation of network structure, a statistical model should consider the set of all possible alternative networks, weighted by their similarity to the observed network. However, because network data is inherently relational, it violates the assumptions of independence and identical distribution made by standard statistical models such as linear regression. Alternative statistical models should reflect the uncertainty associated with a given observation, permit inference about the relative frequency of network substructures of theoretical interest, disambiguate the influence of confounding processes, efficiently represent complex structures, and link local-level processes to global-level properties. Degree-preserving randomization, for example, is a specific way in which an observed network could be considered in terms of multiple alternative networks.
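As an illustration of degree-preserving randomization, the following is a minimal Python sketch based on repeated double edge swaps, one common way of generating alternative networks with the same degree sequence. The function names `degree_preserving_swaps` and `degrees`, the small example network, and the rejection rule for self-loops and duplicate edges are assumptions made for this sketch and are not taken from a particular library.

```python
import random
from collections import Counter


def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def degree_preserving_swaps(edges, n_swaps, seed=0, max_tries=10000):
    """Randomize an undirected simple graph by double edge swaps.

    Each accepted swap replaces edges (a, b) and (c, d) with (a, d) and
    (c, b), so every node keeps its degree. Proposals that would create
    a self-loop or a duplicate edge are rejected.
    """
    rng = random.Random(seed)
    # Store each edge as a sorted tuple so (u, v) and (v, u) are identical.
    edge_list = sorted({tuple(sorted(e)) for e in edges})
    edge_set = set(edge_list)
    done = 0
    tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Flip the second edge at random so both rewirings can be proposed.
        if rng.random() < 0.5:
            c, d = d, c
        if a == d or c == b:
            continue  # would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


# Hypothetical observed network on five nodes.
observed = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
randomized = degree_preserving_swaps(observed, n_swaps=10)
print(degrees(observed) == degrees(randomized))
```

Running the function with different seeds yields a sample of alternative networks that share the observed degree sequence, against which other structural features of the observed network can be compared.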


Definition

The exponential family is a broad family of models covering many types of data, not just networks. An ERGM is a model from this family which describes networks. Formally, a random graph $Y \in \mathcal{Y}$ consists of a set of $n$ nodes and $m$ dyads (edges) $\{Y_{ij} : i = 1, \dots, n;\ j = 1, \dots, n\}$, where $Y_{ij} = 1$ if the nodes $(i, j)$ are connected and $Y_{ij} = 0$ otherwise. The basic assumption of these models is that the structure in an observed graph $y$ can be explained by a given vector of sufficient statistics $s(y)$, which are a function of the observed network and, in some cases, nodal attributes. This way, it is possible to describe any kind of dependence between the dyadic variables:

$$P(Y = y \mid \theta) = \frac{\exp(\theta^{T} s(y))}{c(\theta)}, \quad \forall y \in \mathcal{Y}$$

where $\theta$ is a vector of model parameters associated with $s(y)$ and $c(\theta) = \sum_{y' \in \mathcal{Y}} \exp(\theta^{T} s(y'))$ is a normalising constant. These models represent a probability distribution over all possible networks on $n$ nodes. However, the size of the set of possible networks for an undirected network (simple graph) of size $n$ is $2^{n(n-1)/2}$. Because the number of possible networks in the set vastly outnumbers the number of parameters which can constrain the model, the ideal probability distribution is the one which maximizes the Gibbs entropy.
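For very small networks the definition can be evaluated directly by enumerating every graph. The Python sketch below does this for undirected simple graphs, assuming a two-component statistic vector s(y) = (number of edges, number of triangles). That choice of statistics, the function names, and the parameter values are illustrative assumptions only. Because the number of graphs doubles with every additional dyad, this brute-force approach is feasible only for a handful of nodes.

```python
from itertools import combinations, product
from math import exp


def all_graphs(n):
    """Yield every undirected simple graph on n labelled nodes.

    Each graph is a frozenset of dyads (i, j) with i < j.
    """
    dyads = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(dyads)):
        yield frozenset(d for d, b in zip(dyads, bits) if b)


def stats(graph, n):
    """Assumed statistics s(y) = (edge count, triangle count)."""
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if (i, j) in graph and (i, k) in graph and (j, k) in graph
    )
    return (len(graph), triangles)


def ergm_distribution(n, theta):
    """Return (probabilities, c) where probabilities maps each graph y
    to exp(theta . s(y)) / c and c is the normalising constant."""
    weights = {}
    for y in all_graphs(n):
        s = stats(y, n)
        weights[y] = exp(sum(t * x for t, x in zip(theta, s)))
    c = sum(weights.values())
    return {y: w / c for y, w in weights.items()}, c


# Example: 4 nodes, so 2 ** (4 * 3 // 2) = 64 possible graphs.
n = 4
probs, c = ergm_distribution(n, theta=(-1.0, 0.5))
print(len(probs), sum(probs.values()))
```

With all parameters set to zero every graph receives the same probability, and with the triangle parameter set to zero the model reduces to independent edges, in which case the normalising constant factorises as (1 + exp(theta_edges)) raised to the number of dyads.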

