Rationale for the model
In the configuration model, the degree of each vertex is pre-defined, rather than having a probability distribution from which the given degree is chosen. As opposed to the Erdős–Rényi model, the degree sequence of the configuration model is not restricted to have a Poisson distribution, the model allows the user to give the network any desired degree distribution.Algorithm
The following algorithm describes the generation of the model: # Take a degree sequence, i. e. assign a degree to each vertex. The degrees of the vertices are represented as half-links or stubs. The sum of stubs must be even in order to be able to construct a graph (). The degree sequence can be drawn from a theoretical distribution or it can represent a real network (determined from the adjacency matrix of the network). # Choose two stubs uniformly at random and connect them to form an edge. Choose another pair from the remaining stubs and connect them. Continue until you run out of stubs. The result is a network with the pre-defined degree sequence. The realization of the network changes with the order in which the stubs are chosen, they might include cycles (b), self-loops (c) or multi-links (d) (Figure 1). Yet, the expected number of self-loops and multi-links goes to zero in the ''N'' → ∞ limit.Self-loops, multi-edges and implications
The algorithm described above matches any stubs with the same probability. The uniform distribution of the matching is an important property in terms of calculating other features of the generated networks. The network generation process does not exclude the event of generating a self-loop or a multi-link. If we designed the process where self-loops and multi-edges are not allowed, the matching of the stubs would not follow a uniform distribution. The expected total number of multi-links in a configuration model network would be: where is the n-thProperties
Edge probability
A stub of node can be connected to other stubs (there are stubs altogether, and we have to exclude the one we are currently observing). The vertex has stubs to which node can be connected with the same probability (because of the uniform distribution). The probability of a stub of node being connected to one of these stubs is . Since node has stubs, the probability of being connected to is ( for sufficiently large ). Note that this formula can only be viewed as a probability if , and more precisely it describes the expected number of edges between nodes and . Note that this formula does not apply to the case of self-edges. Given a configuration model with a degree distribution , the probability of a randomly chosen node having degree is . But if we took one of the vertices to which we can arrive following one of edges of i, the probability of having degree k is . (The probability of reaching a node with degree k is , and there are such nodes.) This fraction depends on as opposed to the degree of the typical node with . Thus, a neighbor of a typical node is expected to have higher degree than the typical node itself. This feature of the configuration model describes well the phenomenon of "my friends having more friends than I do".Clustering coefficient
The clustering coefficient (the average probability that the neighbors of a node are connected) is computed approximately as follows: : where denotes the probability that a random edge reaches a degree- vertex, and the factors of the form "" rather than "" appear because one stub has been accounted for by the fact that these are neighbors of a common vertex. Evaluating the above results in : Using and , with denoting the degree distribution, denoting the average degree, and denoting the number of vertices, the above becomes : with denoting the second moment of the degree distribution. Assuming that and are constant, the above behaves as : where the constant depends on . Thus, the clustering coefficient becomes small in the limit.Giant component
In the configuration model, aDiameter
Configuration model can assume any degree distribution and shows the small-world effect, since to leading order the diameter of the configuration model is just .Components of finite size
As total number of vertices tends to infinity, the probability to find two giant components is vanishing. This means that in the sparse regime, the model consist of one giant component (if any) and multiple connected components of finite size. The sizes of the connected components are characterized by their size distribution - the probability that a randomly sampled vertex belongs to a connected component of size There is a correspondence between the degree distribution and the size distribution When total number of vertices tends to infinity, , the following relation takes place: : where and denotes the -fold convolution power. Moreover, explicit asymptotes for are known when and is close to zero. The analytical expressions for these asymptotes depend on finiteness of the moments of the degree distribution tail exponent (when features a heavy tail), and the sign of Molloy–Reed criterium. The following table summarises these relationships (the constants are provided in).Modelling
Comparison to real-world networks
Three general properties of complex networks are heterogeneous degree distribution, short average path length and high clustering. Having the opportunity to define any arbitrary degree sequence, the first condition can be satisfied by design, but as shown above, the global clustering coefficient is an inverse function of the network size, so for large configuration networks, clustering tends to be small. This feature of the baseline model contradicts the known properties of empirical networks, but extensions of the model can solve this issue (see ). All the networks generated by this model are locally tree-like provided the average of the excess degree distribution is either constant or grows slower than the square root of number of links, . In other words, this model prevents forming substructures such as loops in the large size limit. Vanishing of clustering coefficient, is a special case of this more general result. While the tree-like property makes the model not very realistic, so many calculations, such as the generating function methods, are possible for the configuration model thanks to this feature.Application: modularity calculation
The configuration model is applied as benchmark in the calculation of networkDirected configuration model
In the DCM (directed configuration model), each node is given a number of half-edges called tails and heads. Then tails and heads are matched uniformly at random to form directed edges. The size of the giant component, the typical distance, and the diameter of DCM have been studied mathematically. There also has been extensive research on random walks on DCM. Some real-world complex networks have been modelled by DCM, such as neural networks, finance and social networks.References
{{Reflist Networks