In the mathematical field of

graph theory In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...

, the Erdős–Rényi model is either of two closely related models for generating random graphs or the

evolution of a random network Evolution of a random network is a dynamical process, usually leading to emergence of giant component accompanied with striking consequences on the network topology. To quantify this process, there is a need of inspection on how the size of the lar ...

. They are named after Hungarian mathematicians Paul Erdős and Alfréd Rényi, who first introduced one of the models in 1959, while Edgar Gilbert introduced the other model contemporaneously and independently of Erdős and Rényi. In the model of Erdős and Rényi, all graphs on a fixed vertex set with a fixed number of edges are equally likely; in the model introduced by Gilbert, also called the Erdős–Rényi–Gilbert model, each edge has a fixed probability of being present or absent, independently of the other edges. These models can be used in the probabilistic method to prove the existence of graphs satisfying various properties, or to provide a rigorous definition of what it means for a property to hold for

almost all In mathematics, the term "almost all" means "all but a negligible amount". More precisely, if X is a set, "almost all elements of X" means "all elements of X but those in a negligible subset of X". The meaning of "negligible" depends on the mathem ...

graphs.

Definition

There are two closely related variants of the Erdős–Rényi random graph model. Erdos generated network-p0

*In the

G(n,M)

model, a graph is chosen uniformly at random from the collection of all graphs which have

n

nodes and

M

edges. The nodes are considered to be labeled, meaning that graphs obtained from each other by permuting the vertices are considered to be distinct. For example, in the

G(3,2)

model, there are three two-edge graphs on three labeled vertices (one for each choice of the middle vertex in a two-edge path), and each of these three graphs is included with probability

\tfrac

. *In the

G(n,p)

model, a graph is constructed by connecting labeled nodes randomly. Each edge is included in the graph with probability

p

, independently from every other edge. Equivalently, the probability for generating each graph that has

n

nodes and

M

edges is

p^M (1-p)^.

The parameter

p

in this model can be thought of as a weighting function; as

p

increases from

0

1

, the model becomes more and more likely to include graphs with more edges and less and less likely to include graphs with fewer edges. In particular, the case

p=\tfrac

corresponds to the case where all

2^\binom

graphs on

n

vertices are chosen with equal probability. The behavior of random graphs are often studied in the case where

n

, the number of vertices, tends to infinity. Although

p

and

M

can be fixed in this case, they can also be functions depending on

n

. For example, the statement that almost every graph in

G(n,2\ln(n)/n)

is connected means that, as

n

tends to infinity, the probability that a graph on

n

vertices with edge probability

2\ln(n)/n

is connected tends to

1

Comparison between the two models

The expected number of edges in ''G''(''n'', ''p'') is

\tbinomp

, and by the

law of large numbers In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials sho ...

any graph in ''G''(''n'', ''p'') will almost surely have approximately this many edges (provided the expected number of edges tends to infinity). Therefore, a rough heuristic is that if ''pn''² → ∞, then ''G''(''n'',''p'') should behave similarly to ''G''(''n'', ''M'') with

M=\tbinom p

as ''n'' increases. For many graph properties, this is the case. If ''P'' is any graph property which is monotone with respect to the subgraph ordering (meaning that if ''A'' is a subgraph of ''B'' and ''A'' satisfies ''P'', then ''B'' will satisfy ''P'' as well), then the statements "''P'' holds for almost all graphs in ''G''(''n'', ''p'')" and "''P'' holds for almost all graphs in

G(n, \tbinom p)

" are equivalent (provided ''pn''² → ∞). For example, this holds if ''P'' is the property of being connected, or if ''P'' is the property of containing a

Hamiltonian cycle In the mathematical field of graph theory, a Hamiltonian path (or traceable path) is a path in an undirected or directed graph that visits each vertex exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a cycle that visits each vertex ...

. However, this will not necessarily hold for non-monotone properties (e.g. the property of having an even number of edges). In practice, the ''G''(''n'', ''p'') model is the one more commonly used today, in part due to the ease of analysis allowed by the independence of the edges.

Properties of ''G''(''n'', ''p'')

With the notation above, a graph in ''G''(''n'', ''p'') has on average

\tbinom p

edges. The distribution of the

degree Degree may refer to: As a unit of measurement * Degree (angle), a unit of angle measurement ** Degree of geographical latitude ** Degree of geographical longitude * Degree symbol (°), a notation used in science, engineering, and mathemati ...

of any particular vertex is binomial: :

P(\deg(v) = k) = p^k(1-p)^,

where ''n'' is the total number of vertices in the graph. Since :

P(\deg(v) = k) \to \frac \quad \text n \to \infty \text np = \text,

this distribution is Poisson for large ''n'' and ''np'' = const. In a 1960 paper, Erdős and Rényi The probability ''p'' used here refers there to

N(n) = \tbinom p

described the behavior of ''G''(''n'', ''p'') very precisely for various values of ''p''. Their results included that: *If ''np'' < 1, then a graph in ''G''(''n'', ''p'') will almost surely have no connected components of size larger than O(log(''n'')). *If ''np'' = 1, then a graph in ''G''(''n'', ''p'') will almost surely have a largest component whose size is of order ''n''^2/3. *If ''np'' → ''c'' > 1, where ''c'' is a constant, then a graph in ''G''(''n'', ''p'') will almost surely have a unique

giant component In network theory, a giant component is a connected component of a given random graph that contains a finite fraction of the entire graph's vertices. Giant component in Erdős–Rényi model Giant components are a prominent feature of the Erdő ...

containing a positive fraction of the vertices. No other component will contain more than O(log(''n'')) vertices. *If

p<\tfrac n

, then a graph in ''G''(''n'', ''p'') will almost surely contain isolated vertices, and thus be disconnected. *If

p>\tfrac n

, then a graph in ''G''(''n'', ''p'') will almost surely be connected. Thus

\tfrac n

is a sharp threshold for the connectedness of ''G''(''n'', ''p''). Further properties of the graph can be described almost precisely as ''n'' tends to infinity. For example, there is a ''k''(''n'') (approximately equal to 2log₂(''n'')) such that the largest

clique A clique ( AusE, CanE, or ), in the social sciences, is a group of individuals who interact with one another and share similar interests. Interacting with cliques is part of normative social development regardless of gender, ethnicity, or popula ...

in ''G''(''n'', 0.5) has almost surely either size ''k''(''n'') or ''k''(''n'') + 1. Thus, even though finding the size of the largest clique in a graph is

NP-complete In computational complexity theory, a problem is NP-complete when: # it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryin ...

, the size of the largest clique in a "typical" graph (according to this model) is very well understood. Edge-dual graphs of Erdos-Renyi graphs are graphs with nearly the same degree distribution, but with degree correlations and a significantly higher

clustering coefficient In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups ...

Relation to percolation

percolation theory In statistical physics and mathematics, percolation theory describes the behavior of a network when nodes or links are added. This is a geometric type of phase transition, since at a critical fraction of addition the network of small, disconnecte ...

one examines a finite or infinite graph and removes edges (or links) randomly. Thus the Erdős–Rényi process is in fact unweighted link percolation on the

complete graph In the mathematical field of graph theory, a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge. A complete digraph is a directed graph in which every pair of distinct vertices ...

. (One refers to percolation in which nodes and/or links are removed with heterogeneous weights as weighted percolation). As percolation theory has much of its roots in

physics Physics is the natural science that studies matter, its fundamental constituents, its motion and behavior through space and time, and the related entities of energy and force. "Physical science is that department of knowledge which rel ...

, much of the research done was on the lattices in Euclidean spaces. The transition at ''np'' = 1 from giant component to small component has analogs for these graphs, but for lattices the transition point is difficult to determine. Physicists often refer to study of the complete graph as a

mean field theory In physics and probability theory, Mean-field theory (MFT) or Self-consistent field theory studies the behavior of high-dimensional random ( stochastic) models by studying a simpler model that approximates the original by averaging over degrees of ...

. Thus the Erdős–Rényi process is the mean-field case of percolation. Some significant work was also done on percolation on random graphs. From a physicist's point of view this would still be a mean-field model, so the justification of the research is often formulated in terms of the robustness of the graph, viewed as a communication network. Given a random graph of ''n'' ≫ 1 nodes with an average degree

\langle k \rangle

. Remove randomly a fraction

1 - p'

of nodes and leave only a fraction

p'

from the network. There exists a critical percolation threshold

p'_c=\tfrac

below which the network becomes fragmented while above

p'_c

a giant connected component of order ''n'' exists. The relative size of the giant component, ''P''_∞, is given by :

P_\infty= p' -\exp(-\langle k \rangle P_\infty) \,

Caveats

Both of the two major assumptions of the ''G''(''n'', ''p'') model (that edges are independent and that each edge is equally likely) may be inappropriate for modeling certain real-life phenomena. Erdős–Rényi graphs have low clustering, unlike many social networks. Some modeling alternatives include Barabási–Albert model and Watts and Strogatz model. These alternative models are not percolation processes, but instead represent a growth and rewiring model, respectively.

History

The ''G''(''n'', ''p'') model was first introduced by Edgar Gilbert in a 1959 paper studying the connectivity threshold mentioned above. The ''G''(''n'', ''M'') model was introduced by Erdős and Rényi in their 1959 paper. As with Gilbert, their first investigations were as to the connectivity of ''G''(''n'', ''M''), with the more detailed analysis following in 1960.

Continuum limit representation of critical $G(n,p)$

A continuum limit of the graph was obtained when

p

is of order

1/n

. Specifically, consider the sequence of graphs

G_n:=G(n,1/n+\lambda n^)

for

\lambda\in \R

. The limit object can be constructed as follows: * First, generate a diffusion

W^\lambda(t):=W(t)+\lambda t - \frac

where

W

is a

standard Brownian motion In mathematics, the Wiener process is a real-valued continuous-time stochastic process named in honor of American mathematician Norbert Wiener for his investigations on the mathematical properties of the one-dimensional Brownian motion. It is o ...

. * From this process, we define the reflected process

R^\lambda(t):= W^\lambda(t) - \inf\limits_W^\lambda(s)

. This process can be seen as containing many successive excursion (not quite a Brownian excursion, see ). Because the drift of

W^\lambda

is dominated by

-\frac

, these excursions become shorter and shorter as

t\to +\infty

. In particular, they can be sorted in order of decreasing lengths: we can partition

\R

into intervals

(C_i)_

of decreasing lengths such that

R^\lambda

restricted to

C_i

is a Brownian excursion for any

i\in \N

. * Now, consider an excursion

(e(s))_

. Construct a random graph as follows: Brownian excursion with leaf and internal node identified

** Construct a real tree

T_e

(see Brownian tree). ** Consider a Poisson point process

\Xi

times \R_+

with unit intensity. To each point

(x,s)\in \Xi

such that

x\leq e(s)

, corresponds an underlying internal node and a leaf of the tree

T_e

. Identifying the two vertices, the tree

T_e

becomes a graph

\Gamma_e

Applying this procedure, one obtains a sequence of random infinite graphs of decreasing sizes:

(\Gamma_i)_

. The theorem states that this graph corresponds in a certain sense to the limit object of

G_n

n\to+\infty

References

Literature

* *

Weblinks

Video: Erdos-Renyi Random Graph
{{DEFAULTSORT:Erdos-Renyi model Random graphs Renyi model