In the study of

graphs Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...

and

networks Network, networking and networked may refer to: Science and technology * Network theory, the study of graphs as a representation of relations between discrete objects * Network science, an academic field that studies complex networks Mathematics ...

, the

degree Degree may refer to: As a unit of measurement * Degree (angle), a unit of angle measurement ** Degree of geographical latitude ** Degree of geographical longitude * Degree symbol (°), a notation used in science, engineering, and mathematics ...

of a node in a network is the number of connections it has to other nodes and the degree distribution is the probability distribution of these degrees over the whole network.

Definition

The

of a node in a network (sometimes referred to incorrectly as the

connectivity Connectivity may refer to: Computing and technology * Connectivity (media), the ability of the social media to accumulate economic capital from the users connections and activities * Internet connectivity, the means by which individual terminal ...

) is the number of connections or edges the node has to other nodes. If a network is

directed Director may refer to: Literature * ''Director'' (magazine), a British magazine * ''The Director'' (novel), a 1971 novel by Henry Denker * ''The Director'' (play), a 2000 play by Nancy Hasty Music * Director (band), an Irish rock band * ''D ...

, meaning that edges point in one direction from one node to another node, then nodes have two different degrees, the in-degree, which is the number of incoming edges, and the out-degree, which is the number of outgoing edges. The degree distribution ''P''(''k'') of a network is then defined to be the fraction of nodes in the network with degree ''k''. Thus if there are ''n'' nodes in total in a network and ''n''_''k'' of them have degree ''k'', we have

P(k) = \frac

. The same information is also sometimes presented in the form of a ''cumulative degree distribution'', the fraction of nodes with degree smaller than ''k'', or even the ''complementary cumulative degree distribution'', the fraction of nodes with degree greater than or equal to ''k'' (1 - ''C'') if one considers ''C'' as the ''cumulative degree distribution''; i.e. the complement of ''C''.

Observed degree distributions

The degree distribution is very important in studying both real networks, such as the

Internet The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a '' network of networks'' that consists of private, pub ...

and

social networks A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for a ...

, and theoretical networks. The simplest network model, for example, the (Erdős–Rényi model)

random graph In mathematics, random graph is the general term to refer to probability distributions over graphs. Random graphs may be described simply by a probability distribution, or by a random process which generates them. The theory of random graphs ...

, in which each of ''n'' nodes is independently connected (or not) with probability ''p'' (or 1 − ''p''), has a binomial distribution of degrees ''k'': :

P(k) =  p^k (1 - p)^,

(or Poisson in the limit of large ''n'', if the average degree

\langle k\rangle=p(n-1)

is held fixed). Most networks in the real world, however, have degree distributions very different from this. Most are highly right-skewed, meaning that a large majority of nodes have low degree but a small number, known as "hubs", have high degree. Some networks, notably the Internet, the

World Wide Web The World Wide Web (WWW), commonly known as the Web, is an information system enabling documents and other web resources to be accessed over the Internet. Documents and downloadable media are made available to the network through web ...

, and some social networks were argued to have degree distributions that approximately follow a power law:

P(k)\sim k^

, where ''γ'' is a constant. Such networks are called

scale-free networks A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as : P(k ...

and have attracted particular attention for their structural and dynamical properties. However, a survey of a wide range of real world networks suggests that scale-free networks are rare when assessed using statistically rigorous measures. Some researchers have disputed these findings arguing that the definitions used in the study are inappropriately strict, while others have argued that the precise functional form of the degree distribution is less important than knowing whether the degree distribution is fat-tailed or not. The over-interpretation of specific forms of the degree distribution has also been criticised for failing to consider how networks may evolve over time.

Excess degree distribution

Excess degree distribution is the probability distribution, for a node reached by following an edge, of the number of other edges attached to that node. In other words, it is the distribution of outgoing links from a node reached by following a link. Suppose a network has a degree distribution

P(k)

, by selecting one node (randomly or not) and going to one of its neighbors (assuming to have one neighbor at least), then the probability of that node to have

k

neighbors is not given by

P(k)

. The reason is that, whenever some node is selected in a heterogeneous network, it is more probable to reach the hubs by following one of the existing neighbors of that node. The true probability of such nodes to have degree

k

q(k)

which is called the ''excess degree'' of that node. In the

configuration model In network science, the configuration model is a method for generating random networks from a given degree sequence. It is widely used as a reference model for real-life social networks, because it allows the modeler to incorporate arbitrary degre ...

, which correlations between the nodes have been ignored and every node is assumed to be connected to any other nodes in the network with the same probability, the excess degree distribution can be found as:

q(k) =  \fracP(k+1),

where

is the mean-degree (average degree) of the model. It follows to that fact that the average degree of the neighbor of any node is greater than the average degree of that node. In social networks, it mean that your friends, on average, have more friends than you. This is famous as the friendship paradox. It can be shown that a network can have a

giant component In network theory, a giant component is a connected component of a given random graph that contains a finite fraction of the entire graph's vertices. Giant component in Erdős–Rényi model Giant components are a prominent feature of the Erd� ...

, if its average excess degree is larger than one:

\sum_k kq(k) > 1 \Rightarrow   / - 1 >1 \Rightarrow  -2>0

Bear in mind that the last two equations are just for the

and to derive the excess degree distribution of a real-word network, we should also add degree correlations into account.

The Generating Functions Method

Generating functions In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary series ...

can be used to calculate different properties of random networks. Given the degree distribution and the excess degree distribution of some network,

P(k)

and

q(k)

respectively, it is possible to write two power series in the following forms:

G_0(x) = \textstyle \sum_ \displaystyle P(k)x^k

and

G_1(x) = \textstyle \sum_ \displaystyle q(k)x^k = \textstyle \sum_ \displaystyle \fracP(k)x^

G_1(x)

can also be obtained from derivatives of

G_0(x)

G_1(x) = \frac

If we know the generating function for a probability distribution

P(k)

then we can recover the values of

P(k)

by differentiating:

P(k) = \frac \biggl \vert _

Some properties, e.g. the moments, can be easily calculated from

G_0(x)

and its derivatives: *

= G'_0(1)

= G''_0(1) + G'_0(1)

And in general: *

= \Biggl_0(x)\Biggl

For Poisson-distributed random networks, such as the ER graph,

G_1(x) = G_0(x)

, that is the reason why the theory of random networks of this type is especially simple. The probability distributions for the 1st and 2nd-nearest neighbors are generated by the functions

G_0(x)

and

G_0(G_1(x))

. By extension, the distribution of

m

-th neighbors is generated by:

G_0\bigl(G_1(...G_1(x)...)\bigr)

, with

m-1

iterations of the function

G_1

acting on itself. The average number of 1st neighbors,

c_1

, is

= , _

and the average number of 2nd neighbors is:

= G_1'(1)G'_0\big(G_1(1)\big) = G_1'(1)G'_0(1) = G''_0(1)

Degree distribution for directed networks

In a directed network, each node has some in-degree

k_

and some out-degree

k_

which are the number of links which have run into and out of that node respectfully. If

P(k_, k_)

is the probability that a randomly chosen node has in-degree

k_

and out-degree

k_

then the generating function assigned to this joint probability distribution can be written with two valuables

x

and

y

as:

\mathcal(x,y) =  \sum_ \displaystyle P()x^y^ .

Since every link in a directed network must leave some node and enter another, the net average number of links entering a node is zero. Therefore,

\langle\rangle =\sum_ \displaystyle (k_-k_)P() = 0

, which implies that, the generation function must satisfy:

\vert _ =  \vert _ = c,

where

c

is the mean degree (both in and out) of the nodes in the network;

\langle\rangle = \langle\rangle = c.

Using the function

\mathcal(x,y)

, we can again find the generation function for the in/out-degree distribution and in/out-excess degree distribution, as before.

G^_0(x)

can be defined as generating functions for the number of arriving links at a randomly chosen node, and

G^_1(x)

can be defined as the number of arriving links at a node reached by following a randomly chosen link. We can also define generating functions

G^_0(y)

and

G^_1(y)

for the number leaving such a node: *

G^_0(x) = \mathcal(x,1)

G^_1(x) =  \frac \vert _

G^_0(y) = \mathcal(1,y)

G^_1(y) =  \frac \vert _

Here, the average number of 1st neighbors,

c

, or as previously introduced as

c_1

, is

\biggl \vert _ =  \biggl \vert _

and the average number of 2nd neighbors reachable from a randomly chosen node is given by:

c_2 = G_1'(1)G'_0(1) =\biggl \vert _

. These are also the numbers of 1st and 2nd neighbors from which a random node can be reached, since these equations are manifestly symmetric in

x

and

y

Degree distribution for signed networks

In a signed network, each node has a positive-degree

k_

and a negative degree

k_

which are the positive number of links and negative number of links connected to that node respectfully. So

P(k_)

and

P(k_)

denote negative degree distribution and positive degree distribution of the signed network.

References

* * * {{cite journal , last=Newman , first=M. E. J. , title=The structure and function of complex networks , journal=SIAM Review , volume=45 , pages=167–256 , year=2003 , doi=10.1137/S003614450342480 , issue=2 , arxiv=cond-mat/0303516 , bibcode=2003SIAMR..45..167N Graph theory Graph invariants Network theory