In a

connected Connected may refer to: Film and television * ''Connected'' (2008 film), a Hong Kong remake of the American movie ''Cellular'' * '' Connected: An Autoblogography About Love, Death & Technology'', a 2011 documentary film * ''Connected'' (2015 TV ...

graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...

, closeness centrality (or closeness) of a node is a measure of

centrality In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key ...

in a

network Network, networking and networked may refer to: Science and technology * Network theory, the study of graphs as a representation of relations between discrete objects * Network science, an academic field that studies complex networks Mathematics ...

, calculated as the reciprocal of the sum of the length of the shortest paths between the node and all other nodes in the graph. Thus, the more central a node is, the ''closer'' it is to all other nodes. Pathdegreeclosenessexampleedit

Closeness was defined by Bavelas (1950) as the

reciprocal Reciprocal may refer to: In mathematics * Multiplicative inverse, in mathematics, the number 1/''x'', which multiplied by ''x'' gives the product 1, also known as a ''reciprocal'' * Reciprocal polynomial, a polynomial obtained from another pol ...

of the farness, that is: :

C_B(x)= \frac.

where

d(y,x)

is the

distance Distance is a numerical or occasionally qualitative measurement of how far apart objects or points are. In physics or everyday usage, distance may refer to a physical length or an estimation based on other criteria (e.g. "two counties over"). ...

(length of the shortest path) between vertices

x

and

y

. This unnormalised version of closeness is sometimes known as status When speaking of closeness centrality, people usually refer to its normalized form which represents the average length of the shortest paths instead of their sum. It is generally given by the previous formula multiplied by

N-1

, where

N

is the number of nodes in the graph resulting in: :

C(x)= \frac.

The normalization of closeness simplifies the comparison of nodes in graphs of different sizes. For large graphs, the minus one in the normalisation becomes inconsequential and it is often dropped. As one of the oldest centrality measures, closeness is often given in general discussions of network centrality meaures in introductory texts or in articles comparing different centrality measures. The values produced by many centrality measaures can be highly correlated. In particular, closeness and

degree Degree may refer to: As a unit of measurement * Degree (angle), a unit of angle measurement ** Degree of geographical latitude ** Degree of geographical longitude * Degree symbol (°), a notation used in science, engineering, and mathematics ...

have been shown to be related in many networks through an approximate relationship :

\frac \approx \frac \ln(k_x) + \beta

where

k_x

is the degree of vertex

x

while

z

and β are parameters found by fitting closeness and degree to this formula. The ''z'' parameter represents the branching factor, the average degree of nodes (excluding the root node and leaves) of the shortest-path trees used to approximate networks when demonstrating this relationship. This is never an exact relationship but it captures a trend seen in many real-world networks. Closeness is related to other length scales used in network science. For instance, the average shortest path length

\langle \ell \rangle

, the average distance between vertices in a network, is simply the average of the inverse closeness values :

\langle \ell \rangle = \frac \sum_x \frac = \frac \sum_x \sum_y d(y,x)

. Taking distances ''from'' or ''to'' all other nodes is irrelevant in undirected graphs, whereas it can produce totally different results in

directed graph In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pa ...

s (e.g. a website can have a high closeness centrality from outgoing links, but low closeness centrality from incoming links).

Applications

Closeness is used in many different contexts. In

bibliometrics Bibliometrics is the use of statistical methods to analyse books, articles and other publications, especially in regard with scientific contents. Bibliometric methods are frequently used in the field of library and information science. Bibliom ...

closeness has been used to look at the way academics choose their journals and bibliographies in different fields or to measure the impact of an author on a field and their social capital. When used to select potential leads in customer data, closeness has been seen to lead to a significant gain in the success rate. The closeness of a city in an air transport network has been shown to be highly correlated with socio-economic indicators such as gross regional domestic product. Closeness has also been applied to biological networks where, for instance, this was used to identify more than 50% of the global regulators within the top 2% of the ranked genes or essential genes were found to have higher closeness than nonessential genes in protein-interaction networks. In a metabolic network the closeness of nodes can identify the most important metabolites.

In disconnected graphs

When a graph is not

strongly connected In the mathematical theory of directed graphs, a graph is said to be strongly connected if every vertex is reachable from every other vertex. The strongly connected components of an arbitrary directed graph form a partition into subgraphs that ...

, Beauchamp introduced in 1965 the idea of using the sum of reciprocal of distances, instead of the reciprocal of the sum of distances, with the convention

1/\infty=0

: :

H(x)= \sum_\frac.

Beauchamp's modification follows the (much later in time) general principle proposed by Marchiori and Latora (2000) that in graphs with infinite distances the harmonic mean behaves better than the arithmetic mean. Indeed, Bavelas's closeness can be described as the denormalized reciprocal of the arithmetic mean of distances, whereas Beauchamp's centrality is the reciprocal of the harmonic mean of distances. This idea has resurfaced several time in the literature, often without the normalization factor

n-1

: for undirected graphs under the name valued centrality by Dekker (2005) and under the name harmonic centrality by Rochat (2009); if was axiomatized by Garg (2009) and proposed once again later by Opsahl (2010). It was studied on general directed graphs by Boldi and Vigna (2014). This idea is also quite similar to market potential proposed in Harris (1954) which now often goes by the term market access.

Variants

Dangalchev (2006), in a work on network vulnerability proposes for undirected graphs a different definition: :

D(x)=\sum_\frac.

This definition is used effectively for disconnected graphs and allows to create convenient formulae for graph operations. For example: If a graph

G_1 + G_2

is created by linking node

p

of graph

G_1

to node

q

of graph

G_2

then the combined closeness is: :

D(G_1+G_2) = D(G_1) + D(G_2) + (1+D(p))(1+D(q));

if a graph

G_1 + G_2

is created by collapsing node

p

of graph

G_1

and node

q

of graph

G_2

into one node then the closeness is: :

D(G_1+G_2) = D(G_1) + D(G_2) + 2 D(p) D(q).

If graph

T(G)

is the thorn graph of graph

G

, which has

n

nodes, then

T(G)

closeness is: :

D(T(G)) = \frac D(G) + n.

The natural generalization of this definition is: :

D(x)=\sum_\ ,

where

\alpha

belongs to (0,1). As

\alpha

increases from 0 to 1, the generalized closeness changes from local characteristic (degree) to global (number of connected nodes). The ''information centrality'' of Stephenson and Zelen (1989) is another closeness measure, which computes the harmonic mean of the resistance distances towards a vertex ''x'', which is smaller if ''x'' has many paths of small resistance connecting it to other vertices. In the classic definition of the closeness centrality, the spread of information is modeled by the use of shortest paths. This model might not be the most realistic for all types of communication scenarios. Thus, related definitions have been discussed to measure closeness, like the random walk closeness centrality introduced by Noh and Rieger (2004). It measures the speed with which randomly walking messages reach a vertex from elsewhere in the graph. Hierarchical closeness of Tran and Kwon (2014) is an extended closeness centrality to deal still in another way with the limitation of closeness in graphs that are not strongly connected. The hierarchical closeness explicitly includes information about the range of other nodes that can be affected by the given node.

References

{{Reflist Graph invariants Network analysis

Applications

In disconnected graphs

Variants

See also

References