HOME

TheInfoList



OR:

The webgraph describes the directed links between pages of the
World Wide Web The World Wide Web (WWW or simply the Web) is an information system that enables Content (media), content sharing over the Internet through user-friendly ways meant to appeal to users beyond Information technology, IT specialists and hobbyis ...
. A
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discret ...
, in general, consists of several vertices, some pairs connected by edges. In a directed graph, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a
hyperlink In computing, a hyperlink, or simply a link, is a digital reference providing direct access to Data (computing), data by a user (computing), user's point and click, clicking or touchscreen, tapping. A hyperlink points to a whole document or to ...
on page X, referring to page Y.


Properties

* The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the
Erdős–Rényi model In the mathematical field of graph theory, the Erdős–Rényi model refers to one of two closely related models for generating random graphs or the evolution of a random network. These models are named after Hungarians, Hungarian mathematicians ...
: in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear, however: it is relatively well described by a
lognormal In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normal distribution, normally distributed. Thus, if the random variable is log-normally distributed ...
distribution, as well as the
Barabási–Albert model The Barabási–Albert (BA) model is an algorithm for generating random scale-free network, scale-free complex network, networks using a preferential attachment mechanism. Several natural and human-made systems, including the Internet, the Worl ...
for power laws. * The webgraph is an example of a
scale-free network A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as : P( ...
.


Applications

The webgraph is used for: * computing the
PageRank PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results. It is named after both the term "web page" and co-founder Larry Page. PageRank is a way of measuring the importance of website pages. Accordin ...
of the world wide web's pages; * computing the personalized PageRank; * detecting webpages of similar topics, through graph-theoretical properties only, like co-citation; * and identifying hubs and authorities in the web for
HITS algorithm Hyperlink-Induced Topic Search (HITS; also known as hubs and authorities) is a link analysis algorithm that rates Web pages, developed by Jon Kleinberg. The idea behind Hubs and Authorities stemmed from a particular insight into the creation of w ...
.


References

{{Reflist


External links


Webgraphs in Yahoo Sandbox

Webgraphs at University of Milano – Laboratory for Web Algorithmics

Webgraphs at Stanford – SNAP

Webgraph at the Erdős Webgraph Server

Web Data Commons - Hyperlink Graph
Internet search algorithms Application-specific graphs