Google matrix

A Google matrix is a particular stochastic matrix that is used by Google's PageRank algorithm. The matrix represents a graph with edges representing links between pages. The PageRank of each page can then be generated iteratively from the Google matrix using the power method. However, in order for the power method to converge, the matrix must be stochastic, irreducible and aperiodic.


Adjacency matrix ''A'' and Markov matrix ''S''

In order to generate the Google matrix ''G'', we must first generate an adjacency matrix ''A'' which represents the relations between pages or nodes. Assuming there are ''N'' pages, we can fill out ''A'' by doing the following:

# A matrix element A_{ij} is filled with 1 if node ''j'' has a link to node ''i'', and 0 otherwise; this is the adjacency matrix of links.
# A related matrix ''S'', corresponding to the transitions in a Markov chain of the given network, is constructed from ''A'' by dividing the elements of column ''j'' by k_j = \sum_{i=1}^{N} A_{ij}, where k_j is the total number of outgoing links from node ''j'' to all other nodes. Columns with only zero elements, corresponding to dangling nodes, are replaced by the constant value ''1/N''. This procedure adds a link from every sink (dangling state) to every other node.
# By construction, the sum of all elements in any column of matrix ''S'' is equal to unity. In this way the matrix ''S'' is mathematically well defined and belongs to the class of Markov chains and the class of Perron–Frobenius operators. That makes ''S'' suitable for the PageRank algorithm.
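
The two steps above can be sketched in a few lines of Python with NumPy; the function name and the tiny 3-node network are illustrative, not part of any standard API:

```python
import numpy as np

def markov_matrix(A):
    """Build the column-stochastic matrix S from adjacency matrix A.

    A[i, j] = 1 if node j links to node i, else 0.
    Dangling columns (no outgoing links) are replaced by 1/N.
    """
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    k = A.sum(axis=0)              # k_j: number of outgoing links of node j
    S = np.zeros_like(A)
    for j in range(N):
        if k[j] > 0:
            S[:, j] = A[:, j] / k[j]
        else:                      # dangling node: uniform column
            S[:, j] = 1.0 / N
    return S

# Tiny 3-node example: 0 -> 1, 1 -> 0, 1 -> 2; node 2 is dangling.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 1, 0]])
S = markov_matrix(A)
print(S.sum(axis=0))               # each column sums to 1
```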


Construction of Google matrix ''G''

Then the final Google matrix ''G'' can be expressed via ''S'' as:

: G_{ij} = \alpha S_{ij} + (1-\alpha) \frac{1}{N} \;\;\;\;\;\;\;\;\;\;\; (1)

By construction, the sum of all non-negative elements inside each matrix column is equal to unity. The numerical coefficient \alpha is known as a damping factor. Usually ''S'' is a sparse matrix, and for modern directed networks it has only about ten nonzero elements per line or column; thus only about 10''N'' multiplications are needed to multiply a vector by the matrix ''G''.
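
The sparsity claim can be exploited directly: since the (1-\alpha)/N part of Eq.(1) is a rank-one term, a product G v can be computed from the sparse ''S'' alone without ever forming the dense ''G''. A minimal sketch (the function name and example matrix are illustrative):

```python
import numpy as np
import scipy.sparse as sp

def google_matvec(S, v, alpha=0.85):
    """Compute G @ v with G_ij = alpha*S_ij + (1-alpha)/N  (Eq. 1),
    without forming the dense matrix G: the uniform term is rank one,
    so it contributes (1-alpha)*sum(v)/N to every component."""
    N = S.shape[0]
    return alpha * (S @ v) + (1.0 - alpha) * v.sum() / N

# Column-stochastic S for a 3-node network (node 2 dangling -> uniform column).
S = sp.csc_matrix(np.array([[0.0, 0.5, 1/3],
                            [1.0, 0.0, 1/3],
                            [0.0, 0.5, 1/3]]))
v = np.full(3, 1/3)
dense_G = 0.85 * S.toarray() + 0.15 / 3
print(np.allclose(google_matvec(S, v), dense_G @ v))   # True
```

For a network with about ten links per node, each such product costs roughly 10''N'' multiplications, as noted above.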


Examples of Google matrix

An example of the matrix ''S'' construction via Eq.(1) within a simple network is given in the article CheiRank. For the actual matrix, Google uses a damping factor \alpha around 0.85. The term (1-\alpha) gives a surfer the probability to jump randomly to any page. The matrix ''G'' belongs to the class of Perron–Frobenius operators of Markov chains. Examples of the Google matrix structure are shown in Fig.1 for the Wikipedia articles hyperlink network in 2009 at small scale and in Fig.2 for the University of Cambridge network in 2006 at large scale.
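
The power-method iteration mentioned in the lead can be sketched as follows; the 3-node matrix, tolerance, and function name are made up for the example:

```python
import numpy as np

def pagerank(G, tol=1e-10, max_iter=1000):
    """Power method: repeatedly apply the column-stochastic matrix G
    to a uniform start vector until it converges to the eigenvector
    with eigenvalue 1, i.e. the PageRank vector."""
    N = G.shape[0]
    p = np.full(N, 1.0 / N)
    for _ in range(max_iter):
        p_next = G @ p
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p_next

# Example: 3-node network with damping alpha = 0.85 (Eq. 1).
S = np.array([[0.0, 0.5, 1/3],
              [1.0, 0.0, 1/3],
              [0.0, 0.5, 1/3]])
G = 0.85 * S + 0.15 / 3
p = pagerank(G)
print(p.sum())   # PageRank probabilities sum to 1
```

With \alpha = 0.85 the spectral gap makes this iteration converge in a few dozen steps, consistent with the roughly 50 multiplications quoted below.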


Spectrum and eigenstates of ''G'' matrix

For 0 < \alpha < 1 there is only one maximal eigenvalue \lambda = 1, with a corresponding right eigenvector whose non-negative elements P_i can be viewed as a stationary probability distribution. These probabilities, ordered by decreasing value, give the PageRank vector P_i with the PageRank index K_i used by Google search to rank webpages. Usually one has for the World Wide Web that P \propto 1/K^{\beta} with \beta \approx 0.9. The number of nodes with a given PageRank value scales as N_P \propto 1/P^{\nu} with the exponent \nu = 1 + 1/\beta \approx 2.1 (this follows from the first scaling law: inverting P \propto 1/K^{\beta} gives K \propto P^{-1/\beta}, so N_P = |dK/dP| \propto 1/P^{1+1/\beta}). The left eigenvector at \lambda = 1 has constant matrix elements. For 0 < \alpha all eigenvalues move as \lambda_i \rightarrow \alpha \lambda_i except the maximal eigenvalue \lambda = 1, which remains unchanged. The PageRank vector varies with \alpha, but the other eigenvectors with \lambda_i < 1 remain unchanged due to their orthogonality to the constant left vector at \lambda = 1. The gap 1 - \alpha \approx 0.15 between \lambda = 1 and the other eigenvalues gives rapid convergence of a random initial vector to the PageRank, after approximately 50 multiplications by the matrix ''G''.

At \alpha = 1 the matrix ''G'' generally has many degenerate eigenvalues \lambda = 1. Examples of the eigenvalue spectrum of the Google matrix of various directed networks are shown in Fig.3 and Fig.4. The Google matrix can also be constructed for the Ulam networks generated by the Ulam method for dynamical maps; in a number of cases the spectrum of such matrices is described by the fractal Weyl law. The Google matrix can also be constructed for other directed networks, e.g. for the procedure call network of the Linux Kernel software; in this case the spectrum of \lambda is described by the fractal Weyl law with the fractal dimension d \approx 1.3 (see Fig.5).
Numerical analysis shows that the eigenstates of the matrix ''G'' are localized (see Fig.6). The Arnoldi iteration method allows one to compute many eigenvalues and eigenvectors for matrices of rather large size. Other examples of the ''G'' matrix include the Google matrix of the brain and of business process management. Applications of Google matrix analysis to DNA sequences have also been described. Such a Google matrix approach also allows one to analyze the entanglement of cultures via the ranking of multilingual Wikipedia articles about persons.
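
The Arnoldi-based computation of the leading part of the spectrum can be sketched with SciPy's ARPACK wrapper `scipy.sparse.linalg.eigs` (an implicitly restarted Arnoldi method); the random test network and all parameter values here are illustrative:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Random sparse directed network (illustrative), N nodes.
N, alpha = 300, 0.85
A = sp.random(N, N, density=0.03, format='csc', random_state=1,
              data_rvs=lambda size: np.ones(size))

k = np.asarray(A.sum(axis=0)).ravel()          # out-degrees k_j
inv_k = np.divide(1.0, k, out=np.zeros_like(k), where=k > 0)
S = A @ sp.diags(inv_k)                        # normalized; dangling cols stay 0
dangling = (k == 0)

def matvec(v):
    # G v, with dangling columns treated as uniform 1/N and the
    # (1-alpha)/N damping term applied as a rank-one correction.
    v = np.asarray(v).ravel()
    return alpha * (S @ v + v[dangling].sum() / N) + (1 - alpha) * v.sum() / N

G_op = spla.LinearOperator((N, N), matvec=matvec, dtype=float)
eigvals = spla.eigs(G_op, k=6, which='LM', return_eigenvectors=False)
print(np.max(np.abs(eigvals)))                 # leading eigenvalue: close to 1
```

Because only matrix-vector products are needed, this scales to networks far too large for dense eigendecomposition.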


Historical notes

The Google matrix with damping factor was described by Sergey Brin and Larry Page in 1998; see also the articles on the history of PageRank.


See also

* CheiRank
* Arnoldi iteration
* Markov chain
* Transfer operator
* Perron–Frobenius theorem
* Web search engines




External links


* Google matrix at Scholarpedia
* Google PR Shut Down
* Video lectures at IHES Workshop "Google matrix: fundamental, applications and beyond", Oct 2018