PageRank Computations

Meaning: The likelihood that a surfer, clicking randomly will arrive at any particular page. Also called the Random Surfer Model: What is the probability that a random surfer will land up on a page?

It is an algorithm used to assign relative importance to nodes of a graph, each of them having a set of inward and outward edges (like the web). PageRank is patented to Stanford University, while Google has the rights to exclusive use. These rights were bought by Google in 2005 for $336 million.

Essentially, if Page A links to Page B, Google counts it as a "vote'' for Page B.

BUT not at votes are equal! How important is the page voting for you? If my webpage has a link to your webpage, that is less important than nytimes.com having a link to your webpage! NYTimes transfers some of its "authority" to you.

How does NYTimes "distribute" its authority among the pages it links to? It just divides its own authority (or PageRank) equally among the pages it links to.

Consider the web site represented by the following graph, representing a set of web pages and the links interconnecting them:

Simplified Algorithm

Teleporting Algorithm

We saw that the "non-teleporting" algorithm for computing page-ranks can have problems in certain situations. To avoid these problems, google is using a "teleporting" algorithm for page-rank computations. In this way, 85% of the page-rank for any page is computed as in the non-teleporting way, while the rest 15% of the rank is coming from re-distributing the rank of the system. Intuitevely, in our "water-flow" model, we can think of that second 15% of the rank, as if it was water evaporating from the system, and then coming down to it again, equally in every node, as rain.

In the teporting way, the system of equations for the initial graph, becomes:

PR(A) = 0.85(PR(W)/2 + PR(B)/1 + PR(D)/1) + 0.15/5

PR(B) = 0.85(PR(A)/3) + 0.15/5

PR(C) = 0.85(PR(A)/3) + 0.15/5

PR(D) = 0.85(PR(A)/3 + PR(C)) + 0.15/5

PR(W) = 0.85(PR(W)/2) + 0.15/5