Several fundamental inventions have shaped Internet business models as we know them today: the selling of banner ads (credited to HotWired back in 1994), the keyword auction for displaying ads (invented by GoTo.com in 1999), and Google’s algorithm for ranking search results, known as PageRank (described in US Patent 6,285,999).
This patent was the foundation for Google, and enabled them to differentiate themselves from the well-established search engines of the time. Given its significance, I thought it was time that I got around to reading it, which I did this weekend.
Search engines, such as Google’s, “crawl” the web, grabbing copies of all the web pages that they can find, and following the links within them to find more web pages. Then they create an enormous index of all the information within the web pages. So, when you type in some keywords to search for, they look them up in the index, to find all possible matches, and then rank and order those matches such that the most likely ones appear in the first page of results. The PageRank algorithm supplies this ranking.
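To make the index-and-lookup step concrete, here is a minimal Python sketch of an inverted index. It is only an illustration of the general idea, not how Google's index works: the page contents, the example URLs, and the function names `build_index` and `search` are all made up for this post, and the ranking step is left out entirely.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it.

    `pages` is a hypothetical dict of {url: page text}, standing in for
    the copies of pages that a crawler would have collected.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs of pages that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the matches for the first keyword, then narrow them down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Toy data, invented for illustration.
pages = {
    "a.example": "cheap flights to Paris",
    "b.example": "Paris travel guide",
    "c.example": "cheap hotels",
}
index = build_index(pages)
matches = search(index, "cheap paris")
```

With this toy data, only `a.example` contains both keywords. A real engine would then have to decide the order in which to show the matches, which is where the ranking comes in.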
Essentially, their algorithm produces a scaled estimate of the probability that a web surfer ends up on a given page. If one page is better linked-to than another (based on the number of links from other well-linked-to pages), it gains a higher ranking. They describe how this can be estimated through iteratively multiplying a probability matrix with itself.
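The idea is easier to see in code. Below is a rough Python sketch of the "random surfer" calculation, written as a simple loop rather than with matrices. It is my own simplified version, not the formulation in the patent: the damping factor of 0.85 (the chance that the surfer follows a link rather than jumping to a random page), the fixed number of iterations, the handling of pages with no outgoing links, and the toy link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate the probability of a random surfer being on each page.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are assumed values for this sketch.
    """
    # Collect every page that appears as a source or a target of a link.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with a uniform guess

    for _ in range(iterations):
        # Every page gets a small baseline from surfers jumping at random.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web, invented for illustration: C is linked to by A, B and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy web, C comes out on top because three pages link to it, and A comes second because its single incoming link is from the highly ranked C. D, which nothing links to, ends up last. The ranks always add up to 1, so each one can be read as a probability.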
As I was reading this, I recalled a discussion that I had back in the late 90s with my then-housemate Brendan. We were discussing a reputation database, where people would recommend others whom they respected, based, I think, on a concept in David Brin’s book Earth. The solution to calculating these reputations was pretty much the same as Google’s method for PageRank. I’m not saying this to big-note myself, just to point out that since neither Brendan nor I had a PhD in database algorithms, and it took us five minutes to think up the solution, the algorithm is hardly rocket science.
Since then, Google’s gone on to greatness, and to produce many other patents. Today, PageRank is considered to be just one of hundreds of factors that go into ranking their results. However, it’s interesting to see how a simple invention (and a lot of hard work from talented people!) has been the basis for one of the most respected global companies.