The World Wide Web is less like a network of heavily interconnected superhighways and more like a jungle of one-way streets often leading to dead ends, researchers now report.
Previous studies had suggested a much higher degree of linking between Web sites. One analysis, for instance, had estimated that two randomly chosen pages on the Web are, on average, only 19 clicks away from each other (SN: 9/25/99, p. 203).
In developing what they describe as the first comprehensive “map” of the World Wide Web, scientists at the IBM Almaden Research Center in San Jose, Calif., and their collaborators explored about 200 million Web pages and checked 1.5 billion links from one page to another.
They discovered that four distinct types of pages make up the Web. About one-third of all Web pages constitute a strongly connected core, within which numerous links allow Web surfers to travel easily from page to page. Another one-quarter of the Web’s pages send users to core pages, but the core pages don’t link back to them. A further quarter can be reached from the core but don’t link back to it. The remainder of the Web is made up of “orphan” pages that are neither accessible from the connected core nor linked to it, the IBM team says.
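The four-way split described above can be illustrated on a toy link graph. The sketch below is not the IBM team's code. It assumes a hypothetical dictionary `links` mapping each page to the pages it links to, and a hypothetical `seed` page already known to sit in the core. A page counts as core if it can both reach the seed and be reached from it.

```python
from collections import deque


def reachable(adj, start):
    """Return every page reachable from start by following links in adj."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in adj.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(adj):
    """Build the graph with every link pointing the opposite way."""
    rev = {}
    for src, targets in adj.items():
        rev.setdefault(src, [])
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev


def classify_pages(links, seed):
    """Split pages into four groups relative to the core containing seed.

    Assumption: seed is a page known to lie in the strongly connected core.
    """
    forward = reachable(links, seed)            # pages the core can reach
    backward = reachable(reverse(links), seed)  # pages that can reach the core
    all_pages = set(links) | {d for targets in links.values() for d in targets}
    core = forward & backward
    return {
        "core": core,                            # mutually reachable pages
        "in": backward - core,                   # link into the core only
        "out": forward - core,                   # reachable from the core only
        "other": all_pages - forward - backward, # cut off from the core
    }


# Hypothetical seven-page Web: a, b, c form a cycle (the core).
links = {
    "a": ["b"], "b": ["c"], "c": ["a", "d"],
    "e": ["a"],   # e links into the core
    "d": [],      # d is linked from the core but links nowhere
    "f": ["g"], "g": [],  # f and g have no connection to the core
}
groups = classify_pages(links, seed="a")
for name, pages in groups.items():
    print(name, sorted(pages))
```

On this toy graph, pages a, b, and c land in the core, e in the group that links in, d in the group that is linked out to, and f and g in the disconnected remainder. A real crawl would need a way to choose the seed, for example by first finding the largest strongly connected component.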
“Our study indicates that the macroscopic structure of the Web is considerably more intricate than suggested by earlier experiments on a smaller scale,” the researchers conclude. They reported their findings earlier this month in Amsterdam at the Ninth International World Wide Web Conference.