Making Sense of the Web’s Structure

Over the last 20 years, the World Wide Web has grown from a modest research project into an immense, seemingly chaotic repository of information and a novel, wide-ranging medium of communication.

According to Kleinberg and Lawrence, a hub is a page that points to many authorities, whereas an authority is a page that is pointed to by many hubs (left). An alternative method to identify communities is to look for nodes for which the link density is greater among members than between members and the rest of the network (right). © Science

It’s also an intriguing type of network because it wasn’t explicitly engineered, unlike the electric power grid or telephone system of the past. The Web is the sum of billions of interconnected “pages,” created by the uncoordinated actions of tens of millions of individuals.

However, despite the decentralized way in which it has grown, the Web does have structure. Indeed, it displays a considerable degree of self-organization.

Early studies suggested that the Web contained a large, strongly connected core, in which every page could reach virtually every other page via a string of hyperlinks. This core contained the Web’s most prominent sites. It was then possible to characterize the remaining pages in terms of their relation to the core.
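To make the idea of a strongly connected core concrete, here is a minimal Python sketch. It assumes the Web is given as a dictionary mapping each page to the pages it links to; the names `reachable`, `strongly_connected_component`, and the toy page names are hypothetical and chosen only for illustration. It is not the method used in the early studies, just one simple way to find every page that can both reach and be reached from a chosen page.

```python
from collections import deque

def reachable(start, adjacency):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in adjacency.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def strongly_connected_component(page, links):
    """Pages that can reach `page` and can also be reached from it."""
    # Build the reversed graph so we can walk hyperlinks backward.
    reverse = {}
    for src, targets in links.items():
        for dst in targets:
            reverse.setdefault(dst, []).append(src)
    return reachable(page, links) & reachable(page, reverse)

# Toy Web (assumed data): a, b, c link in a cycle and form the core.
toy_web = {"a": ["b"], "b": ["c"], "c": ["a", "out"], "in": ["a"], "out": []}
core = strongly_connected_component("a", toy_web)
```

In this toy example, the page `in` links into the core but cannot be reached from it, and `out` can be reached from the core but has no path back, which is the kind of relation to the core that the remaining pages can be sorted by.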

Among the pioneering researchers to delve into the Web’s structure was computer scientist Jon Kleinberg of Cornell University. This month, Kleinberg received the prestigious Rolf Nevanlinna prize, awarded every 4 years by the International Mathematical Union for major contributions to the mathematical aspects of information science. In 2005, he was named a MacArthur fellow.

In a 2001 paper in Science, Kleinberg and Steve Lawrence of the NEC Research Institute described two ways to characterize the Web: in terms of hubs and authorities, and in terms of communities defined by the density of links among pages.
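The link-density idea of a community can be illustrated with a short Python sketch. It takes one simple reading of the definition, which is an assumption rather than the paper's exact formulation: a set of pages counts as a community if every member is linked to more pages inside the set than outside it, with link direction ignored. The function name `is_community` and the toy data are hypothetical.

```python
def is_community(members, links):
    """True if each member has more linked neighbors inside than outside."""
    members = set(members)
    # Treat hyperlinks as undirected connections between pages.
    neighbors = {}
    for src, targets in links.items():
        for dst in targets:
            if src != dst:
                neighbors.setdefault(src, set()).add(dst)
                neighbors.setdefault(dst, set()).add(src)
    for page in members:
        linked = neighbors.get(page, set())
        inside = len(linked & members)
        outside = len(linked - members)
        if inside <= outside:
            return False
    return True

# Toy Web (assumed data): a, b, c are tightly linked; x and y trail off.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["x"], "x": ["y"], "y": []}
tight_group = is_community({"a", "b", "c"}, toy_links)
loose_group = is_community({"c", "x"}, toy_links)
```

Here the group a, b, c passes the check, while the pair c, x fails because c has more connections outside the pair than inside it.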

“Analysis of the Web’s structure is leading to improved methods for accessing and understanding the available information, for example, through the design of better search engines, automatically compiled directories, focused search services, and content filtering tools,” Kleinberg and Lawrence wrote.

“The migration of communication and commerce to the Web is also altering information flow in the world,” they concluded. “We are only beginning to understand how link structure affects the visibility of Web sites. . . . But deeper analysis, exposing the structures of communities embedded in the Web, raises the prospect of bringing together individuals with common interests and lowering barriers to communication.”

In receiving the Nevanlinna prize, Kleinberg was cited for the development of the influential “hubs and authorities” algorithm, which ranks Web pages (nodes in a directed graph) by assigning an authority value and a hub value to each page.
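The mutual reinforcement between hubs and authorities can be sketched in a few lines of Python. This is a minimal illustration of the idea, not Kleinberg's published implementation: the function names `hits` and `normalize`, the fixed number of iterations, and the toy graph are all assumptions. Each page's authority value is the sum of the hub values of the pages pointing to it, each page's hub value is the sum of the authority values of the pages it points to, and both sets of values are rescaled after every round.

```python
import math

def normalize(scores):
    """Rescale scores so the sum of their squares is 1."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}

def hits(links, iterations=50):
    """links maps each page to the list of pages it points to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority value: sum of hub values of pages that link in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub value: sum of authority values of pages linked to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, ()))
                   for p in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth

# Toy graph (assumed data): three pages pointing at two others.
toy_graph = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"]}
hub, auth = hits(toy_graph)
best_authority = max(auth, key=auth.get)
```

In the toy graph, `a1` ends up with the highest authority value because all three linking pages point to it, and `h1` and `h2` score higher as hubs than `h3` because they point to both authorities.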

He was also commended for developing methods to find short chains in large social networks, techniques for modeling, identifying, and analyzing bursts in data streams, and theoretical models of community growth in social networks.

Now, social networking via the Web is a burgeoning business and an international phenomenon, from MySpace and Facebook to Orkut and LinkedIn, providing much material for further study.
