Online comments maybe not total waste of time

Conversations on news sites show how information and ideas spread

There’s a science behind the comments on websites.

It’s actually quite predictable how much chatter a post on Slashdot or Wikipedia will attract, according to a new study of several websites with large user bases. And the thread of an online conversation, whether it sticks to the original topic or users comment on each other’s comments, can be modeled as a tree with discussions veering off on branches, researchers report online November 2 at arXiv.org.

The findings give hope to social scientists trying to understand broader phenomena, like how rumors about a candidate spread during a campaign or how information about street protests flows out of a country with state-controlled media.

“The fact we have good fits allows us to think people react to news in a universal way,” says study coauthor Vicenç Gómez of Radboud University in the Netherlands.

It’s tough to track information traveling by word of mouth. Studies that tried to track how chain letters propagate by email suffered from selection bias because they included only posts that had been “fossilized” on newsgroup archives. But online comments are a treasure trove of complete data trees.

“They provide a test tube where you can see everything,” says Stanford economist Ben Golub, who was not involved in the research. When people talk in person, “you can’t see if someone talked to their sister about whether Obama is an honest guy. Online, we see all the communication that happens.”

Researchers collected millions of comments in discussion boards from four websites: Wikipedia and the news aggregators Slashdot, Menéame (the Spanish version of Digg) and Barrapunto (the Spanish version of Slashdot). Slashdot (with the tagline “News for Nerds”) and its twin Barrapunto publish short news posts and allow users to comment on each story and on each other’s comments, giving conversations a “nested-tree” structure. Discussion pages linked to Wikipedia articles also have that structure, and a similar tree can be extracted from Menéame using comments tagged as responses to other comments.

Researchers sampled 50,000 comments at random from each dataset and analyzed whether each one was a reply to a news post or to another comment, and how many were replies to discussions that already had a crowd of comments.

The team found that in Menéame, the discussion tree is skinny: users tend to post directly in response to the news story and not to talk amongst themselves. But on Slashdot, a comment will often provoke a new thread of conversation, producing a tree with more branches. One can infer that Slashdot users leave more thoughtful comments than Menéame and Digg users do, stimulating more in-depth discussion, Gómez says.

“What we can say with Slashdot, there is rich structure in the comments,” says Gómez. “For Menéame, you mostly see people leave their opinion there and they don’t get answers.”

In general on these sites, the more comments a post already has, the more comments people will continue to post, a “popularity bias.” But the opposite was true in Wikipedia.

Once a Wikipedia discussion page has more than one comment, the chances of people continuing to comment fall. That reflects the fact that Wikipedia is goal-oriented, Golub says: once an issue has been addressed, that’s the end of the conversation.

The importance of the research is not that it reveals the psychology of Internet users, but that it’s possible to write down a simple model that captures how Internet users spread information.

“Some people say these models won’t work because people are just so complicated,” Golub says. “While that’s true, their behavior exhibits simple patterns.”

Finding patterns in communication doesn’t just show how funny captions about photos of cats leak into the public vernacular, Golub says. It’s also useful for social scientists interested in questions such as how people look to their coworkers to decide on a 401(k) plan, or how new technology spreads in developing countries. For example, how quickly will farmers adopt new technology if their neighbors are talking about it?

“There might be a hybrid seed that is better, but they need to see proof it’s better because it’s a life-or-death decision,” Golub says.