Name Voyager

Emily, Emma, Madison, Olivia, Hannah, Abigail, Isabella, Ashley, Samantha, and Elizabeth.

Jacob, Michael, Joshua, Matthew, Ethan, Andrew, Daniel, William, Joseph, and Christopher.

Recognize these names? According to the Social Security Administration (SSA), they were the most popular names given to babies in the United States in 2004. See http://www.ssa.gov/OACT/babynames/. The SSA database also lists the 1,000 most popular baby names for each year going back to 1880. You can track, for example, how the popularity of a name has changed over time.

The same data also appear at another Web site (http://www.babynamewizard.com/), but in a strikingly different form. Associated with a book titled The Baby Name Wizard by Laura Wattenberg, the Web site includes a visualization that makes exploring the data and looking for trends both fun and illuminating. The book itself provides all sorts of information about different names to help prospective parents find something suitable for their offspring.

The visualization, a Java program called NameVoyager, allows a visitor to zoom in on particular names and track their popularity over the years (http://www.babynamewizard.com/namevoyager/lnv0105.html).

The opening screen displays nearly 5,000 names, each one represented by a colored stripe (blue for male names, pink for female names). A stripe’s width is proportional to the name’s frequency (per million babies). Each segment of a stripe represents a particular name’s popularity in a given decade. Moving the cursor to any point in the display highlights a name, giving its popularity in a certain decade. You can readily find out, for example, that John ranked third in the 1930s.
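For readers who want to try the idea, here is a minimal sketch of how stripes whose widths are proportional to frequency could be stacked for a single decade. It is not NameVoyager's actual code. The function name, the placeholder names, and the frequencies are all made up for illustration.

```python
def stack_stripes(frequencies, display_height):
    """Stack stripes so each one's thickness is proportional to its frequency.

    frequencies: list of (name, per-million frequency) pairs for one decade.
    Returns a list of (name, bottom, top) positions in display units.
    """
    total = sum(freq for _, freq in frequencies)
    layout = []
    bottom = 0.0
    for name, freq in frequencies:
        thickness = freq * display_height / total
        layout.append((name, bottom, bottom + thickness))
        bottom += thickness
    return layout


# Hypothetical frequencies (per million babies), NOT real SSA figures.
SAMPLE_DECADE = [("NameA", 100), ("NameB", 300), ("NameC", 600)]

for name, bottom, top in stack_stripes(SAMPLE_DECADE, display_height=200):
    print(f"{name}: {bottom:.0f} to {top:.0f}")
```

Repeating the calculation for each decade and joining the matching segments would give the continuous stripes described above.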

By typing in the letters of a name, you can then isolate a particular stripe to see the associated name’s rise and fall in popularity over the decades. Eric, for instance, ranked 545 in the 1880s, 417 in the 1890s, 418 in the 1900s, 423 in the 1910s, 422 in the 1920s, 331 in the 1930s, 151 in the 1940s, 74 in the 1950s, 28 in the 1960s, 14 in the 1970s, 21 in the 1980s, 29 in the 1990s, and 61 in 2004.
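The same kind of lookup is easy to sketch for classroom use. The example below uses the ranks for Eric quoted above and the single rank for Emily that this article reports. The table layout and the function names are assumptions made for illustration, not part of NameVoyager or the SSA site.

```python
# Ranks for Eric as quoted in this article (1 = most popular).
ERIC_RANKS = {
    "1880s": 545, "1890s": 417, "1900s": 418, "1910s": 423, "1920s": 422,
    "1930s": 331, "1940s": 151, "1950s": 74, "1960s": 28, "1970s": 14,
    "1980s": 21, "1990s": 29, "2004": 61,
}

# Only one rank for Emily is quoted in the article, so this entry is partial.
NAME_TABLE = {
    "Eric": ERIC_RANKS,
    "Emily": {"1960s": 251},
}


def names_with_prefix(table, prefix):
    """Return the names that start with the typed letters, ignoring case."""
    typed = prefix.lower()
    return sorted(name for name in table if name.lower().startswith(typed))


def most_popular_period(ranks):
    """Return the period with the best (numerically lowest) rank."""
    return min(ranks, key=ranks.get)


print(names_with_prefix(NAME_TABLE, "E"))
print(names_with_prefix(NAME_TABLE, "er"))
print(most_popular_period(ERIC_RANKS))
```

Typing more letters narrows the list, just as it narrows the display to fewer stripes.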

Emily, the current top choice, hit its low point in the 1960s, when it ranked 251.

This information can be used in lots of different ways in the classroom, from analyzing and plotting data to tracking historical trends. The use of Roosevelt as a first name, for example, peaked in the 1910s at 121, but there was a second peak (though not as high) in the 1930s, probably associated with the presidency of Franklin Delano Roosevelt. Indeed, the name Franklin peaked at 71 in the 1930s.

The NameVoyager display was developed by Laura Wattenberg, with help from her husband, Martin Wattenberg, who specializes in data visualization at IBM Research in Cambridge, Mass. You can see a sampling of different visualization techniques that he has worked on at http://www.bewitched.com/research.html.

One of the most intriguing is something called “history flow” (http://www.research.ibm.com/history/). This technique provides a way to record contributions to a complex, evolving collaborative project—an extension of the tracking of changes that can occur when an essay is revised, an article edited, or a computer program updated and debugged, but in a much larger context.

For a project to which many individuals contributed, a “history flow” visualization provides readily discernible information on, for example, how much each author has contributed to the text. Was it largely the work of one individual, or were more than a handful of people involved? Were there distinct spurts of activity, or has the document grown smoothly over time?

In essence, the technique plots each version of a document as a color-coded line, with a different color for each author. As various authors add and delete material, changes in the length and color-coding of the line track how the document evolves. Text that remains unchanged from one point in time to another is connected, creating bands of color that form a timeline visualization. As in NameVoyager, it’s possible to highlight parts of the display, such as the contributions of an individual author. You can also pinpoint which parts have been changed most often and how old individual components are (the brightness of color depends on the age of the contribution).
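The bookkeeping behind such a display can be sketched in a few lines. This is not the researchers' implementation. It is a rough illustration that uses Python's standard difflib module to carry authorship forward for unchanged lines and credit new lines to whoever saved the revision. The authors and text are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def attribute_lines(revisions):
    """Track which author each line belongs to across successive revisions.

    revisions: list of (author, lines) pairs in chronological order.
    Returns, for each revision, a list of (line, owner) pairs.
    """
    history = []
    prev_lines, prev_owners = [], []
    for author, lines in revisions:
        # New or changed lines are credited to the author of this revision.
        owners = [author] * len(lines)
        matcher = SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged text keeps its original owner (one "band").
                owners[j1:j2] = prev_owners[i1:i2]
        history.append(list(zip(lines, owners)))
        prev_lines, prev_owners = lines, owners
    return history


# Hypothetical authors and text, invented for illustration only.
REVISIONS = [
    ("alice", ["Intro", "Body"]),
    ("bob", ["Intro", "New middle", "Body"]),
    ("carol", ["Intro", "New middle", "Rewritten body"]),
]

history = attribute_lines(REVISIONS)
final_share = Counter(owner for _, owner in history[-1])
print(history[-1])
print(final_share)
```

Drawing each revision as a column, coloring lines by owner, and connecting lines that carry over between columns would produce the bands of color.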

Wattenberg and his colleagues have applied the technique to visualizing the evolution of pages from the free online encyclopedia known as Wikipedia (http://www.wikipedia.org/). This encyclopedia is basically a community project, created by people from all over the world who come to the site and contribute to its contents.

The researchers note that most Wikipedia pages, particularly those on controversial topics, have been vandalized at some point in their history. Their visualizations indicate, however, that the vandalism is nearly always repaired very quickly—often so quickly that most users never see the damage.

It’s evident that pages on different topics vary in how often and how much they’re updated. Some pages grow gradually over time, while others grow in bursts. At the same time, most pages appear to keep growing, rather than stabilizing at a certain size.

Interestingly, some text persists for a very long time, giving its authors the satisfaction of knowing that their contributions have survived years of persistent editing.

Such results suggest intriguing lines of research, from understanding the frequency and timing of vandalism to identifying the characteristics that lead to high-quality pages.
