Biological Moon Shot
Realizing the dream of a Web page for every living thing
By Susan Milius
- More than 2 years ago
Richard Pyle hasn’t gotten a congratulatory crate of free diapers. But he’s one of the fathers, in a sense, of the first fish species named in 2008. Quintuplet species, even. The journal Zootaxa posted descriptions of five damselfish on Jan. 1 that Pyle and his colleagues at the Bishop Museum in Honolulu found using a specialized mix of gases to push beyond the depth limits of conventional SCUBA gear.
In a few weeks, the rest of us will be able to catch up on such frontiers of exploration a lot more easily. A sweeping informatics project called the Encyclopedia of Life is scheduled to launch its first trial entries on the Web in late February (SN: 5/12/07, p. 294). According to the plan, the encyclopedia portal will provide access to roughly 30,000 Web pages of specialists’ data, one page for each of the known species of fish.
And that’s just a baby step. Unveiled in May 2007, the Encyclopedia of Life project envisions such powerful tools for managing and centralizing biological information that a decade from now anyone visiting www.eol.org should find the Mother Nature of all encyclopedias: easy access to a Web page with definitive, current information on each species on Earth.
No one can say how many Web pages that total coverage will need. The encyclopedia’s godfather, biologist E.O. Wilson of Harvard University, speaks of 10 million. “It should be thought of as a biological moon shot,” he says.
He and his fellow encyclopedists argue that if they realize their ambitious dream, they’ll change the science of biology. They propose that the new informatics methods and centralized Web portal will speed up the old, underfunded business of figuring out what’s what (or who’s who) among living things. And the speedier tools will drive novel inquiries, including an expansion of the study of networks, such as food webs, and the search for evolutionary patterns.
Planners also hope it’s not just for science. Using the new tools to climb the tree of life should be fun, for scientists as well as for poets and plumbers and kids. If all goes according to plan, the Encyclopedia of Life will be cool.
Roots
Like flying to the moon, making one encyclopedia of all life is an old idea that technology might finally make possible.
The urge to produce an overarching view of living things goes back at least to Aristotle. Even the idea to make that long list in Latin with two names for each species goes back more than 250 years, to Carl Linnaeus’ foundations for biological nomenclature. Hence, Wilson wrote in an early proposal for the encyclopedia, people “assume taxonomy all but wound down generations ago.”
Not true. So far scientists have given formal names to only about 1.8 million species. Published estimates for the actual number of species on Earth range from 3.6 million to upwards of 100 million, numbers based on extrapolations and a fair bit of outright guesswork. In many ways, taxonomy has barely begun.
And while scientists have identified many of the largest and most obvious species, they very likely haven’t found the most important. Marine biologists didn’t describe the bacterial genus Prochlorococcus until 1988. Yet these picoplankton, barely visible with optical microscopy, floating in aquatic clouds and capturing the energy of sunlight through photosynthesis, account for a significant proportion of marine productivity.
Likewise, only a small percentage of the extraordinarily numerous hairlike roundworms called nematodes, which wriggle through soil or colonize plants and animals, have names.
Even with the biology you can see, scientists are playing catch-up. According to Wilson, the number of known frogs and other amphibian species has jumped from 4,000 to 5,400 over the past 15 years. New plant species continue to join the roster at a rate of about 2,000 a year.
These measures reflect only the first step of naming an organism. How it lives, what it eats or gets eaten by, and whether people might find it useful or dangerous or charismatic often remain unknown. Yet the growing human population redirects the fates of these species, pushing some toward new habitats and others toward extinction.
“We’re sailing blind into our environmental future,” Wilson told attendees at the 2007 TED conference, a gathering of luminaries in technology, entertainment, and design. Wilson’s pitch marked the opening night of the current effort to upgrade biological information tools.
After several years of behind-the-scenes campaigning, Wilson and other planners had secured seed money for the project: $10 million from the John D. and Catherine T. MacArthur Foundation and $2.5 million from the Alfred P. Sloan Foundation. A consortium of museums and other science institutions is organizing to get the job done.
Fish first
At one of those institutions, the Smithsonian in Washington, D.C., the encyclopedia’s executive director, James Edwards, is in charge of seeing that this moon shot doesn’t fizzle.
Sample encyclopedia Web pages show flashy images and videos plus links to the latest genetic sequences and a scan of the page of the book in which the first published description of a species appeared. Cool, yes, but time-consuming. Developing entries of that quality for millions of species will take years, and Edwards doesn’t want the world to lose interest in the meantime.
So, the encyclopedia will release something fast, but just a small something: a portal to basic info on fish. The creators will present the pages as a work in progress, soliciting user comments.
Visitors will be able to admire a portrait of the zebra turkeyfish and a map of its range in the Pacific, for example, or learn that the white-spotted boxfish typically frequents tropical waters 1 meter to 30 m deep. The modern Latin names will be paired with tables of common names in dozens of languages.
The fish information itself won’t be an encyclopedia creation. Instead, the informatics specialists are building a new portal to an existing site, called FishBase. This strategy illustrates how such a grand undertaking as the compendium of all living things might just be possible. The project won’t start from scratch with 10,000 taxonomists typing until they create an encyclopedia. Specialists have already made databases with reliable information, and the encyclopedia will provide a central entryway for using these trusted sources.
“Everybody wants his or her favorite organism there first,” says Edwards. “If you’re a leech lover, you want leeches. If you’re a spider lover, you want spiders.” What the encyclopedia crew is actually going to present next, with or just after the fish, are plants in the Solanaceae family, including tomatoes, peppers, petunias, tobaccos, and potatoes. “It’s timely, because 2008 is the International Year of the Potato,” says Edwards. (Not a joke. See “It’s Spud Time”.)
As the Encyclopedia of Life grows, its tools will capture the latest research to enrich those sources. Google-like aggregation technology will register new publications or gene sequences, for example, that appear on the Web.
“The most exciting thing about this project to me is that we have a blizzard of information coming at us all the time, and it’s not just in science, it’s everywhere,” says Mark Westneat of the encyclopedia group based at the Field Museum of Natural History in Chicago. Financiers monitoring markets and even travelers wondering whether to pack boots have some fine systems for sifting out the desired snowflakes from all the rest of the information. “Biologists are a little bit behind in informatics tools,” he says.
The fish segment illustrates another feature of the encyclopedia plan: the quality of sources. Westneat, who studies reef fishes, encountered FishBase in its larval stage at a biologists’ gathering in the Philippines in 1995. One of its originators, fish biologist Rainer Froese, brought an early version of this database and appealed to his colleagues to groom glitches out of it and supply photographs. “We grudgingly did so,” says Westneat. “We thought, ‘Oh, this will be nice for school kids and stuff, but I’ll never use it.’” Then heroic efforts by William Eschmeyer of the California Academy of Sciences in San Francisco standardized the taxonomy with up-to-date forms and lists of synonymous names. “All of a sudden, FishBase became this incredibly valuable resource,” Westneat says. “I use it every day.”
Such trustworthy information isn’t just swimming free in the seas. “A significant challenge facing the Encyclopedia of Life is engaging the scientific community to provide content,” says botanist Richard Ree of the Field Museum. “Similar initiatives have been tried in the past, and I think it’s safe to say that none met with resounding success.”
Ree does add that the project has advantages over previous proposals. The star power of E.O. Wilson and the TED conference attendees could catalyze interest from the corporate sector and allow access to its considerable experience in developing tools for managing computer information.
The encyclopedia planners are well aware of the need for active support from scientists, says Westneat. He leads a team focusing on how to make the encyclopedia so useful that scientists will decide that providing top-quality information is worth their time. “The scientific community is going to make the Encyclopedia of Life rich, and it’s going to make it correct,” he says. In turn, that gold-standard information should enrich the specialists’ pursuits.
If only
As an example of such a pursuit, Westneat describes the travails of Jennifer Fessler, one of his students, who has just finished revising the taxonomy of the gorgeous but confusing butterfly fish.
She discussed fish distribution, which meant refining maps of ranges for some 50 species. The Global Biodiversity Information Facility database let her download information on museum specimens worldwide to find collection spots for the coral reef fish. That resource certainly helped, but so far there isn’t a good automated way to check for typos in the latitude and longitude. Mappers like Fessler must slog through data looking for anomalies. “There’ll be this record of a coral reef fish in the middle of the Midwest,” Westneat says. Between proofing the locations and putting data into the right format, the work took Fessler months. “What if you could do that in a couple of minutes?” Westneat daydreams.
Parts of the job of revising or creating species names could get faster, but overall “it’s not something that can be done at the speed of light,” says Corrie Moreau of the University of California, Berkeley.
For example, Moreau is now considering whether small Pheidole hyatti ants, with their distinctive, large-headed soldiers, represent just one species or several. Yellowish individuals show up on desert floors, but a darker form dominates higher and shadier habitats. To sort out the problem, she and her collaborator, Stefan Cover of Harvard’s Museum of Comparative Zoology in Cambridge, study ants in the wild but also need lots of other resources. The project requires reviewing literature on the species and its relatives dating back at least 100 years, examining museum specimens and collecting new ones, and sequencing stretches of DNA.
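To make that daydream concrete, here is a minimal sketch in Python of the kind of automated check the paragraph above describes. It is an illustration under stated assumptions, not a tool that the encyclopedia or the Global Biodiversity Information Facility provides: the `Record` fields, the `flag_suspect_records` function, the sample coordinates, and the rough bounding box are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One museum specimen record (hypothetical, simplified fields)."""
    specimen_id: str
    lat: float  # decimal degrees, north positive
    lon: float  # decimal degrees, east positive


def flag_suspect_records(records, lat_bounds, lon_bounds):
    """Return (record, reason) pairs for coordinates that deserve a second look.

    lat_bounds and lon_bounds are (min, max) tuples describing a rough box
    around the area where the species is expected. A plain box like this
    cannot represent a range that crosses the 180-degree meridian.
    """
    suspects = []
    for rec in records:
        if not (-90 <= rec.lat <= 90 and -180 <= rec.lon <= 180):
            suspects.append((rec, "not a valid latitude/longitude"))
        elif not (lat_bounds[0] <= rec.lat <= lat_bounds[1]
                  and lon_bounds[0] <= rec.lon <= lon_bounds[1]):
            suspects.append((rec, "outside the expected range"))
    return suspects


# Made-up records for illustration only.
records = [
    Record("A", -17.5, 145.8),   # plausible reef location
    Record("B", 39.0, -98.0),    # middle of the Midwest
    Record("C", 145.8, -17.5),   # latitude and longitude swapped
]

for rec, reason in flag_suspect_records(records, (-30, 30), (90, 180)):
    print(rec.specimen_id, reason)
```

A check like this only narrows the pile. A person still has to decide whether a flagged record is a typo or a genuinely surprising find.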
Even though she expects systematics will always demand time, Moreau says she would welcome any streamlining that the Encyclopedia of Life could offer. She could untangle her ant puzzle faster if she had a central source for reviewing early descriptions, high-detail portraits of specimens, and new DNA work.
Her wish about the old publications is already, albeit slowly, in the process of coming true. Thomas Garnett of the Smithsonian’s National Museum of Natural History heads a scanning and digitization group of encyclopedia workers. They are cooperating with the Biodiversity Heritage Library, a project through which 10 major libraries are scanning and placing on the Web pages from volumes that describe species. Some 80 million pages come from publications old enough to be in the public domain, and the scanners are starting with those.
“The scanner is the person; the machine is the Scribe,” explains Martin Kalfatovic, Garnett’s colleague and a digitization expert, as the two tour the museum’s basement, which is well lit and disappointingly not heaped with cobwebby dino skulls. There, in a large, mostly empty room, is the scanner, a real person who sounds pretty sane for someone who turns 3,000 pages a day.
The scanner sits in front of the Scribe machine, a highly evolved computer desk with paired cameras and links to massive bibliographic databases, all inside a booth covered by black canvas. He deftly settles a thin entomology volume with sallow pages into a V-shaped cradle that keeps the book’s elderly spine from having to strain all the way open. A foot pedal sends a hovering glass cover down just so to flatten the half-open book pages. With a synchronized jachick, a pair of cameras shoots the two visible pages. Capturing the image, it turns out, is just the beginning. Software allows images to be corrected for off-kilter angles and other flaws and to be tied to catalog information in the databases.
As of Jan. 25, the project has scanned 3,661,118 pages, Garnett says. The project’s Web site (www.biodiversitylibrary.org) opens virtual access to a number of rare-book-room treasures: a 1484 guide to medicinal plants from Mainz, Germany, and Robert Thornton’s 1807 New Illustration of the Sexual System of Carolus von Linnaeus with full-page glamour portraits of flowers against moonlit rivers or other dramatic backgrounds.
Garnett points out that the century-old volumes of Biologia Centrali-Americana have also gone online. Both botanists and zoologists need this basic work when tracing the history of species descriptions. Yet, he says, “there are only two copies in Central America.”
In talking about the vital business of opening library resources to far-flung scientists, Garnett rolls his eyes at the mention of a specialized source for historians of science that has become one of the library’s most popular downloads: the 1904 treatise Ants and Some Other Insects: An Inquiry Into the Psychic Powers of These Animals.
Cruising
The broad appeal of psychic ants raises the point that this isn’t just about scientists. “The other audience we’re targeting is middle schoolers,” says Westneat. “They’re very quick. They’re interested. They’re also capable of handling complex ideas.” Plus, they’re agile surfers.
Again he draws on the experiences of FishBase. Useful as it is to ichthyologists, they account for only a small percentage of the visitors. Aquarium hobbyists, fishing enthusiasts, and just plain curious browsers click into the site from all over the world.
When the Encyclopedia of Life matures, Westneat says, he hopes that it, too, attracts what he calls “What’s in my backyard?” questions. Designers are working on ways that someone might see an orange butterfly in Chicago in June and then get the encyclopedia to display a gallery of photos of the likely species.
But that example barely touches the power of the Web. “Imagine all 2 million known species in this grand family tree of life,” says Westneat. “What if you could have that tree of life floating in space on your computer screen and zoom in on the birds and see a blackbird and a hummingbird and a hawk popping up on the branches, the way the restaurants pop up in Google Earth when you zoom in on Chicago? Just imagine the fun that middle school kids will have cruising around the tree of life and finding the narwhal and all the cool animals.”
Imagine the fun any of us would have. The curious might come upon the page for the deep blue chromis (Chromis abyssus) named by the Honolulu team. The damselfish and three of its recently discovered kin swim at depths of at least 85 m in a poorly understood habitat sometimes referred to as the coral-reef twilight zone. So C. abyssus has deep-blue spots as well as a deep habitat for a damselfish. It’s a gentle example of taxonomy humor, yet another frontier for Web surfers to explore.