Human Genome Work Reaches Milestone
By John Travis
It’s official. Biology’s hottest race has been declared an amicable tie, even though one competitor has a clear lead and neither has actually reached the finish line or knows exactly what the prize contains.
In an accomplishment being compared to landing a man on the moon, rival groups of scientists from the private and public sectors announced on June 26 that each has read essentially all of the 3 billion or so letters that spell out the human genome, the genetic information encoded within the 6 feet of DNA coiled up in every human cell.
“Today, we celebrate the revelation of the first draft of the human book of life,” said Francis Collins of the National Human Genome Research Institute in Bethesda, Md., at a White House celebration.
The announcement, considered premature by some scientists, nevertheless drew praise from leading biologists and government officials worldwide. Calling the reading of the genome “a stunning and humbling achievement,” President Clinton stressed its medical implications.
“With this profound new knowledge, humankind is on the verge of gaining immense new power to heal. Genome science will have a real impact on all of our lives—and even more, on the lives of our children. It will revolutionize the diagnosis, prevention, and treatment of most, if not all, human disease,” he said.
While scientists celebrated the genome announcement, some of them appealed for stronger federal legislation to protect people from genetic discrimination by their health insurance companies or employers. The unveiling of the human DNA sequence should offer a “wake-up call,” demonstrating that we can’t delay addressing this issue any longer, says Collins.
The world received its initial look at deoxyribonucleic acid, better known as DNA, more than a century ago. In 1869, Swiss scientist Friedrich Miescher isolated a novel chemical, now known to be DNA, from immune cells left in the pus on bloody bandages.
It took nearly a century for scientists to recognize DNA as the hereditary material of plants, animals, and microbes and determine that its molecular shape is a double helix reminiscent of a spiral staircase. The two sides of the helix consist of complementary strings of so-called bases, which come in four forms that geneticists abbreviate A, C, T, and G. One base joins with a partner on the opposite strand (A with T, or G with C) to create the steps on DNA’s staircase.
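The pairing rule described above can be written out in a few lines of Python. This is only an illustrative sketch: the function names `complement` and `reverse_complement` and the sample strand are made up for the example and do not come from either genome project.

```python
# Base-pairing rule from the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the partner base for each position of the given strand."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand):
    """Return the opposite strand read in its own direction.

    The two strands of the helix run in opposite directions, so the
    partner strand is conventionally written reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    sample = "GATTACA"  # hypothetical seven-base strand
    print(sample, "pairs with", complement(sample))
```

Because each base has exactly one partner, knowing one strand is enough to reconstruct the other, which is why sequencers need to read only one side of the staircase.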
The sequences of bases within a gene encode the information that a cell uses to build a protein. In the late 1980s, several scientists raised the provocative idea of sequencing all human genes as well as the even greater lengths of DNA in between—whose functions were, and still are, largely unknown.
“There were people who thought this was sheer lunacy,” recalls Collins.
This week’s announcement emerged from an uneasy truce forged between Celera Genomics, a biotech firm in Rockville, Md., and the Human Genome Project, a publicly funded, international consortium of scientists now led by Collins. The latter group, formed in 1990, had the task of sequencing the human genome largely to itself until 1998. Then, geneticist J. Craig Venter brashly predicted that his new company, Celera, would do the same job in less time and for less money (SN: 5/23/98, p. 334).
The competing camps took different approaches. The public effort, funded in large part by two U.S. agencies (the National Institutes of Health and the Department of Energy) and the Wellcome Trust, a British charity, first broke the genome into manageable chunks of DNA. The group then mapped the order of these pieces, called clones, with respect to each other. Only in the past few years have Collins and his colleagues focused on sequencing those clones.
In Venter’s more radical strategy, previously used to read the genes of many bacteria and the fruit fly, Celera shattered human DNA into snippets whose ends were immediately sequenced. The firm then identified overlapping base sequences among the DNA fragments. By analyzing an amount of DNA equivalent to many genomes, the scientists hoped to accumulate, and put in order, enough end sequences to construct an entire genome.
This method, called whole-genome shotgunning, is akin to shredding many copies of a book and reassembling one copy sentence by sentence, says Eric Lander, a geneticist at the Whitehead Institute for Biomedical Research in Cambridge, Mass. The clone-by-clone technique, he says, is more like first breaking a book into ordered chapters and then performing the shredding and assembly within those chapters.
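The idea of rebuilding a sequence from overlapping fragments can be illustrated with a toy Python sketch. It is not Celera's assembly software, whose details the article does not describe. It uses a simple greedy strategy (repeatedly merge the two fragments with the longest overlap), and the names `overlap` and `greedy_assemble`, the `min_len` threshold, and the three sample fragments are all assumptions made for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments greedily by longest overlap; return remaining pieces."""
    # Drop exact duplicates and fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps: the pieces stay separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


if __name__ == "__main__":
    # Hypothetical fragments cut from the made-up sequence ATGCGTACGTTAGC.
    reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
    print(greedy_assemble(reads))
```

In this small case the three fragments collapse back into a single sequence. When several fragments overlap equally well, a greedy merge like this one can join the wrong pair, which hints at why assembling a genome of 3 billion bases is far harder than the toy suggests.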
Most geneticists had considered a shotgun approach to the entire human genome to be too difficult. Human DNA, they argued, harbored many repetitive sequences, which would thwart any assembly attempts.
On June 23, Venter and his colleagues proved the skeptics wrong. That afternoon, after more than 500 million trillion comparisons of base sequences, Celera’s powerful supercomputers finished their first assembly of the human genome. The resulting sequence, approximately 3.12 billion bases long, spans more than 99.9 percent of the human genome, says Venter.
This assembled sequence sits in Celera’s databases, open to companies and schools that pay a subscription fee. The only academic subscriber so far, Vanderbilt University Medical Center in Nashville, expects to start accessing the data next week.
Although universities had feared that Celera would demand rights to patents or discoveries their professors make with the firm’s data, Vanderbilt did not grant the company any such options, stresses Lee E. Limbird, the center’s associate vice-chancellor for research.
Venter announced that Celera would make a limited version of its database available at no charge after the company publishes its results.
The results of the public genome effort flow daily into a database available free to researchers worldwide. In fact, Celera has made use of the public data in its own genome assembly.
The public version of the genome sequence represents a working draft in which scientists have put in order DNA fragments covering 97 percent of the genome and sequenced more than 85 percent of them to varying degrees of accuracy. The two groups plan to independently but simultaneously publish papers this fall that will provide additional details and perhaps set the stage for a joint conference in which scientists fully analyze both sets of data.
The effort to unravel the human genome does not offer as tangible a climax as an astronaut stepping onto the lunar surface. Some geneticists have even called this week’s announcement more a matter of public relations than a true finale to the genome effort.
“The point at which both Celera and the public project have chosen to announce the ‘completion’ of the genome is wholly arbitrary and does not correspond to any scientifically justifiable criterion for completion. The previously agreed criteria of few or no gaps, gaps of known size, and an error rate of 1 in 10,000 bases have clearly not been met by either group,” charges Philip P. Green of the University of Washington in Seattle.
Indeed, the two groups did not offer any details on how many holes remain in the two genome sequences or claim to meet the 99.99-percent target for accuracy.
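To put the accuracy criterion in perspective, a short Python calculation shows what an error rate of 1 in 10,000 bases would mean across the whole genome. The genome length is the article's round figure of about 3 billion bases, and the function names are made up for this sketch.

```python
GENOME_BASES = 3_000_000_000  # "3 billion or so letters," per the article
BASES_PER_ERROR = 10_000      # agreed criterion: 1 error in 10,000 bases


def expected_errors(n_bases, bases_per_error):
    """Number of erroneous bases expected at the given error rate."""
    return n_bases / bases_per_error


def accuracy_percent(bases_per_error):
    """Accuracy, in percent, implied by one error per bases_per_error bases."""
    return 100 * (1 - 1 / bases_per_error)


if __name__ == "__main__":
    errors = expected_errors(GENOME_BASES, BASES_PER_ERROR)
    print(f"Errors allowed genome-wide: {errors:,.0f}")
    print(f"Implied accuracy: {accuracy_percent(BASES_PER_ERROR):.2f} percent")
```

Even a sequence that met the 99.99-percent target would therefore still contain roughly 300,000 incorrect bases.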
“We should not be satisfied with a book of life that has gaps and errors in it,” concedes Collins, emphasizing that the public effort will continue to improve the quality of its genome information. Venter is also careful to call Celera’s work the “first assembly” of the human genome, implying that better versions will come.
Illustrating the large gaps in knowledge that remain, neither Celera nor the international consortium answered the question of how many genes people have. In the June Nature Genetics, three research teams using different methods offered predictions that range from 30,000 to 120,000 human genes. Several scientists have established a pool in which a person can record a prediction for a $1 wager.
To aid their analysis of the human genome, biologists plan to sequence all the DNA of several more animals. Celera has already moved on to the mouse genome. The rat and the zebrafish genomes should follow quickly, adds Collins.
Also, some scientists plan to map all the interactions between the many thousands of proteins encoded by the human genome, and others intend to determine the three-dimensional structure of each molecule.
Norton Zinder, a geneticist at Rockefeller University in New York, recently offered an elegant analogy for how the human DNA sequence merely sets the stage for future research. In the June 12 New Yorker, he compared the revealing of the genome to the 1543 publication of the first book on human anatomy.
Even though that book identified almost all the parts of the human body, physicians today still struggle to understand how many of them work and interact. A similarly daunting task—one that scientists say may also take several centuries to complete—now awaits those who seek to make sense of the myriad genes of the human genome.