Reports of junk DNA’s ‘demise’ were based on junky logic and dubious definitions

Science is an oddly successful enterprise. On the whole, it provides an impressive guide to reality. From antibiotics and atomic bombs to laser beams and X-rays, science enables humans to forge powerful tools from nature’s secrets.

Yet many aspects of science are deeply flawed, from the politicization of research funding to widespread misuse of math in analyzing data.

In this respect science is not so different from human biology. Magnificent organisms capable of composing symphonies, calculating quantum energy levels and dunking basketballs are built from DNA molecules containing 90 percent junk.

At least that was the prevailing biological wisdom until last September. Previous studies of the human genome (the catalog of all human DNA and the genes it contains) showed that most DNA was junk, with no biological importance for the survival of the species. But then came a report in Nature from the ENCODE project (for ENCyclopedia of DNA Elements). It proclaimed that 80 percent of the human genome could be assigned a “biochemical function.” News reports heralded the demise of “junk DNA.”

Since then, though, some scientists have begun to analyze the ENCODE papers the way ENCODE analyzed the genome, and have reached an entirely different conclusion. Not only is most of the genome junk, it seems, so is the ENCODE analysis. It uses a questionable definition of function and commits various logical fallacies in applying it, contend Dan Graur of the University of Houston and several collaborators.

“ENCODE not only uses the wrong concept of functionality, it uses it wrongly and inconsistently,” Graur and colleagues assert. ENCODE’s position on the nonexistence of junk DNA “was mainly based on several logical misconceptions.”

Biologists have long known that DNA has important functions besides its main job of providing blueprints for proteins. By the traditional “selected effect” definition of function, not just protein-coding regions but all portions of DNA preserved by evolution to help an organism survive and reproduce would be considered functional. The rest is nonfunctional “junk.”

In contrast, some philosophers and a few biologists (and ENCODE) endorse a “causal effect” definition of function. By that definition, “causing” anything at all counts as a function. Because your heart thumps, for instance, making rhythmic sounds would be listed as one of its functions.

ENCODE found that 80 percent of the genome did something, such as providing a site where certain proteins could attach themselves. Sometimes, of course, the attachment of a protein does something important, such as turning on a gene. But lots of times it may do nothing else of value (like a heart that makes noise without pumping blood). Yet because some attachments have a function, ENCODE lists all such attachment sites as functional parts of the genome. That is a logical fallacy known as “affirming the consequent”: reasoning from “functional sites attach proteins” and “this site attaches a protein” to “this site is functional.”
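
For readers who like to see the logic spelled out, here is a minimal sketch of why affirming the consequent fails. It is an illustration only, not anything drawn from ENCODE or from Graur's paper, and the names `functional`, `binds_protein` and `counterexamples` are made up for the purpose. The code checks every combination of true and false and looks for a case in which both premises hold ("if a site is functional, it binds a protein" and "this site binds a protein") while the conclusion ("this site is functional") is false.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


def counterexamples():
    """Return each truth assignment where both premises hold but the conclusion fails."""
    found = []
    for functional, binds_protein in product([True, False], repeat=2):
        # Premise 1: if a site is functional, it binds a protein.
        # Premise 2: this site binds a protein.
        premises_hold = implies(functional, binds_protein) and binds_protein
        # Conclusion under test: this site is functional.
        if premises_hold and not functional:
            found.append((functional, binds_protein))
    return found


print(counterexamples())  # → [(False, True)]
```

The single counterexample is a site that binds a protein but is not functional, which is exactly the case the critics say ENCODE's tally cannot rule out.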

“The ENCODE authors applied this flawed reasoning to all their functions,” Graur and colleagues write in Genome Biology and Evolution. As a map of the genome, “ENCODE is considerably worse than even Apple Maps,” they conclude.

“It is safe to state that the news concerning the death of ‘junk DNA’ has been greatly exaggerated,” they write. “The vast majority of comparative genomic studies suggest that less than 15 percent of the genome is functional according to the evolutionary conservation criterion.”

Similar criticisms appear in a recent issue of the Proceedings of the National Academy of Sciences. If very little DNA is junk, asks W. Ford Doolittle, then why do so many other organisms possess so much more DNA than people do?

Lungfish, for instance, have something like 30 times as much DNA as humans, without any sign of more complex biological functioning.

“If the human genome is junk-free, then it must be very luckily poised at some sort of minimal size for organisms of human complexity,” writes Doolittle, of Dalhousie University in Halifax, Canada.

“We may no longer think that mankind is at the center of the universe, but we still consider our species’ genome to be unique,” he writes. Such “genomic anthropocentrism,” confusion over the meaning of “function” and questionable statistics all contributed to this “attempt to junk ‘junk,’ ” he adds.

In the end, of course, it really doesn’t matter so much whether all, most or only a little of DNA is junk. What’s more troubling is the presence of so much logical junk published in scientific journals. The caliber of scientific discourse has been degraded in the modern world of mass publishing. Numerous studies have documented poor performance by the peer review system for refereeing papers before publication. ENCODE’s flaws are not rare exceptions — logical lapses are common in scientific publications, as are questionable definitions and inappropriate use of statistics. As guardians of quality control, scientist referees make basketball officials look good.

Yet somehow science still succeeds in bringing more and more of nature under the umbrella of human understanding. Or at least it used to. Let’s hope science can continue to succeed, despite all the junk in its publicationome. 

Tom Siegfried is a contributing correspondent. He was editor in chief of Science News from 2007 to 2012 and managing editor from 2014 to 2017.