Missing Lincs

Lesser-known genetic material helps explain why humans are human

Nearly everybody knows that Frank Lloyd Wright designed Fallingwater, the house in Pennsylvania that sits above and appears to cascade into a waterfall. I.M. Pei’s glass pyramid at the Louvre in Paris is similarly famous. And Frank Gehry is widely known for the curvilinear, shining steel Walt Disney Concert Hall in Los Angeles.

Long strands of genetic material called lincRNAs (blue) may guide protein-DNA interactions, among other tasks. Sigrid Knemeyer
A long noncoding RNA called XIST is responsible for coating one chromosome (red) in female mammals (mouse cell shown), turning off that chromosome. genecleaner/Wikimedia
RNA DISTRACTORS MicroRNAs can glom on to the backs of messenger RNA and block protein production (left). But lincRNAs with the right binding sites (orange) can act as decoys, distracting microRNAs and allowing protein production at the ribosome to proceed (right). Sigrid Knemeyer
SEEING PATTERNS | Researchers recently identified 30 lincRNAs that appear to act as barriers preventing an embryonic stem cell from turning into various tissue types. Blue boxes indicate where the knockdown of a lincRNA led to genetic activity patterns characteristic of differentiation. M. Guttman et al/Nature 2011

But most people couldn’t name the contractors and subcontractors responsible for translating those great architects’ blueprints into solid structures.

Geneticists have the same problem. Details for erecting an organism’s structure are encoded within DNA, written in chemical subunits designated by the letters A, T, C and G. But it has been hard to say exactly who takes those details and oversees the construction of the organism from proteins and other molecular materials.

Only now have scientists begun identifying the previously invisible contractors who make sure that materials get where they are supposed to be and in the right order to build a human being or any other creature. Some of these little-known workers belong to a class of molecules called long intergenic noncoding RNAs.

Scientists used to think that these “lincRNAs” were worthless. As their name suggests, these molecules (at least 200 chemical letters long) do not encode information that the body’s manufacturing machinery can use to cobble together proteins. And the lincRNAs originate in what scientists used to view as barren wastelands between protein-coding genes. But new research is showing that these formerly underappreciated workers have important roles in projects both large and microscopic.

“They regulate every process under the sun,” says John Rinn, an RNA researcher at Harvard Medical School.

In the last few years, scientists have learned that lincRNAs, as well as other RNAs that are long and noncoding but not intergenic, perform a variety of jobs. Some serve as guides showing proteins where to go, while others tether proteins to different types of RNA, or to DNA. Some work as decoys, distracting regulatory molecules from their usual assignments. Some may even have multiple roles, all the while chattering away to other RNA within cells. (It is not idle gossip; RNA communication within cells may ward off diseases such as cancer.) And as the ultimate multitaskers, lincRNAs keep proper cellular development ticking along and help define what makes mice mice and people people.

New ‘genes’

LincRNAs are just one type of multitalented RNA that has until recently been undervalued. While DNA has been revered as a precious archive housing genetic secrets, RNA — a chemical cousin copied from DNA — has long been viewed as little more than an errand boy. The name of one important type of RNA that goes on to make proteins, messenger RNA, reflects that bias. Two other types of RNA, transfer RNA and ribosomal RNA, help read messenger RNA during protein creation, leaving many scientists to assume RNA’s only job was as a low-level employee on the protein production line.

That notion began to change when researchers with the Human Genome Project finished compiling the human genetic archive a decade ago. That archive contains about 3 billion genetic letters, far more than the genomes of less complex organisms such as roundworms and fruit flies. But the project revealed that people’s roughly 22,000 protein-specifying genes don’t greatly outnumber those found in simpler organisms.

The finding was one of the biggest surprises in biology. An average C. elegans roundworm contains just 959 cells; the human brain has an estimated 100 billion neurons — more than the total number of cells in 100 million roundworms. How could it be that the same set of building materials used to construct simple lowly worms could also produce vastly more complex humans? It turns out that, in the same way a bucket of LEGOs or an Erector Set can yield an array of toy structures depending on how the pieces are put together, the answer lies in assembly.
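The comparison can be checked with a few lines of arithmetic. This is only a sanity check of the figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the article (the neuron count is an estimate).
CELLS_PER_ROUNDWORM = 959
HUMAN_BRAIN_NEURONS = 100_000_000_000  # 100 billion
ROUNDWORMS = 100_000_000               # 100 million

# Total cells in 100 million roundworms.
total_worm_cells = CELLS_PER_ROUNDWORM * ROUNDWORMS

print(total_worm_cells)                        # 95900000000
print(HUMAN_BRAIN_NEURONS > total_worm_cells)  # True
```

One human brain's neurons (about 100 billion) edge out the roughly 95.9 billion cells in 100 million worms.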

One of the first clues to the importance of assembly came from the FANTOM project (for “functional annotation of the mammalian genome”), an ongoing international effort to identify every piece of RNA made in a mammalian cell. In 2005, the research revealed that even though genes that code for proteins make up only 1.5 percent of the mouse genome, more than 63 percent of the genome’s DNA is copied into RNA. In humans the number is even higher, with up to 93 percent of the genome made into RNA, even though protein-coding genes make up less than 2 percent of the genome.

At first, many scientists didn’t know what to make of the excess RNA. Some thought it was overexuberance on the part of the DNA-copying machinery. But gradually researchers began to realize that many of those extra RNAs had important jobs to do.

If RNAs can do important work without making proteins, the definition of a gene needs to be expanded, says Leonard Lipovich, a geneticist at Wayne State University in Detroit. “You realize there are not 20,000 genes; there are 40,000.”

Last year, Lipovich and colleagues reported in the journal RNA that they had found evidence of 6,736 long noncoding RNAs encoded in the human genome. Some estimates suggest that the human genome may encode more than 10,000.

Further work, reported by John Mattick of the University of Queensland in the June issue of FEBS Letters, suggests that the number of noncoding RNAs of all types is greater in humans than in other primates and greater in those primates than in mice. The same logic follows on down to puffer fish, fruit flies and beyond.

With so many long noncoding RNAs floating around in cells, the next question is what they do. “For the vast majority, we have absolutely no idea,” says Ahmad Khalil of Case Western Reserve University in Cleveland.

Some, though, appear to act like general contractors — not hammering in the nails and pouring the foundations of cells themselves, but dictating how the job should be done.

Long lineup

One of the most famous long noncoding RNAs, known as XIST, is also one of the most hands-on. XIST is in charge of shutting down one of the X chromosomes in every single cell of women and girls. Women and other female mammals have two copies of the X chromosome, while males have one X and one Y chromosome. Having a double dose of X chromosome genes could be harmful, even lethal, so women turn one off. XIST — a “lncRNA” because it is 19,000 letters long, noncoding but not intergenic — directs the decommissioning.

XIST doesn’t have a long commute to work; it coats whichever X chromosome makes it, preventing other genes on the chromosome from being activated. XIST isn’t made in males. In women, a partner long noncoding RNA called TSIX (XIST spelled backward) helps keep the other X chromosome in working order.

One of the best-studied lincRNAs, named HOTAIR, wasn’t lucky enough to get a job close to home. It is copied from DNA on chromosome 12 but has to travel to chromosome 2 to shut down several genes in a group known as the HOXD cluster, genes important for proper development of an organism, Rinn and colleagues reported in Cell in 2007. A study reported last year in Nature, from Stanford University genomicist Howard Chang and his colleagues, showed that HOTAIR works at 854 different job sites.

Not only does HOTAIR help direct development, but it is also important throughout life to help cells pinpoint their location in the body.

Breast cancer cells are full of HOTAIR, especially those that migrate throughout the body, Chang’s team reported in the same paper in Nature. Cells in about a third of breast tumors studied made more than 125 times as much HOTAIR as normal breast cells do. And when copies of HOTAIR pile up in cancer cells, the cells start to drift away from the tumor. Having too much of the lincRNA appears to reprogram a cell’s internal compass, says Chang, who is also a Howard Hughes Medical Institute scientist. “It’s kind of like a car driven by a faulty GPS device.”

Whether promoting health or misdirecting cells, lincRNAs don’t necessarily act alone. HOTAIR and some other lincRNAs direct crews of proteins known as histone modifiers. Histones are spoollike proteins that wrap up DNA so it can fit inside the cell nucleus. Where the histones sit along the DNA and how tightly the DNA is wrapped around them affects whether genes are turned on or off. HOTAIR works a bit like a surveyor putting chemical tags on histones associated with genes that need to be put under tight wraps. The lincRNA does this by bringing two different groups of proteins to selected genes. One of the protein groups plasters a closed sign on a histone protein, signaling that the gene is not open for business, while the other group rips down billboards advertising for the gene, Chang’s group reported last year in Science.

A lincRNA known as HOTTIP also works with a crew of histone modifiers, but instead of shuttering genes, HOTTIP’s crews hang grand-opening signs to attract gene-activating machinery. Working in concert with proteins, these and other lincRNAs can precisely control which genes are turned on and when — a precision necessary to coordinate the untold steps that go into cooking up a human.

Contractor communication

In the recipe for humans, lincRNAs are in the thick of things from the very beginning. At least 26 different lincRNAs need to be on to keep an embryonic stem cell a stem cell, Rinn and his colleagues reported in the Aug. 28 Nature. As stem cells transform into various types of cells, they turn off some specific lincRNAs and turn on others, creating a mix of activity that can define the cell.

“They describe a cell’s identity better than protein-coding genes do,” says Rinn, who is also affiliated with Beth Israel Deaconess Medical Center and the Broad Institute of MIT and Harvard. He and his colleagues have developed computer software that attempts to pinpoint a cell’s type by looking at which lincRNAs it makes.

In a molecular name-that-cell contest, lincRNAs beat out proteins. Each type of cell has such a special mix of lincRNAs that the computer program can correctly identify a cell from only two lincRNAs, while four or five proteins are required for a positive ID.

Just how lincRNAs choose which genes to turn on and off isn’t yet known. But Pier Paolo Pandolfi, a geneticist at Beth Israel Deaconess and Harvard Medical School, suspects that the lincRNAs are whispering to each other and to other RNAs, keeping tabs on all a cell’s goings-on. Pandolfi laid out his hypothesis for how this chatter might help control protein production and other processes in the Aug. 5 Cell.

Protein-coding messenger RNAs are often prevented from making their instructions into proteins by tiny snippets of RNA called microRNAs. The microRNAs glom on to certain messenger RNAs like paparazzi crowding a celebrity.

If lincRNAs and messenger RNAs share a string of chemical letters that interest the microRNAs, the lincRNAs can serve as a decoy, distracting microRNAs so messenger RNAs can get their jobs done. As long as two molecules both have the signatures that the microRNAs seek, the little RNAs will clamor around both messages with equal fervor.

Last year, Pandolfi’s group found that the RNA copied from a pseudogene (a defunct copy of a gene that no longer makes proteins) can attract microRNAs away from the messenger RNA copied from a real gene called PTEN, which is important in protecting against cancer (SN: 7/17/10, p. 14). That finding led to the suggestion that, by acting as decoys, RNAs could regulate protein production or other processes within a cell. Pandolfi calls this idea the ceRNA hypothesis (for “competing endogenous RNA”).
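The decoy idea can be illustrated with a deliberately simple toy model. It is a sketch, not Pandolfi’s actual analysis: it assumes that microRNAs spread evenly over every transcript carrying the shared binding site, and that each microRNA silences exactly one transcript copy. The function name and the copy numbers are hypothetical.

```python
def free_messenger_rnas(mrna_copies, decoy_copies, microrna_copies):
    """Toy estimate of how many messenger RNA copies escape microRNAs.

    Assumptions (illustrative only): microRNAs bind messenger RNAs and
    decoys with equal preference, and one microRNA blocks one transcript.
    """
    total_targets = mrna_copies + decoy_copies
    if total_targets == 0:
        return 0.0
    # microRNAs cannot block more transcripts than exist.
    bound_total = min(microrna_copies, total_targets)
    # Messenger RNAs absorb a share of microRNAs proportional to their abundance.
    bound_mrna = bound_total * mrna_copies / total_targets
    return mrna_copies - bound_mrna


# Hypothetical copy numbers: 100 messenger RNAs, 80 microRNAs.
print(free_messenger_rnas(100, 0, 80))    # 20.0 free without any decoy
print(free_messenger_rnas(100, 100, 80))  # 60.0 free with 100 decoy copies
```

In this toy setting, adding as many decoys as messenger RNAs triples the number of messages left free to make protein, which is the qualitative effect the hypothesis describes.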

A growing body of evidence suggests that Pandolfi’s hypothesis is more than just a clever idea. Researchers at Columbia University performed a computerized search of the human genome and found 7,000 genes whose messenger RNA copies could act as microRNA decoys in 248,000 interactions, a result reported in the Oct. 14 Cell.

The Columbia team and Pandolfi’s team independently found that tweaking levels of a few messenger RNAs that distract microRNAs from PTEN messenger RNA can lead to prostate cancer or a type of brain tumor called glioblastoma. Just messing with levels of a messenger RNA from another gene known as ZEB2 throws off PTEN protein levels and can lead to melanoma in mice, Pandolfi’s group reported in another paper in the Oct. 14 Cell.

Some lincRNAs also appear to contain microRNA magnets. Researchers from Italy report in the same issue of Cell that they have found an RNA called linc-MD1 that is important in muscle development. The lincRNA sponges two microRNAs away from the messenger RNA of two genes, allowing more muscle-building proteins to be made from those genes. Cells taken from people with Duchenne muscular dystrophy have lower levels of linc-MD1 than normal muscle cells do. Without linc-MD1 to draw them away, the microRNAs pile on to the messenger RNAs and prevent the muscle proteins from being made.

With so many interconnected parts in the system, researchers need to think carefully about the consequences of tweaking levels of any RNA within a cell, Pandolfi says.

If even one decoy isn’t made, hundreds of conversation partners could be affected, with the amount of protein each one produces shifting by 10 to 30 percent.

“People say, ‘that’s ridiculous. That’s nothing,’ ” Pandolfi says. But if 200 or more conversation partners are each perturbed by 10 percent, “that’s devastating.”

Losing one noncoding RNA may be disastrous for a cell, but for want of noncoding RNAs whole species may never have evolved, argues Queensland’s Mattick. He and others say the real function of lincRNAs is to give evolution a sort of molecular clay from which to mold new designs.

New proteins rarely appear, leaving evolution a limited set of building materials to work with, Mattick says. But new lincRNAs pop up all the time; some of them appear in only one species. Given the molecules’ jobs as directors and overseers, evolution may use them to make design variations on the fly.

Humans have several lincRNAs that are found in no other species. Many of those RNAs are made in the brain, leading scientists to speculate that the molecules may be at least partially responsible for that important organ’s evolution.

If Mattick is right, making a human from the same building materials used to create roundworms and fruit flies doesn’t pose such a puzzle. It does, however, mean finding the right contractors.


‘LincRNA’ lingo

Long: RNAs come in all different lengths, as measured by the number of chemical subunits in the strand. “Long” versions, sometimes called “large,” are at least 200 chemical units long, while RNAs of the “micro” variety typically have around 22 units.

Intergenic: “Intergenic” refers to RNAs that are copied from portions of the DNA that sit in regions between protein-coding genes. But the designation is becoming less meaningful as scientists discover that these regions yield RNAs with important tasks.

Noncoding: Messenger RNAs are decoded in a cell’s ribosome to make proteins; “noncoding” refers to RNAs that do not end up as codes for proteins.

RNA: RNA, for “ribonucleic acid,” is a chemical cousin to DNA, created from the DNA template in a process known as transcription. DNA is made up of the chemical subunits adenine (A), thymine (T), cytosine (C) and guanine (G). RNA substitutes uracil (U) for the T.
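Two of these definitions are easy to express in code. The sketch below is illustrative: the function names are made up, and `transcribe` assumes its input is the DNA coding strand, so the RNA copy has the same letters with U in place of T.

```python
LONG_RNA_MIN_LENGTH = 200  # "long" RNAs are at least 200 chemical units


def transcribe(dna_coding_strand: str) -> str:
    """Return the RNA version of a DNA coding-strand sequence (T becomes U)."""
    return dna_coding_strand.upper().replace("T", "U")


def is_long_rna(rna: str) -> bool:
    """Apply the length cutoff used in the definition of 'long' above."""
    return len(rna) >= LONG_RNA_MIN_LENGTH


print(transcribe("GATTACA"))   # GAUUACA
print(is_long_rna("A" * 200))  # True
print(is_long_rna("A" * 22))   # False: a typical microRNA length
```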

Tina Hesman Saey is the senior staff writer and reports on molecular biology. She has a Ph.D. in molecular genetics from Washington University in St. Louis and a master’s degree in science journalism from Boston University.