RNA Research Uncovers a Previously Ignored Universe of Genetic Information
A slow revolution is occurring in the study of genetic information. Until recently, the only interesting items in DNA sequences were the genes – the genetic codes for proteins. Since these usually represented only a small fraction of an organism’s genome, it was assumed the rest of the material was “junk DNA” – sequences that were either mutated leftovers of real genes (pseudogenes), spacers (introns), nonsense strands, or regions that merely provided structural support for the more important genes.
Indications that something was wrong with this picture have arisen over the last few years. For one, geneticists were surprised to count only about 30,000 protein-coding genes in the human genome; more recent counts have dropped the number to 25,000. How could such a complex organism as a human being arise from such a small library of genetic information? Another clue was the mismatch between messenger RNAs and proteins. Messenger RNA (mRNA) is the transcript of the DNA template that carries the genetic information outside the nucleus of eukaryotic cells to a ribosome, where it is translated into the amino acid language of proteins. Scientists found that many mRNAs never got that far. Were they simply disassembled and recycled? A third clue was the discovery of vast quantities of small RNAs in the cell (10/26/2001). Some were found to apparently regulate the expression of genes; what did the others do? Additionally, the mystery of introns (09/03/2003), viewed as useless nonsense strands of DNA cut out of genes by spliceosomes (09/17/2004), deepened when some were shown to be remarkably conserved (05/27/2004) between primitive and advanced organisms, suggesting they had a function. Is it possible scientists have vastly underestimated the amount of information in the cell, like walking into a forest and assuming the only living things there are the trees? Perhaps a kind of “gene chauvinism” has masked the reality of a much higher order of complexity.
The cover story of the Sept. 2 issue of Science, “Mapping RNA Form and Function,” explored this question. From the 18 articles about RNA and its functional role in the cell, here are a few glimpses of the emerging picture that is putting to rest the old notion that biological information consists only of genes and proteins.
- Parallel universe: Guy Riddihough, in the introductory article,1 ventured into the “forest of RNA dark matter” and found a wonderland:
For a long time, RNA has lived in the shadow of its more famous chemical cousin DNA and of the proteins that supposedly took over RNA’s functions in the transition from the ‘RNA world’ [07/11/2002, 08/23/2005] to the modern one. The shadow cast has been so deep that a whole universe (or so it seems) of RNA—predominantly of the noncoding variety—has remained hidden from view, until recently….
The discovery that much of the mammalian genome is transcribed, in some places without gaps (so-called transcriptional “forests”), shines a bright light on this embarrassing plenitude: an order of magnitude more transcripts than genes…. Many of these noncoding RNAs … are conserved across species, yet their functions (if any) are largely unknown…. (Emphasis added in all quotes.)
As if that were not enough, he noted that “even the coding and base-pairing capacity of RNA can be altered–by RNA editing, in which bases in the RNA are changed on the fly.” It appears there is much more life in the forest than just the trees.
- Hidden infrastructure: Matthew W. Vaughn and Rob Martienssen2 discussed the probability that vast numbers of small RNAs (sRNA) may be essential for regulation of genes. Some of these micro-RNAs (miRNAs) and small interfering RNAs (siRNAs) have already been identified in gene regulation, but many more remain to be studied. In one plant, 1.5 million sRNAs composed of 75,000 unique sequences were recently found, suggesting that “many more genes may be under the control of sRNAs than had been previously imagined.” These noncoding RNAs, usually 20-something bases long, keep a bag of tricks up their sleeves:
They can direct cleavage of other transcripts and can also promote second-strand synthesis by RNA-dependent RNA polymerase (RdRP), resulting in dsRNAs [double-stranded RNA]. In addition, siRNAs are implicated in recruiting heterochromatic modifications that result in transcriptional silencing.
The authors mentioned several ways in which these sRNAs had escaped detection due to the methods used.
- Pseudo – Not: Vaughn and Martienssen also noted the relationship of sRNAs to pseudogenes. Once thought to be mutated relics of true genes because they often contain premature stop codons, pseudogenes might be sources for siRNAs that regulate the true genes they resemble: “they could act transitively on transcripts from paralogous protein-coding genes by promoting cleavage or interfering with translation,” they continued. “More than half of the pseudogene sRNAs matched sequences elsewhere in the genome, indicating that this may be the case and suggesting a mechanism for coordinated trans-acting regulation of closely related members of gene families.”
- What Are They There For? Now that we know large numbers of small RNAs exist, what do they do? John S. Mattick3 suggested that they are not “transcriptional noise,” but rather “constitute a critical hidden layer of gene regulation in complex organisms, the understanding of which requires new approaches in functional genomics.” This will be a big task, he warned. A study of one such small RNA found it acting as a scaffold for the assembly of protein complexes and for coordinating nuclear traffic, helping localize gene products to their correct subcellular compartments. This one case reveals “a new dimension of organizational control in cell biology and development,” and “illustrates the magnitude of the task that is in front of us, which may be an equal or greater challenge than that we already face in working out the biochemical function and biological role of all of the known and predicted proteins and their isoforms.” Since cataloging the human proteome is the next daunting task after deciphering the genome, this statement should put geneticists on notice.
- New Glasses Needed: One assumption guiding previous research was that if a sequence was “evolutionarily conserved” (i.e., largely unchanged from primitive to advanced organisms), this indicated it was probably functional. Mattick cast doubt on that assumption: “Notably, evolutionary conservation may not be a reliable signature of functional ncRNAs” [non-coding RNAs]. The conserved ones may act on many substrates, he noted, but non-conserved ones may have few substrates and so be freer to vary. Many ncRNAs, Mattick thinks, may be “evolving quickly” and escaping detection by methods that look for sequence conservation.
Here is another indication that “junk DNA” actually represents information we haven’t yet decoded:
It is also clear that the majority of the genomes of animals is indeed transcribed, which suggests that these genomes are either replete with largely useless transcription or that these noncoding RNA sequences are fulfilling a wide range of unexpected functions in eukaryotic biology. These sequences include introns (Fig. 1), which account for at least 30% of the human genome but have been largely overlooked because they have been assumed to be simply degraded after splicing. However, it has been shown that many miRNAs and all known small nucleolar RNAs in animals are sourced from introns (of both protein-coding and noncoding transcripts), and it is simply not known what proportion of the transcribed introns are subsequently processed into smaller functional RNAs. It is possible, and logically plausible, that these sequences are also a major source of regulatory RNAs in complex organisms.
That higher animals should run on “complex genetic programming” should “come as no surprise,” he concluded. It means, though, that “we may have seriously misunderstood the nature of genetic programming in the higher organisms by assuming that most genetic information is expressed as and transacted by proteins.” Truly we have embarked on a long road.
- Mt. Improbable Looms Higher: Jean-Michel Claverie4 echoed Mattick’s estimation of the task, saying it is “only recently that the sheer scale of the phenomenon” of functional non-coding RNA has been realized. He pointed to research on the mouse genome showing that half its “transcriptome” (the corpus of RNA transcribed from DNA) consists of non-coding RNA (ncRNA). He found a eureka moment: “These results provide a solution to the discrepancy between the number of (protein-coding) genes and the number of transcripts,” he wrote. Missing them has been an artifact of our methods. “Noncoding transcripts originating from intergenic regions, introns, or antisense strands have probably been right before our eyes for 8 years without having been discovered!”
- Prokaryotes Say Me, Too: Claverie doubted that the discovery of functional ncRNA is limited to eukaryotes: “The notion that transcription is limited to protein-coding genes is also being challenged in microbial systems.” He pointed to E. coli, which contains many transcripts from intergenic sequences and antisense strands (i.e., transcribed from the opposite strand of DNA). His ending paragraph should humble Watson and Crick, who thought they had it figured out 50 years ago:
The intergenic, intronic, and antisense transcribed sequences that were once deemed artifactual are now a testimony to our collective refusal to depart from an oversimplified gene model. But what if transcription is even more complex? Could it, for instance, lead to mRNAs generated from two different chromosomes (Fig. 1)? A year ago, we would have immediately suspected such sequences as further artifacts arising from large-scale cDNA [complementary DNA, a DNA copy synthesized from an mRNA template] sequencing programs. But now? Perhaps it’s time to go back to the cDNA sequence databases and reevaluate the numerous unexpected objects they contain. Transcription will never be simple again, but how complex will it get?
- The Life and Times of mRNA: Melissa Moore5 provided a more whimsical view of the actors in the genetic play. Dismissing the simplistic “short obituary” of mRNAs as simply “central conduits in the flow of information from DNA to protein,” she wrote, “this dry and simplistic description captures nothing of the intricacies, intrigues, and vicissitudes defining the life history of even the most mundane mRNA. In addition, of course, some mRNAs lead lives that, if not quite meriting an unauthorized biography, certainly have enough twists and turns to warrant a more detailed nucleic acid interest story.” She offered a précis for her novel, giving us a glimpse into the frenzy of activity in the life of mRNA:
We will follow the lives of eukaryotic mRNAs from the point at which they are birthed from the nucleus until they are done in by gangs of exonucleases lying in wait in dark recesses of the cytoplasm. Along the way, mRNAs may be shuttled to and from or anchored at specific subcellular locations, be temporarily withheld from the translation apparatus, have their 3′ ends trimmed and extended, fraternize with like-minded mRNAs encoding proteins of related function, and be scrutinized by the quality-control police.
It turns out the mRNA is not just a carrier of information, but a “posttranscriptional operon” with many roles in the cell. For instance, some RNAs bind with proteins to form messenger ribonucleoprotein particles (mRNPs): “Individual mRNP components can be thought of as adaptors that allow mRNAs to interface with the numerous intracellular machineries mediating their subcellular localization, translation, and decay, as well as the various signal transduction systems.” For a sampler, Moore listed a “cheat sheet” of 11 such mRNPs and their functions. Her article gave some up-close-and-personal vignettes of some of the players, personifying their birth, baptism (entry into the “transcriptionally active pool”), examination, recruitment, retirement, dispatch and burial.
Space does not permit delving into the other 13 articles that describe such things as RNA’s role in the ribosome, how RNA is recycled, and other interesting topics.6 These samples should suffice to show that the information content of the cell has probably been vastly oversimplified before now. Remarkably, some researchers are looking at this new universe of RNA regulation and seeing an evolutionary path leading back into the fog of prehistory. Since the leading origin-of-life theory is the so-called “RNA World” scenario, some are speculating about whether today’s small RNAs are relics of a lost world in which early RNAs shared the roles of genetic storage and catalysis. Readers are referred to earlier entries on RNA and the origin of life (07/11/2002, 08/23/2005) for further study.
Addendum: Genes themselves, too, may contain much more information than previously realized. Several articles recently hinted at how genetic information could vastly outstrip the mere gene count. One mechanism of compressing information on DNA is alternative splicing: the spliceosome, after removing the introns, apparently can join the exons in different combinations to yield multiple products in some cases, something like the way kids take Lego blocks and make a variety of machines out of them. Another possibility for information storage is the overlooked opposite DNA strand, or “antisense” strand. Even though it represents a “photographic negative” of the normal strand, the cell can apparently transcribe it as well and, in some cases, generate additional, different products from it. These and other mechanisms, such as frame-shifted transcription, the histone code, or the possible generation of mRNAs from two different chromosomes, suggest that the information coded in genes is just the tip of a very large info-berg.
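To make two of these mechanisms concrete, here is a toy Python sketch. It is only an illustration under stated assumptions, not anything taken from the Science articles: the function names and the short sequences are made up, splicing is modeled as simple exon skipping (the first and last exons are always kept and exon order is preserved), and the antisense strand is modeled as the plain reverse complement of the sense strand.

```python
from itertools import combinations

# Watson-Crick pairing for DNA bases: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite ("antisense") strand, read in its own 5'->3' direction."""
    return dna.translate(COMPLEMENT)[::-1]


def splice_variants(exons: list[str]) -> list[str]:
    """List every transcript obtainable by skipping internal exons.

    Toy assumption: the first and last exons are always retained and
    the remaining exons keep their original order. Needs >= 2 exons.
    """
    first, *middle, last = exons
    variants = []
    for k in range(len(middle) + 1):
        # Choose which k internal exons to keep; order is preserved.
        for kept in combinations(middle, k):
            variants.append(first + "".join(kept) + last)
    return variants


if __name__ == "__main__":
    # Hypothetical four-exon gene: two internal exons give 2**2 = 4 variants.
    for transcript in splice_variants(["ATG", "GGC", "TTA", "TAA"]):
        print(transcript)
    print(reverse_complement("ATGC"))
```

Under this simplified model, a gene with n internal exons yields up to 2^n distinct transcripts, which gives a rough feel for how a modest gene count could still encode a much larger repertoire of products. Real splicing is regulated and far from exhaustive, so the count is an upper bound for the toy model only.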
1Guy Riddihough, “In the Forests of RNA Dark Matter,” Science, Vol 309, Issue 5740, 1507, 2 September 2005, [DOI: 10.1126/science.309.5740.1507].
2Vaughn and Martienssen, “It’s a Small RNA World, After All,” Science, Vol 309, Issue 5740, 1525-1526, 2 September 2005, [DOI: 10.1126/science.1117805].
3John S. Mattick, “The Functional Genomics of Noncoding RNA,” Science, Vol 309, Issue 5740, 1527-1528, 2 September 2005, [DOI: 10.1126/science.1117806].
4Jean-Michel Claverie, “Fewer Genes, More Noncoding RNA,” Science, Vol 309, Issue 5740, 1529-1530, 2 September 2005, [DOI: 10.1126/science.1116800].
5Melissa J. Moore, “From Birth to Death: The Complex Lives of Eukaryotic mRNAs,” Science, Vol 309, Issue 5740, 1514-1518, 2 September 2005, [DOI: 10.1126/science.1111443].
6For popular reports on these subjects, see EurekAlert #1, EurekAlert #2, EurekAlert #3 (the “software of life”), and a press release from U of Delaware.
Which theory – intelligent design or Darwinism – would have predicted this complexity? Is there any hint of an evolutionary sequence leading up to this highly-coordinated, quality-controlled, information-rich system? (Recall from the 08/23/2005 entry that RNA does not form readily in water, and is highly unstable; its presence in the cell is only made possible by stringent programmed operations with quality control.) The gap between a mythical “RNA World” and the living world of real functioning RNA in the cell could never have been wider. As the cloud cover lifts, the summit of Mt. Improbable stretches higher into the sky.
Darwinism had enough trouble explaining the 4-letter (G,C,A,T) triplet-codon genetic code. Simple Watson-Crick base pairing, the old one-gene one-enzyme principle, and the so-called “Central Dogma” of genetics were taught as The Big Picture till we knew better. Now that junk DNA is out (07/15/2005), the whole cellular information flowchart appears as complex as that of a well-run city, where each employee has a role. Each information-rich molecule is born, lives an active life and is retired, as Moore personified it. It’s time for the Darwin Party to let go of the steering wheel and let the Intelligent Design community drive science out of the naturalistic rut it’s in. Knowing how to read the signs of intelligent causation, they can help get science back onto the freeway of enriched understanding (see 06/25/2005 entry and commentary).