Genetics: Alternate Reading Frames May Be Common
Imagine a book written in a language where there were no spaces, and every word was three letters long. Now imagine that you could get one story by starting at the first letter, and a different story by starting at the second letter, and another by starting at the third letter. That’s the situation with some genes in the genetic code. DNA can code for one protein in the first reading frame, but a different protein in an alternate reading frame. Since the DNA language has three nucleotide “letters” per codon “word,” and since the opposite strand has three more reading frames, there are potentially six reading frames per gene. How commonly are alternate reading frames used by an organism?
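To make the six-frame idea concrete, here is a minimal Python sketch. It splits a short sequence into codon "words" from each of the three starting letters, then does the same for the opposite strand. The 14-letter sequence and the function names are made up for illustration; this is not a real gene and not a method from the paper.

```python
# Illustrative only: a made-up 14-nucleotide sequence, not a real gene.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def codons(dna, offset):
    """Split dna into complete three-letter codons, starting at offset 0, 1, or 2."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]


def six_frames(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to their codon lists."""
    frames = {}
    for label, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = codons(strand, offset)
    return frames


example = "ATGGCCATTGTAAT"
for name, words in six_frames(example).items():
    print(name, " ".join(words))
```

Each frame yields a different set of three-letter words from the same string of letters, which is why the same stretch of DNA could in principle spell out more than one message.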
A paper in PLoS Computational Biology hints that alternate reading frames (ARFs) may be widespread in mammalian genomes. ARFs were thought to be rare in eukaryotes. An international team, using new statistical techniques, found 40 cases in the human genome, but says that this may be a significant underestimate, since their analysis was very conservative. The authors' summary asks and answers why these alternate reading frames were not found before:
A textbook human gene encodes a protein using a single reading frame. Alternative splicing brings some variation to that picture, but the notion of a single reading frame remains. Although this is true for most of our genes, there are exceptions. Like viral counterparts, some eukaryotic genes produce structurally unrelated proteins from overlapping reading frames. The examples are spectacular (G-protein alpha subunit [Gnas1] or INK4a tumor suppressor), but scarce. The scarcity is anthropogenic in origin: we simply do not believe that dual-coding genes can occur in eukaryotes. To challenge this assumption, we performed the first genome-wide scan for mammalian genes containing alternative reading frames located out of frame relative to the annotated protein-coding region. Using a newly developed statistical framework, we identified 40 such genes. Because our approach is very conservative, this number is likely a significant underestimate, and future studies will identify more alternative reading frame-containing genes with fascinating biology.
They said there was an almost zero probability these ARFs were due to chance: in fact, one section of the paper is subtitled, “Dual Coding Is Virtually Impossible by Chance.” Finding so many ARFs was surprising, they said, because maintaining ARFs by natural selection is “costly” – i.e., mutations in one reading frame could disable the information in the alternate frame.
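The "cost" can be illustrated with a toy example. The Python sketch below translates a made-up sequence in two overlapping frames using the standard genetic code, then changes a single letter. The sequence, the chosen mutation, and the helper names are assumptions for illustration only; they are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, offset):
    """Translate complete codons of dna, starting at offset 0, 1, or 2."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


def mutate(dna, position, new_base):
    """Return a copy of dna with one letter replaced (0-based position)."""
    return dna[:position] + new_base + dna[position + 1:]


original = "ATGGCCATTGTAAT"          # hypothetical sequence, not a real gene
mutant = mutate(original, 5, "A")    # hypothetical single-letter change

for label, seq in (("original", original), ("mutant", mutant)):
    print(label, "frame 1:", translate(seq, 0), "| frame 2:", translate(seq, 1))
```

In this toy case the change is "silent" in the first frame (GCC and GCA both code for alanine) but swaps proline for glutamine in the second frame. A letter that is free to vary in a single-frame gene is no longer free when two frames share it.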
Often, the proteins that result from alternate reading frames are related to the same function or process in the cell. The researchers compared the well-known ARFs among humans, mice and some other mammals and found them to be highly conserved (i.e., unevolved).
1. Chung, Wadhawan, Szklarczyk, Pond, and Nekrutenko, “A First Look at ARFome: Dual-Coding Genes in Mammalian Genomes,” Public Library of Science: Computational Biology, May 18, 2007.
Try writing a message that could be read three different ways depending on which letter was the starting point. It is extremely difficult. If this turns out to be a common mechanism in genetics, it reveals an astonishing level of intelligent design. How, and why, would a blind process do such a thing? Notice how geneticists were not even looking for this amount of complexity because they did not believe it was possible.
This technique of “data compression” could expand the functional information of the genome significantly. ARF! The hunt is on. Sic the design community on this fascinating puzzle. They won’t be tied up and muzzled from announcing the return of the Master to biology.