Claim of New Antifreeze Gene by Natural Selection Melts Under Analysis
A biochemist examines imaginative claims about natural selection creating a new gene.
Did a New Gene Arise by Darwinian Selection?
by Dr Ross Anderson*
Dr. Anderson critiques a paper by Zhuang et al., “Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids” (PNAS, 13 Feb 2019), which claims a new antifreeze gene in Arctic cod arose by ‘fortuitous’ chance events and natural selection.
“This paper is a concrete dissection of the process of a de novo gene birth that has conferred a vital adaptive function directly linked to natural selection.”
Summary: The authors report the putative assembly (evolution) of a gene encoding an antifreeze glycoprotein (AFGP) in cod fish. Comparing DNA sequences of AFGP-producing gadids with sequences from non-AFGP-producing gadids, the authors believe they have identified a sequence in the non-AFGP gadids that does not appear to code for a protein but, with a little imagination, can be made into a gene that encodes an AFGP.
Starting with Something, Not Nothing
So, let’s see: here we have a fully formed, functional AFGP gene. As such, it has all the components necessary for expression and regulation. We want to understand just how this gene could have arisen from a non-coding sequence; i.e., how it arose de novo. The authors recognize that most genes have supposedly arisen from pre-existing genes, and that the origin of new genes from non-coding sequences is rare but theoretically possible. Reported here is a possible example of such a de novo gene.
Examination of extant AFGPs reveals that they are typically composed of a short sequence of amino acids repeated many times; e.g., the tripeptide, Thr-Ala-Ala (threonine, alanine, alanine), repeated n times. Some of the Thr residues have been glycosylated; i.e., they have sugars added to a side-chain.
Let’s use our creative imagination and see if we can construct a gene that encodes such a protein. First, we find some DNA sequences in gadids that are somewhat similar to the AFGP gene but do not encode a protein; they are non-coding sequences. We compare these sequences with the target AFGP gene sequence and see what is needed to make the non-coding sequences match the target. We notice that the coding region must encode a polyprotein composed of the repeated amino acid sequence Thr-Ala-Ala. In the non-coding sequence we find GCA, which is an Ala codon, repeated 9 times to give a 27-nucleotide (nt) sequence. In the target AFGP gene, the Thr-Ala-Ala repeat is bounded by two 27-nt repeats. So, let’s propose that the 27-nt sequence (GCA repeated 9 times) in the “ancestral” sequence underwent a chance duplication event to generate two 27-nt repeats. Then propose a second chance duplication event to generate four such repeats.
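The arithmetic of this proposed step can be sketched in a few lines of Python. This is only an idealized illustration: it assumes a perfect run of nine GCA codons (the paper's caption calls the region "GCA-rich"), and the helper name `tandem_duplicate` is invented here, not taken from the paper.

```python
# Idealized 27-nt repeat: the Ala codon GCA repeated 9 times (assumption:
# a perfect repeat, whereas the real sequence is described as "GCA-rich").
repeat_27nt = "GCA" * 9


def tandem_duplicate(seq: str) -> str:
    """Return the sequence followed by a copy of itself (one tandem duplication)."""
    return seq + seq


two_copies = tandem_duplicate(repeat_27nt)    # first proposed duplication
four_copies = tandem_duplicate(two_copies)    # second proposed duplication

copy_count = len(four_copies) // len(repeat_27nt)
ala_codons = len(four_copies) // 3

print(len(repeat_27nt), len(two_copies), len(four_copies))  # lengths in nt
print(copy_count, ala_codons)
```

Two duplication events turn one 27-nt repeat into four copies, 108 nt in all, which at this stage would still read as nothing but Ala codons.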
Blindly Aiming for the Target
Examination of the target AFGP amino acid sequence shows we need a Thr codon in the midst of the 27-nt repeats. Suppose we substitute an ‘A’ for the ‘G’ in one of the GCA codons. The 9-nt sequence beginning at that codon would then read ACAGCAGCA which, when translated, would generate the required Thr-Ala-Ala sequence. Now that we have the appropriate coding sequence, we need it to be repeated, because the target amino acid sequence, Thr-Ala-Ala, is repeated many times. So, let’s propose that this new 9-nt sequence is also duplicated many times.
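The substitution just described is easy to check against the standard genetic code. The short Python sketch below uses a two-entry codon lookup (GCA = Ala, ACA = Thr, both from the standard code); the variable and function names are illustrative only.

```python
# Only the two codons needed here, taken from the standard genetic code.
CODONS = {"GCA": "Ala", "ACA": "Thr"}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, starting at position 0."""
    return "-".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


before = "GCA" * 3             # 9 nt taken from within the GCA repeat
after = "A" + before[1:]       # substitute A for the first G

print(before, translate(before))
print(after, translate(after))

# The proposed microsatellite-like amplification: the 9-nt unit repeated
# (four copies chosen arbitrarily for illustration).
amplified = translate(after * 4)
print(amplified)
```

A single G-to-A change converts Ala-Ala-Ala into Thr-Ala-Ala, and repeating the 9-nt unit repeats the tripeptide.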
We also notice that in the functional AFGP gene there are scattered arginine and glutamine codons. Presumably, when translated, these provide signals for a protease to hydrolyze the polypeptide into many shorter peptides, which are the antifreeze proteins. Generating either of these codons in a sea of GCA codons would require, at a minimum, two strategically placed substitutions within a single GCA codon. The authors provide no explanation for how these substitutions nicely provide codons for only these two amino acids, when codons for several other amino acids could be generated more easily with just one substitution. The magic is all done by chance and natural selection in six fortuitous steps. Here is the summary diagram of their model from the paper:
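The "two substitutions at a minimum" claim can be verified by brute force over the standard genetic code. The Python sketch below builds the standard codon table, counts the fewest nucleotide changes needed to turn GCA into any Arg (R) or Gln (Q) codon, and lists the amino acids reachable with a single change. The function names are made up for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def min_substitutions(start: str, amino: str) -> int:
    """Fewest point substitutions turning `start` into any codon for `amino`."""
    return min(hamming(start, c) for c, aa in CODON_TABLE.items() if aa == amino)


def one_step_amino_acids(start: str) -> list:
    """Amino acids (other than the starting one) reachable by one substitution."""
    reachable = {aa for c, aa in CODON_TABLE.items() if hamming(start, c) == 1}
    return sorted(reachable - {CODON_TABLE[start]})


arg_steps = min_substitutions("GCA", "R")
gln_steps = min_substitutions("GCA", "Q")
neighbors = one_step_amino_acids("GCA")

print("GCA -> Arg:", arg_steps, "substitutions")
print("GCA -> Gln:", gln_steps, "substitutions")
print("Reachable in one substitution:", neighbors)
```

Under the standard code, both Arg and Gln need two changes from GCA (for example CGA or AGA for Arg, CAA for Gln), while a single change can yield Glu, Gly, Pro, Ser, Thr, or Val.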
Evolutionary mechanism of the gadid AFGP gene from noncoding DNA. The color codes of the sequence components follow Fig. 1. (A) The ancestral noncoding DNA contained latent signal peptide-coding exons with a 5′ Kozak motif, adjacent to a duplication-prone 27-nt GCA-rich sequence. (B) The 27-nt GCA(Ala)-rich sequence duplicated forming four tandem copies. (C) A 9-nt in the midst of the four 27-nt duplicates became the three codons for one AFGP Thr-Ala-Ala unit and underwent microsatellite-like duplication forming a proto-ORF. (D) A proximal upstream regulatory region acquired through a putative translocation event. (E) A 1-nt frameshift led to a contiguous SP, a propeptide, and a Thr-Ala-Ala-like cds in a read-through ORF [open reading frame]. (F) Intragenic (Thr-Ala-Ala)n cds amplification, fulfilling the antifreeze function under natural selection.
Signalling the Blind
Now we have essentially constructed the coding region, but we’re not finished yet. All AFGPs are secreted proteins and as such require a signal peptide attached to the protein. The DNA sequence that encodes this signal has to be added to our coding sequence. It just so happens that the “ancestral” sequence has such a sequence in just the right place with respect to the coding region. There is only one problem: a single nucleotide keeps the signal sequence out of the proper reading frame with the coding region. Let’s propose that this one nucleotide was deleted, causing a reading frameshift that directly linked the signal sequence with the coding sequence. Interestingly, the authors indicate that the DNA region encoding the signal sequence somehow develops an intron-exon structure, but no explanation is given.
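The effect of a one-nucleotide deletion on the reading frame can be shown with a toy example. The sequences in the Python sketch below are hypothetical and invented purely for illustration; they are not the gadid sequences from the paper. A short stand-in for the signal-peptide region is followed by one extra nucleotide and then the Thr-Ala-Ala unit.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; ignore any trailing 1-2 nt."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy sequences (NOT from the paper).
signal_like = "ATGAAA"      # stand-in for the signal-peptide coding region
extra_nt = "C"              # the single nucleotide that disrupts the frame
coding_unit = "ACAGCAGCA"   # Thr-Ala-Ala unit

with_extra = signal_like + extra_nt + coding_unit
after_deletion = signal_like + coding_unit   # the 1-nt deletion

peptide_before = translate(with_extra)
peptide_after = translate(after_deletion)

print(peptide_before)  # downstream codons are read out of frame
print(peptide_after)   # signal-like segment now reads through into Thr-Ala-Ala
```

In this toy case the extra nucleotide makes the downstream region read as His-Ser-Ser, and deleting it restores Thr-Ala-Ala in frame with the upstream segment.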
Promoting Chance to Commander of the Fortuitous
We’re still not finished. We need a promoter region added to our new gene. Without a proper promoter, a gene will not be transcribed into an mRNA, and thus no protein product will be made. The authors propose that, “fortuitously,” a translocation event took place and the promoter from another gene was added in the exact location where it was needed. This proposed translocation also provided a 5′-untranslated region (UTR) in perfect alignment with the signal sequence, so as to preserve the reading frame generated earlier.
All functional AFGP genes also have a 3′-UTR. One species of gadid lacks a 3′-UTR and is a putative pseudogene, suggesting that sequences in the 3′-UTR are necessary for expression. In the paper the authors add a 3′-UTR, but do not explain where it came from as it is not found in the putative ancestral sequence.
Imaginative Storytelling Masquerading as Science
The paper is an interesting read in that it reveals the very creative imagination of some scientists. The approach can be likened to examining a functioning car, noting what is needed, and then going to a junkyard and an auto parts store to look for all the pieces that could be cobbled together to make another functional car.
Sequence comparisons can provide valuable information. However, when trying to construct an evolutionary story of how a gene might have arisen de novo, one must bear in mind that all of it is hypothetical, as there is no real way to test the conclusions of such hypothesizing experimentally. In short, it amounts to just storytelling.
The authors might have been able to come up with a more realistic explanation involving a loss of function rather than trying to explain a gain of function, but that would not have been as interesting or as much fun, and it certainly would not have been published.
Note: for those interested in this subject, Dr. Cornelius Hunter, another biochemist, has critiqued the PNAS paper at Evolution News & Science Today. A related, but different claim about gene evolution from 2011 was analyzed by Shaun Doyle in the Journal of Creation (CMI).
*About the author
Dr. Anderson’s expertise is in the area of biochemistry and molecular biology. He has taught Biochemistry and helped to direct research projects of graduate and medical students at Baylor College of Medicine, Houston, TX. Dr. Anderson was a post-doctoral researcher in the Molecular Genetics Division of the Department of Ophthalmology at the Houston Neurosensory Center.
Dr. Anderson was a member of both the undergraduate and graduate faculty at Lamar University, Beaumont, TX. There he taught and directed the research activities of undergraduates and Master of Science degree candidates in Biology. Currently he is professor of biochemistry at The Master’s University in southern California.
Dr. Anderson’s research interests include structure-function studies of DNA polymerizing enzymes and the synthesis and expression of synthetic human genes in bacterial hosts. He has authored or co-authored several publications in major, peer-reviewed journals. He is a member of the American Chemical Society and Sigma Xi Research Society.
It’s an honor to have Dr Ross Anderson contribute this article to Creation-Evolution Headlines. Now, we end with a pictorial summary of Darwinian evolution by J. Beverly Greene.