Darwin’s Linux: Did Evolution Produce a Computer?
How is a cell like a computer? Some Yale scientists asked that question, and embarked on a project to compare the genome of a lowly bacterium to a computer’s operating system.[1] Their work was published in PNAS.[2] As with most analogies, some things were found to be similar, and some different – but in the end, these two entities might be more similar overall in important respects.
The interdisciplinary team, composed of members of the Computer Science department and the Molecular Biophysics and Biochemistry department, calls itself the Program in Computational Biology and Bioinformatics. Recognizing that “The genome has often been called the operating system (OS) for a living organism,” they decided to explore the analogy. For subjects, they took the E. coli bacterium, one of the best-studied prokaryotic cells, and Linux, a popular Unix-like operating system. The abstract reveals the basic findings, but there’s more under the hood:
To apply our firsthand knowledge of the architecture of software systems to understand cellular design principles, we present a comparison between the transcriptional regulatory network of a well-studied bacterium (Escherichia coli) and the call graph of a canonical OS (Linux) in terms of topology and evolution. We show that both networks have a fundamentally hierarchical layout, but there is a key difference: The transcriptional regulatory network possesses a few global regulators at the top and many targets at the bottom; conversely, the call graph has many regulators controlling a small set of generic functions. This top-heavy organization leads to highly overlapping functional modules in the call graph, in contrast to the relatively independent modules in the regulatory network. We further develop a way to measure evolutionary rates comparably between the two networks and explain this difference in terms of network evolution. The process of biological evolution via random mutation and subsequent selection tightly constrains the evolution of regulatory network hubs. The call graph, however, exhibits rapid evolution of its highly connected generic components, made possible by designers’ continual fine-tuning. These findings stem from the design principles of the two systems: robustness for biological systems and cost effectiveness (reuse) for software systems.
We see they have already concocted a curious mixture of designer language and evolution language. The design language continues in the heart of the paper. Design principles, optimization, constraints, frameworks, interconnections, information processing – these engineering phrases are ubiquitous. Consider this paragraph that starts with “master control plan.” They applied it not to Linux but to the cell, which is found to have many similarities to the master control plan of the computer operating system:
The master control plan of a cell is its transcriptional regulatory network. The transcriptional regulatory network coordinates gene expression in response to environmental and intracellular signals, resulting in the execution of cellular processes such as cell divisions and metabolism. Understanding how cellular control processes are orchestrated by transcription factors (TFs) is a fundamental objective of systems biology, and therefore a great deal of effort has been focused on understanding the structure and evolution of transcriptional regulatory networks. Analogous to the transcriptional regulatory network in a cell, a computer OS consists of thousands of functions organized into a so-called call graph, which is a directed network whose nodes are functions with directed edges leading from a function to each other function it calls. Whereas the genome-wide transcriptional regulatory network and the call graph are static representations of all possible regulatory relationships and calls, both transcription regulation and function activation are dynamic. Different sets of transcription factors and target genes forming so-called functional modules are activated at different times and in response to different environmental conditions. In the same way, complex OSs are organized into modules consisting of functions that are executed for various tasks.
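To make the call graph idea concrete, here is a minimal sketch in Python of a directed network whose nodes are functions and whose edges run from caller to callee. The function names and edges are hypothetical toy data, not taken from the Linux kernel or from the paper. The in-degree of a node (how many functions call it) is a rough stand-in for reuse.

```python
from collections import defaultdict

# Hypothetical toy call graph (NOT real Linux data): caller -> functions it calls.
CALL_GRAPH = {
    "read_request": ["read_file", "log_message"],
    "write_request": ["write_file", "log_message"],
    "read_file": ["copy_bytes", "log_message"],
    "write_file": ["copy_bytes", "log_message"],
    "copy_bytes": [],
    "log_message": [],
}


def degrees(graph):
    """Return (out_degree, in_degree) dicts for a directed graph."""
    # Out-degree: how many functions this node calls.
    out_deg = {node: len(callees) for node, callees in graph.items()}
    # In-degree: how many functions call this node.
    in_deg = defaultdict(int)
    for callees in graph.values():
        for callee in callees:
            in_deg[callee] += 1
    # Nodes that nobody calls get an explicit in-degree of 0.
    return out_deg, {node: in_deg[node] for node in graph}


out_deg, in_deg = degrees(CALL_GRAPH)
most_reused = max(in_deg, key=in_deg.get)
print(most_reused, in_deg[most_reused])
```

In this toy graph the generic logging routine is called by every other non-leaf function, which is the kind of heavily reused hub the authors describe in the Linux call graph. The same data structure could represent a transcriptional regulatory network, with edges running from a transcription factor to each gene it regulates.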
And yet, on the other hand, the team felt that both the cell and Linux vary under processes of evolution:
Like biological systems, software systems such as a computer operating system (OS) are adaptive systems undergoing evolution. Whereas the evolution of biological systems is subject to natural selection, the evolution of software systems is under the constraints of hardware architecture and customer requirements. Since the pioneering work of Lehman, the evolutionary pressure on software has been studied among engineers. Interestingly enough, biological and software systems both execute information processing tasks. Whereas biological information processing is mediated by complex interactions between genes, proteins, and various small molecules, software systems exhibit a comparable level of complexity in the interconnections between functions. Understanding the structure and evolution of their underlying networks sheds light on the design principles of both natural and man-made information processing systems.
These paragraphs provide a flavor of the basic assumptions of the paper: that cells and OSs are analogous in their design principles and in their evolution. So what did they find? Their most eye-catching chart shows that Linux is top-heavy with master regulators and middle management functions, whereas a cell’s transcription network is bottom-heavy with workhorse proteins and few top management functions. The illustration has been reproduced in an article on PhysOrg with the interesting headline, “Scientists Explain Why Computers Crash But We Don’t.”
A table in the Discussion section of the paper summarizes the main similarities and differences they found. Here are some noteworthy examples:
- Cells are constrained by the environment; Linux by the hardware and customer needs.
- Cells evolve by natural selection; Linux evolves by designers’ fine-tuning.
- Cells have a pyramid-shaped hierarchy; Linux is top-heavy.
- Cells don’t reuse genes much, but Linux reuses function calls often.
- Cells don’t allow much overlap between modules, but Linux does.
- Cells have many specialized workhorses; Linux concentrates on generic functions.
- Cell evolutionary rates are mostly conservative; in Linux, they are conservative to adaptive.
- Cell design principles are bottom up; in Linux, they are top down.
- Cells are optimized for robustness; Linux is optimized for cost effectiveness.
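The contrast between a pyramid-shaped hierarchy and a top-heavy one, and between independent and overlapping modules, can be sketched with two toy networks. This is an illustration under stated assumptions, not the method used in the paper: both networks are hypothetical, a node's level is defined here simply as the length of its longest path down to a leaf, and a "module" is taken to be everything reachable from a given node.

```python
# Hypothetical pyramid: one regulator, two mid-level nodes, four targets.
PYRAMID = {
    "R": ["M1", "M2"],
    "M1": ["T1", "T2"],
    "M2": ["T3", "T4"],
    "T1": [], "T2": [], "T3": [], "T4": [],
}

# Hypothetical top-heavy graph: four callers share two helpers and one generic routine.
TOP_HEAVY = {
    "A1": ["B1", "B2"], "A2": ["B1", "B2"],
    "A3": ["B1", "B2"], "A4": ["B1", "B2"],
    "B1": ["G"], "B2": ["G"],
    "G": [],
}


def level_counts(graph):
    """Count nodes per level; leaves are level 0. Assumes the graph is acyclic."""
    memo = {}

    def level(node):
        if node not in memo:
            children = graph[node]
            memo[node] = 1 + max(map(level, children)) if children else 0
        return memo[node]

    counts = {}
    for node in graph:
        counts[level(node)] = counts.get(level(node), 0) + 1
    return dict(sorted(counts.items()))


def module(graph, root):
    """Return the set of nodes reachable from root (root itself excluded)."""
    seen, stack = set(), [root]
    while stack:
        for child in graph[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


print(level_counts(PYRAMID))    # many nodes at the bottom, few at the top
print(level_counts(TOP_HEAVY))  # few nodes at the bottom, many at the top
print(module(PYRAMID, "M1") & module(PYRAMID, "M2"))      # shared nodes between modules
print(module(TOP_HEAVY, "A1") & module(TOP_HEAVY, "A2"))  # shared nodes between modules
```

In the pyramid the two mid-level modules share nothing, so a change to one target touches only one module. In the top-heavy graph every caller depends on the same generic routine, so a change to it ripples through all of them. Real call graphs can contain recursion, which this sketch does not handle.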
The differences seem to be winning. Cells and operating systems have different constraints; therefore, they have different design principles and optimization. But not so fast; the team only studied a very lowly bacterium. What would happen if they expanded their study upward into the complex world of eukaryotes? Here’s how the paper ended:
Reuse is extremely common in designing man-made systems. For biological systems, to what extent they reuse their repertoires and by what means sustain robustness at the same time are questions of much interest. It was recently proposed that the repertoire of enzymes could be viewed as the toolbox of an organism. As the genome of an organism grows larger, it can reuse its tools more often and thus require fewer and fewer new tools for novel metabolic tasks. In other words, the number of enzymes grows slower than the number of transcription factors when the size of the genome increases. Previous studies have made the related finding that as one moves towards more complex organisms, the transcriptional regulatory network has an increasingly top-heavy structure with a relatively narrow base. Thus, it may be that further analysis will demonstrate the increasing resemblance of more complex eukaryotic regulatory networks to the structure of the Linux call graph.
1. An operating system is the foundational software on a computer that runs applications. A useful analogy is the management company for a convention center. It doesn’t run conventions itself, but it knows the hardware (exhibit halls, restrooms, lights, water, power, catering) and has the personnel to operate the facilities so that a visiting company (application) can run their convention at the center.
2. Yan, Fang, Bhardwaj, Alexander, and Gerstein, “Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks,” Proceedings of the National Academy of Sciences, published online before print May 3, 2010, doi: 10.1073/pnas.0914771107.
This is a really interesting paper, because it illustrates the intellectual schizophrenia of the modern Darwinist in the information age. It might be analogous to a post-Stalin-era communist ideologue trying to recast Marxist-Leninist theory for the late 1980s, when the failures of collectivism had long been painfully apparent to everyone except the party faithful. With a half-hearted smile, he says, “So we see, that capitalism does appear to work in certain environments under different constraints; in fact, it may well turn out to be the final stage of the proletarian revolution.” Well, for crying out loud, then, why not save a step, and skip over the gulags to the promised land of freedom!
You notice that the old Darwin Party natural-selection ideology was everywhere assumed, not demonstrated. The analogy of natural selection to “customer requirements and designers’ fine-tuning” is strained, to put it charitably; to put it realistically, it is hilarious. The authors nowhere demonstrated that robustness is a less worthy design goal than cost-effectiveness. For a cell cast into a dynamic world, needing to survive, what design goal could be more important than robustness? Linux lives at predictable temperatures in nice, comfortable office spaces. Its designers have to design for paying customers. As a result, “the operating system is more vulnerable to breakdowns because even simple updates to a generic routine can be very disruptive,” PhysOrg admitted. Bacteria have to live out in nature. A cost-effective E. coli is a dead E. coli. The designer did a pretty good job to make those critters survive all kinds of catastrophes on this planet. The PhysOrg article simply swept this difference into the evolutionary storytelling motor mouth, mumbling of the bacterial design that “over billions of years of evolution, such an organization has proven robust.” That would be like our communist spin doctor alleging that the success of capitalism proves the truth of Marxist doctrine.
A simple bacterial genome shows incredibly successful design for robustness when compared to a computer operating system, albeit at the cost of low reuse of modules. But then the authors admitted the possibility that eukaryotes might well have achieved both robustness and modular reusability. That would make the comparison to artificial operating systems too close to call. If we know that Linux did not evolve by mutations and natural selection, then it is a pretty good bet that giraffes and bats and whales and humans did not, either. That should be enough to get Phillip Johnson’s stirring speech, “Mr. Darwin, Tear down this wall!” to stimulate a groundswell of discontent with the outmoded regime. May it lead to a sudden and surprising demise of its icons, and a new birth of academic freedom.