Reproducibility Debunked as a Myth
Human choice about what data to focus on biases results, says major study
The “reproducibility crisis” just got worser and worser (pardon the Wonderland lingo). A major test of reproducibility was hugely disappointing to the proponents of scientism—the belief that the scientific method is humanity’s best path to reliable knowledge.
Reproducibility, we are taught in school, is a hallmark of science. It’s one of the key traits that distinguishes knowledge from ignorance. For limited branches of science, reproducibility has a fairly good track record. In ballistics, for instance, one can predict the landing spot of a projectile to a high degree of accuracy. A second team, using identical equipment, samples and starting conditions, will likely get the same results. Many other experiments in physics and chemistry are usually reliable, with accuracies known to many significant figures. Even in these “hard science” cases, though, experimentalists know to expect a certain “acceptable” degree of error, and anomalies are sometimes explained away. A failure in an experiment might be explained as the difference between the real and the ideal. The predicted answer was based on equations derived by assuming idealized but unrealistic conditions, such as treating all the mass of a pendulum as concentrated at a point.
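As a minimal sketch of that "hard science" reliability (the numbers and the function here are illustrative, not drawn from any cited experiment): the idealized range equation gives any two teams working from identical starting conditions exactly the same prediction — while the point-mass, no-air-resistance assumptions mark the "real vs. ideal" gap the paragraph describes.

```python
import math

def landing_distance(v0, angle_deg, g=9.81):
    """Ideal projectile range on flat ground.

    Idealized assumptions: point mass, no air resistance --
    the 'difference between the real and the ideal' noted above.
    """
    theta = math.radians(angle_deg)
    return v0**2 * math.sin(2 * theta) / g

# Two independent "teams" with identical equipment and starting conditions:
team_a = landing_distance(50.0, 30.0)
team_b = landing_distance(50.0, 30.0)
assert team_a == team_b  # both predict about 220.7 m
```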
Biology is a much squishier science. Exceptions multiply in biological and ecological studies. The reproducibility crisis (also called the replication crisis) became well known in the squishier social sciences: psychology, psychiatry, anthropology and sociology. Still, if science is to be granted the presumptive authority our culture bestows on “settled science”, some degree of reproducibility in these fields should be demonstrable to the public. In this new study, announced by Nature on October 12, the results were devastating. Biologists could not even confirm Murphy’s take on reproducibility: “Experiments should be reproducible; they should always fail the same way.” They didn’t even fail in the same way!
Reproducibility trial: 246 biologists get different results from same data sets (Nature News, 12 Oct 2023). Anil Oza points out that a “wide distribution of findings shows how analytical choices drive conclusions.” This means that experiments cannot be run mindlessly by a robot or machine. Human beings get in the way.
In a massive exercise to examine reproducibility, more than 200 biologists analysed the same sets of ecological data — and got widely divergent results. The first sweeping study of its kind in ecology demonstrates how much results in the field can vary, not because of differences in the environment, but because of scientists’ analytical choices.
“There can be a tendency to treat individual papers’ findings as definitive,” says Hannah Fraser, an ecology meta researcher at the University of Melbourne in Australia and a co-author of the study. But the results show that “we really can’t be relying on any individual result or any individual study to tell us the whole story”.
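The quoted point — that analysts' choices, not the data, drive the spread — can be sketched with a toy simulation (everything below is invented for illustration: the data, the pipelines, and the numbers). Each "analyst" applies a defensible but different pipeline to the identical data set and reports a different effect estimate.

```python
import random
import statistics

random.seed(42)
# Toy "nestling growth" data: growth depends weakly on brood size, plus noise.
broods = [random.randint(2, 10) for _ in range(200)]
growth = [20 - 0.3 * b + random.gauss(0, 3) for b in broods]

def effect_estimate(min_brood, trim_outliers):
    """One analyst's pipeline: subset the data, optionally trim outliers,
    then fit a least-squares slope (growth change per extra sibling)."""
    pairs = [(b, g) for b, g in zip(broods, growth) if b >= min_brood]
    if trim_outliers:
        cuts = statistics.quantiles([g for _, g in pairs], n=20)
        lo, hi = cuts[0], cuts[-1]  # keep the middle 90%
        pairs = [(b, g) for b, g in pairs if lo <= g <= hi]
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Four defensible pipelines, four different answers from the same data.
for min_brood in (2, 4):
    for trim in (False, True):
        print(min_brood, trim, round(effect_estimate(min_brood, trim), 3))
```

None of these pipelines is "wrong" in the study's sense — each is a choice a trained analyst might defend — yet the reported effect differs from analyst to analyst.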
Like a cancer, this crisis is spreading from psychology to biology.
“This paper may help to consolidate what is a relatively small, reform-minded community in ecology and evolutionary biology into a much bigger movement, in the same way as the reproducibility project that we did in psychology,” he says. It would be hard “for many in this field to not recognize the profound implications of this result for their work”.
The results you get, in other words, may be a function of the particular researcher you ask. How does that differ from non-scientific fields, then? In biology, is a “fact” to be defined as a conclusion of the one who shouts the loudest, or has the most respected reputation or political power? How do you know?
There was every reason to expect commonality in the results, because the data sets were the same and the questions were simple, such as “To what extent is the growth of nestling blue tits (Cyanistes caeruleus) influenced by competition with siblings?” or “How does grass cover influence Eucalyptus spp. seedling recruitment?” The best spin they could put on these embarrassing results was that “Despite the wide range of results, none of the answers are wrong… Rather, the spread reflects factors such as participants’ training and how they set sample sizes.” How does that help? Which results are trustworthy as “science” then? To call up another Wonderland reference: “Everybody has won, and all must have prizes,” said the Dodo.
Peer Review Fail, Too
Another result of the study undermined the belief that peer review is a safeguard against error. The peer reviewers didn’t even agree among themselves on which analyses were sound: one reviewer’s poor rating conflicted with another reviewer’s acceptable rating of the same analysis.
The authors also simulated the peer-review process by getting another group of scientists to review the participants’ results. The peer reviewers gave poor ratings to the most extreme results in the Eucalyptus analysis but not in the blue tit one. Even after the authors excluded the analyses rated poorly by peer reviewers, the collective results still showed vast variation, says Elliot Gould, an ecological modeller at the University of Melbourne and a co-author of the study.
If this “reform movement” to improve reproducibility in biology expands, Big Science will have even more reason to despair over the public’s loss of trust in science. That would certainly be one of the “profound implications” of the study.
Oh, but evolution is a fact. Evolution is science. Creation is religion. (Those talking points from the Darwin Party have survived almost a century of refutation, because the Darwinists have political clout.)
Did you catch the association between “ecology and evolutionary biology”? Ask why those two are often mentioned together, and whether they suffer from many of the same foibles.
The Blunderful Wizard of Flaws stumbles on, but beware: there is Malice in Blunderland. When you mix two fairy tales, you get an even bigger fairy tale.