Peer Reviewed Science Can Mislead in a Major Way
Sociology is under scrutiny, but the issues apply to all of science.
Is there a message in nothing? Yes, Jeffrey Mervis said in Science Magazine. When a scientist gets a null or empty result, that’s still a result. It should be announced, so that other scientists know what doesn’t work, not just what works. Publication of null results is valuable. It saves time by avoiding needless repetition. It also presents a more accurate picture of the world. As PhysOrg’s headline by Bob Yirka reads, “lack of published null result papers skews reliability of those that are published.” That’s a serious charge. It means that published papers suffer credibility loss when null results are not shared.
The question of what to do with null results has long plagued medical research, where people’s lives could be on the line. A team at Stanford has now found similar publication bias in the social and behavioral sciences.
There’s a natural human bias to publish flashy results. Who wants to announce that nothing happened? File the results and move on to something interesting. Mervis explains:
Researchers have put numbers on the “file drawer” phenomenon, in which scientists abandon results that they believe journals are unlikely to publish.
In a study published online this week in Science, a team at Stanford University in Palo Alto, California, traced the publication outcomes of 221 survey-based experiments funded by the National Science Foundation. Nearly two-thirds of the social science experiments that produced null results, those that did not support a hypothesis, were simply filed away. In contrast, researchers wrote up 96% of the studies with statistically strong results.
Such practices can skew the literature and lead to wasteful duplication, the authors argue. Their remedy: Deposit all data and study designs into public registries….
Were it so simple. Critics argue that registering null results could create other biases. Null results are hard to interpret, one said. It would be burdensome, another complained. Journals are not interested in null results, another opined.
According to a widely used convention, a result is considered statistically significant when its P-value (the probability of seeing an effect at least as large as the one observed if chance alone were at work) is less than 5%.
Not unexpectedly, the statistical strength of the findings made a huge difference in whether they were ever published. Overall, 42% of the experiments produced statistically significant results. Of those, 62% were ultimately published, compared with 21% of the null results. However, the Stanford team was surprised that researchers didn’t even write up 65% of the experiments that yielded a null finding.
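To see what those percentages do to the published record, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above and makes one loud assumption: every experiment that was not statistically significant is treated as a null result with the 21% publication rate. The function name and that two-way split are ours, for illustration only, and are not part of the Stanford analysis.

```python
# Rough arithmetic on the quoted figures. ASSUMPTION: every experiment that
# was not statistically significant is lumped together as "null" and given
# the 21% publication rate; this is a simplification made for the sketch.

def published_share_significant(p_sig, pub_sig, pub_null):
    """Fraction of published papers that report significant results."""
    sig_published = p_sig * pub_sig            # significant and published
    null_published = (1.0 - p_sig) * pub_null  # null and published
    return sig_published / (sig_published + null_published)

share = published_share_significant(p_sig=0.42, pub_sig=0.62, pub_null=0.21)
print(f"Significant share of experiments run:  {0.42:.0%}")
print(f"Significant share of published papers: {share:.0%}")
```

Under that assumption, significant results make up 42% of the experiments run but about 68% of what reaches print, so a reader of the journals sees a rosier picture than the researchers did.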
When null results are not published, the significance of published papers can appear inflated. The situation also encourages fraud by allowing unscrupulous researchers who got null results the first time to adjust their sample sizes until they get a lower P-value.
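The inflation can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a reconstruction of anyone's study: the true effect size (0.2), the sample size (25), the number of studies, and the use of a simple z-test with a known standard deviation of 1 are all made up for the example. The two publication rates are borrowed from the figures quoted above.

```python
import random
import statistics

def simulate(n_studies=2000, n=25, true_effect=0.2,
             pub_sig=0.62, pub_null=0.21, seed=0):
    """Return (mean estimate over all studies, mean over published ones).

    All parameters are hypothetical; only the two publication rates echo
    the percentages quoted in the article.
    """
    rng = random.Random(seed)
    all_estimates, published = [], []
    for _ in range(n_studies):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        estimate = statistics.fmean(sample)
        z = estimate * n ** 0.5            # standard error is 1/sqrt(n)
        significant = abs(z) > 1.96        # two-sided test at the 5% level
        p_publish = pub_sig if significant else pub_null
        all_estimates.append(estimate)
        if rng.random() < p_publish:       # the rest go in the file drawer
            published.append(estimate)
    return statistics.fmean(all_estimates), statistics.fmean(published)

all_mean, published_mean = simulate()
print(f"Mean effect, every study run:   {all_mean:.3f}")
print(f"Mean effect, published studies: {published_mean:.3f}")
```

In this setup the average of the published estimates comes out larger than the average over every study run, even though no individual researcher did anything dishonest. The selective filter alone does the work.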
In his writeup in Nature, Mark Peplow noted that scientists are already aware of the problem. What the Stanford team has done is quantify its extent.
“When I present this work, people say, ‘These findings are obvious; all you’ve done is quantify what we knew anecdotally’,” says Malhotra [one of the study’s authors]. But social scientists often underestimate the magnitude of the bias, or blame journal editors and peer reviewers for rejecting null studies, he says.
Yirka hints that the credibility of science is on the line. If respected journals only publish strong result papers, “an impression is created that only research that provides strong results is important, which of course is nonsense.” Perish the thought that the time-honored tradition of peer-reviewed journal publication is promoting nonsense!
The Stanford team advised that, in addition to a registry, scientists should participate in a “preanalysis plan” to announce up front what they expect to find. Critics of that idea think it shows an inherent distrust of scientists’ ability to design appropriate experiments. Their point is….?
A lesson from this revisited issue is that science cannot operate without integrity—a moral quality. Scientists are only human. You can’t eliminate the human element in science with a mechanical method. Here, we see that a process designed to reduce bias—peer review in qualified journals—might just encourage it.
The next question is what other sciences are prone to this null-publication problem besides medical science and social science. Maybe the question should be posed, “what science is not prone to publication bias?”
This is something to point out to the worshipers of scientism, who denigrate any activity that is not peer reviewed or published in a journal.
Next, we would like to require publication of all the null results in origin of life experiments and SETI projects. “Well, we mixed Martian meteorite dust with sea water in a simulated hydrothermal vent, and nothing happened.” “We tuned into a million frequencies from a hundred Kepler candidate earthlike planets, and heard nothing.” “We heated, dropped, centrifuged, and irradiated a can of Campbell’s primordial soup, and nothing crawled out.” Wouldn’t that put the Miller experiment into perspective?
It wouldn’t be surprising, we think, if they blasted the soup can with acid rock and something crawled out screaming, “Quit that awful racket!”