Peer Review: Can You Trust a Scientific Journal Paper?
Science magazine has egg on its face – deviled, poached, and scrambled – everything but sunny side up. Last May, it printed one of the biggest breakthrough stories of the year in stem cell research: Korean scientist Woo Suk Hwang, a professor at Seoul National University and president of the World Stem Cell Hub, a man called the “pride of Korea” and a “national treasure,” had cloned human stem cells (05/23/2005). The story began to unravel when rumors of fraud emerged in December; last week, Science admitted the paper was hopelessly flawed, and it all came crashing down this week. It’s all over the news now (see Fox News, News@Nature and, of course, Science): Hwang’s last hopeful claims about his human stem cell clones have been demolished – all of it was fakery, fraud and lies. Hwang has already stepped down from both positions and has been forever disgraced. Not only that: News@Nature called this a “huge setback for therapeutic cloning,” saying that “the field is now left with no evidence that it is possible in humans at all.”
How could such a huge fraud make its way past the rigors of peer review into one of the most prestigious scientific journals in the world? The editors of Science sheepishly tried to answer that question last week as they officially retracted the paper. While it is not uncommon for scientists to retract findings after further research, or to question the validity of others’ findings, there was a lot more than just ignorance or honest disputation in the Hwang case. In the magazine’s News of the Week section, Chong and Normile wrote like investigative reporters on a spy case, “How Young Korean Researchers Helped Unearth a Scandal.”1 Jennifer Couzin followed up with chapter two, “… And How the Problems Eluded Peer Reviewers and Editors.”2 She quoted another journal editor who explained, “Peer review doesn’t necessarily say that a paper is right. It says it’s worth publishing” (emphasis added in all quotes). Yet shouldn’t a journal’s standards exceed those of the New York Times? A big factor that seduced the reviewers was their eagerness to be first to publish a sensational, high-profile paper. Yet isn’t a scientific journal supposed to have better practices than a tabloid?
The buck stops at the top. Science chief editor Donald Kennedy had to accept responsibility on behalf of the journal, and explain what went wrong. In his “Editorial Expression of Concern,”3 Kennedy warned readers not to trust the Hwang results (this was before the full extent of the fraud had become known). In Couzin’s article, he defended the policy of aggressively seeking out high-profile “firsts” to publish, but said that changes would be put in place to try to prevent future fiascos. “Peer review cannot detect [fraud] if it is artfully done,” he cautioned, but the practice of requiring every author to describe their contribution to the paper, though “administratively complex,” might help, because “If the paper is wrong and has to be retracted, then everyone takes the fall.” Presumably peer pressure would augment peer review. He did not promise, however, that this new policy would be put into effect, and confessed it wouldn’t be foolproof anyway, because perpetrators could still be dishonest about their contributions.
Several uncomfortable findings about peer review came out of the investigation. All the peer reviewers gave enthusiastic reviews, even though a more careful eye would have seen that the only things cloned were not stem cells but faked photos, rotated and cropped to look real; the images Hwang submitted were never scrutinized carefully. In the rush to get the paper out, Science also violated another scientific principle: waiting to see the results replicated elsewhere. The journal rushed the submitted paper to its Board of Reviewing Editors (two people) and gave them only 48 hours to decide whether to send it out for review. In addition, Couzin found out that these first reviewers don’t even look at the data; they are looking for “a mixture of novelty, originality and trendiness.” Shouldn’t a scientific journal have better standards than Rolling Stone? The actual reviewers (three) were given only a week, and admitted they weren’t all that careful. “[Y]ou look at the data and do not assume it’s fraud,” one said.
Was this an isolated case, a rare slip-up in one journal? Consider a test of the peer review process. Couzin wrote,
Although the flaws in the Hwang paper were especially difficult for reviewers to catch, the peer-review system is far from foolproof, its supporters concede. In 1997, editors at the British Medical Journal (BMJ) described a study in which they inserted eight errors into a short paper and asked researchers to identify the mistakes. Of the 221 who responded, “the median number spotted was two,” says Richard Smith, who edited BMJ from 1991 until 2004. “Nobody spotted more than five,” and 16% didn’t find any.
Science did not commit to requiring authors in the future to detail their individual contributions to a research paper (Nature doesn’t require this, either). Nor were the editors specific about what they were going to do to clean house:
In the aftermath of the Hwang case, editors at Science will be having “a lot of conversations about how we can improve the evaluation of manuscripts,” says Kennedy. One thing unlikely to change is the aim of high-profile journals to publish, and publicize, firsts. “You want the exciting results, and sometimes the avant-garde exciting results don’t have the same amount of supporting data as something that’s been repeated over and over and over again,” says Katrina Kelner, Science’s deputy managing editor for life sciences. In weighing whether to publish papers such as these, “it’s always a judgment call,” she says.
But maybe that’s the issue: whose judgment, and by what standards? Couzin ended with a rejoinder that most scientists do not accept things as dogma until the results are replicated, despite the “hype” of exciting first pronouncements. “A culture that wanted to see things reproduced before making a big deal out of them would probably be a healthier culture.” From the tone of these articles, though, it doesn’t sound like reducing fat and doing more exercise are high on the New Year’s Resolution list.
1Sei Chong and Dennis Normile, “How Young Korean Researchers Helped Unearth a Scandal…”, Science, 6 January 2006: Vol. 311, No. 5757, pp. 22–25, DOI: 10.1126/science.311.5757.22.
2Jennifer Couzin, “… And How the Problems Eluded Peer Reviewers and Editors,” Science, 6 January 2006: Vol. 311, No. 5757, pp. 23–24, DOI: 10.1126/science.311.5757.23.
3Donald Kennedy, “Editorial Expression of Concern,” Science, 6 January 2006: Vol. 311, No. 5757, p. 36, DOI: 10.1126/science.311.5757.36b.
We don’t wish to be overly critical of peer review, nor draw exaggerated conclusions from this one case. Peer reviewers, editors and scientists are all only human, and are like most of us: trying to work on many things simultaneously under time pressure, subject to mood swings and emotions, loath to become bogged down in petty details, easily distracted, desirous of recognition and usually trusting of their peers. In general, there is safety in numbers. Independent eyes can catch errors in reasoning and constructively criticize unwarranted claims. Supporters of the peer review status quo can also claim that this fraud was eventually uncovered. See? Science is a self-correcting process (echoes of positivism). A few flaws get through, but at worst, peer review is like American government: awful, but better than any of the alternatives.
OK, granted, but look how the Darwin defenders use peer review as a selling point. For one thing, they trumpet all the thousands of papers on evolution, as if the more buckets of sand, the more solid the foundation. For another, they chide supporters of intelligent design for their shortage of peer-reviewed publications. Third, they rank the journals by prestige: Science and Nature win more points than the lower-profile journals (remember how Stephen Meyer’s peer-reviewed ID article was disparaged for being published in a “low-impact” journal? 09/08/2004). Clearly, many evolutionists treat peer review like a gold standard of scientific validation, an imprimatur of officiality and a badge of membership into elite scientific circles.
So now that we see how the sausage is made – the push to be first with sensational stories, the time deadlines, the lack of rigor, the assumptions about data being honest, and the ease with which mistakes get through – how much stress should be put on peer review? Are the more prestigious journals better at it? Is it the only standard?
Peer review can be a safeguard, but it is no guarantee. In almost every interview in Current Biology, a working scientist is asked about peer review. The answer usually expresses problems with it, and with the whole way scientific information is validated. They get angry that it can spill the beans of hard-won original work to rivals. They doubt that the best findings get the prominence they deserve, or that the papers that get published are really all that significant (remember the “publish or perish” syndrome?). Authors squeeze in their names as contributors when perhaps all they did was run the software and record the numbers in a lab book. Among all the verbiage and charts and references and impressive equations, how much is really significant? Can truly fundamental work go unnoticed because it never gets past the Board of Reviewing Editors, who are looking for “novelty, originality, and trendiness”? How about truth? A lot of Darwinian papers look novel and trendy, but like the tiresome papers on game theory or digital evolution, seem to have little or nothing to do with the real world. They amount to little more than trendy, original, novel twists on the art of just-so storytelling.
Recall also that peer review as a “touchstone of the modern scientific method” is a recent tradition (see Wikipedia). Though its roots date back to the Royal Society, its implementation only became routine after World War II. Throughout history, even up to the present day, many of the most important discoveries have been announced in books with NO peer review. Copernicus, Newton, Boyle, Maxwell, and many others proposed the most earth-shaking scientific theories in book form out of their own creative genius. Darwin himself wrote The Origin of Species single-handedly, with no PhD. Peer review these days often shuts out alternative viewpoints, like intelligent design, at the front door (12/21/2001). It is often rigged to perpetuate reigning paradigms and hinder original-thinking mavericks (02/06/2003). One can still publish original work without asking some distant editors, owing allegiance more to Darwin than to Diogenes, to cast judgment on it. What’s important in science is for an idea to be correct, not for it to pass muster with a few other fallible humans, or to garner prized real estate in a prestigious rag. ID proponents may not have many peer-reviewed papers in journals yet, but they have published a great deal of original, scholarly research in groundbreaking books, such as Dembski’s The Design Inference (Cambridge University Press, 1998). Read good books and do more than just peer. Review.