Statistical Significance: An Abused Standard
The pursuit of statistical significance incentivizes gaming the system for fame
Significance Over Truth
How Science’s Incentives Are Breaking Research
by Ronald Fritz, PhD
In 2019, headlines warned: “Egg consumption linked to higher risk of heart disease and death.”
By 2025, another “groundbreaking” study declared eggs don’t harm heart health.
Along the way, eggs have been praised as an anti-cancer superfood (2020), condemned as cancer-linked (2022), exonerated of diabetes risk (2018), and then accused of causing it again (2020).
These contradictions aren’t confined to breakfast. Across medicine, biology, psychology, and more, studies are producing results that don’t hold up. Are these simply examples of science correcting itself over time—or are they symptoms of a deeper dysfunction that’s eroding trust in research?
As a career statistician, I stand with the growing number of scientists who say it’s the latter. Here is a recent expression of concern:
The psychological burden of statistical significance: editorial reflections from 2015 to 2025 (PenSoft Blog, 3 Aug 2025).
“Significance”
Michal Ordak, in his July 31, 2025, article The Psychological Burden of Statistical Significance in Academic Publishing, writes:
“The pursuit of significance is no longer just a technical issue, but a psychological burden that shapes behavior, distorts judgment, and affects mental well-being.”
In research, “statistical significance” is the standard probability-based test of whether a finding is likely real or just chance. Formally, a p-value is the probability of seeing a result at least as extreme as the one observed if no real effect exists. When a study reports p = 0.05, it is saying that chance alone would produce such a result only 5% of the time; this is often read, loosely, as “we’re 95% confident this is real.”
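To make that definition concrete, here is a minimal Python sketch of a toy coin-flip experiment (my illustration, not any study from this article). Even when nothing real is going on, a small fraction of experiments clear the p < 0.05 bar by chance alone:

```python
import math
import random

def two_sided_p(heads, n, p=0.5):
    """Probability of a head count at least as far from n/2 as
    `heads`, assuming the coin is actually fair (the null hypothesis)."""
    mean = n * p
    dev = abs(heads - mean)
    return sum(math.comb(n, k) * p**n
               for k in range(n + 1)
               if abs(k - mean) >= dev)

random.seed(0)
n_experiments, n_flips = 10_000, 100

# Flip a genuinely fair coin 100 times, 10,000 times over, and count
# how often the result looks "statistically significant" anyway.
false_positives = sum(
    two_sided_p(sum(random.random() < 0.5 for _ in range(n_flips)),
                n_flips) < 0.05
    for _ in range(n_experiments)
)
# Roughly 5% by construction (slightly less here, because whole-number
# head counts make the test a bit conservative).
print(f"fair coins flagged 'significant': {false_positives / n_experiments:.1%}")
```

The point of the sketch: p < 0.05 does not mean a finding is real; it means results this extreme are rare, not impossible, under pure chance.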
Since the early 20th century, when Sir Ronald Fisher formalized it, statistical significance has been central to research credibility. Journals prefer it. Publishers know their readers trust it. Researchers know their careers depend on it.
Ordak points out that many scientists admit they worry non-significant results will hurt their chances of publication. This pressure leads to a dangerous cycle: researchers shape their methods—sometimes inappropriately—to produce significant results.
Some even defend flawed analysis in peer review simply because it “got significant results.” This isn’t just cutting corners. It’s gaming the system.
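The arithmetic behind that gaming is simple. If a researcher measures many outcomes and reports whichever one “works,” false positives multiply. A quick sketch (hypothetical numbers of my own, assuming independent tests at the conventional 5% level):

```python
import random

random.seed(1)
ALPHA_Z = 1.96              # two-sided 5% cutoff for a z statistic
n_studies, n_outcomes = 10_000, 20

# Each study measures 20 outcomes, none of which has any real effect.
# Count studies where at least one outcome still comes out "significant".
hacked = sum(
    any(abs(random.gauss(0, 1)) > ALPHA_Z for _ in range(n_outcomes))
    for _ in range(n_studies)
)
# Analytically: 1 - 0.95**20, about 64%.
print(f"null studies with >=1 'significant' result: {hacked / n_studies:.0%}")
```

With twenty outcomes and no real effects at all, nearly two thirds of studies can still report a “significant” finding, which is exactly why selectively reporting the one analysis that worked is gaming rather than science.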
When Science Can’t Check Itself
If science’s self-correction mechanism is working, failed replications should be rare. But the numbers tell a darker story:
- Cancer research: Amgen tried to replicate 53 high-profile cancer studies—only 6 held up. Bayer reported similar results in 2011, failing to replicate 75% of the cancer biology studies it tested. (Naik, 2011; Errington et al., 2021)
- Psychology: Of 307 prominent findings, only 64% replicated at all—and those that did showed effect sizes about a third smaller than first reported. (Nosek et al., 2021)
- Neurology & disease research: 100 potential ALS drugs that once showed promise failed entirely in repeat trials. In spinal cord injury research, only 2 of 12 replications validated the original findings—one weakly and one only under special circumstances. (Errington et al., 2021)
- Overall: Machine learning analysis of 40,000 psychology papers suggested only 40% might replicate. John Ioannidis famously concluded, “Most claimed research findings are false.” (Youyou et al., 2023; Ioannidis, 2005)
And here’s the kicker: A 2021 UC San Diego study (Serra-Garcia & Gneezy, 2021) found that papers that can’t be replicated accumulate, on average, 153 more citations than those that can—because they’re “interesting.” In other words, the more sensational your finding, the more attention you get—even if it’s wrong.
The Incentives Behind the Crisis
Why is this happening? The causes are many, but the most corrosive is incentive misalignment:
- Publish or perish: Careers depend on churning out significant findings.
- Sensationalism pays: Bold, headline-grabbing claims bring citations, grants, and fame.
- Cultural spillover: In some fields—like origins research—sensational speculation has been normalized (think “Nebraska Man” from a single tooth, or the coelacanth, once hailed as a transitional fossil, now caught alive and unchanged). That mindset may be spreading to other areas.
The Center for Open Science’s Reproducibility Project summed it up:
“There is substantial evidence of how the present research culture creates and maintains dysfunctional incentives and practices that can reduce research credibility in general.” (Errington et al., 2021)

Recognizing the Problem
The replication crisis is now on the radar of major institutions. The White House’s “Make America Healthy Again” commission put addressing it at the top of its 10-step child health plan. NIH Director Jay Bhattacharya stated simply:
“Replicable, reproducible, and generalizable research must serve as the basis for truth.” (Plackett, 2025)
But fixing the system will take time. In the meantime, readers must adopt a new role: cautious evaluator of scientific claims.
How to Read Science in the Age of the Replication Crisis
When you see a headline-grabbing claim:
- Check the Methods – Is there a clear “methods and materials” section?
- Sample Size – Is the number of observations or participants reported, and is it large enough? (Small studies are cheaper, but often less reliable.)
- Data Transparency – Is raw data shared, or at least averages with standard deviations?
- Statistical Rigor – Does the paper explain how it determined the distribution of results and chose its analysis methods?
- Study Power – Was the study designed beforehand to account for expected variability?
- P-Values – Are they reported, and are they below conventional thresholds?
- Censoring – Was any data dropped, and if so, is there a valid reason?
Studies that check more of these boxes are more likely to reflect reality.
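The “study power” item in particular can be checked in advance: given an expected effect size and variability, the chance of detecting a real effect follows directly. A rough sketch using the standard two-sample z-test approximation (illustrative numbers of my own, not drawn from any study cited here):

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_sample(effect, sigma, n_per_group, alpha_z=1.96):
    """Approximate power of a two-sample z-test to detect a true mean
    difference `effect` with common standard deviation `sigma` and
    `n_per_group` observations per group, at a two-sided 5% level."""
    z_effect = effect / (sigma * math.sqrt(2 / n_per_group))
    return normal_cdf(z_effect - alpha_z)

# Hoping to detect a half-standard-deviation difference between groups:
for n in (10, 30, 64, 100):
    print(n, round(power_two_sample(0.5, 1.0, n), 2))
```

With only 10 subjects per group, such a study detects the effect about a fifth of the time; the textbook rule of thumb of roughly 64 per group is what it takes to reach 80% power. A study that never ran this kind of calculation was likely underpowered from the start.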
The Stakes
The replication crisis isn’t just an academic squabble—it threatens science’s very foundation. As statistician Larry Hedges warns, it raises questions “not just about research practices and methods, but the very reliability of scientific results.” If left unaddressed, it could become “an existential crisis for science.” (Northwestern University, Institute for Policy Research, 2024)
The path forward is clear: return to science’s first duty—the pursuit of truth, uncorrupted by bias, shortcuts, and institutional pressure.
Max Planck put it best:
“Science cannot solve the ultimate mystery of nature. And that is because, in the last analysis, we ourselves are a part of the mystery that we are trying to solve.”
References
PenSoft Blog. (2025, August 3). The psychological burden of statistical significance: editorial reflections from 2015 to 2025. https://blog.pensoft.net/2025/07/31/the-psychological-burden-of-statistical-significance-in-academic-publishing/
Errington, T.M., Mathur, M., Soderberg, C.K., et al. (2021). Investigating the replicability of preclinical cancer biology. eLife, 10, e71601. https://doi.org/10.7554/eLife.71601
Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLOS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
Naik, G. (2011). Scientists’ Elusive Goal: Reproducing Study Results. Wall St. Journal. 12/2/11. https://www.wsj.com/articles/SB10001424052970203764804577059841672541590
Northwestern University, Institute for Policy Research. (2024, February 28). ‘An existential crisis’ for science: Institute for Policy Research. https://www.ipr.northwestern.edu/news/2024/an-existential-crisis-for-science.html
Nosek, B.A., Hardwicke, T.E., Moshontz, H., et al. (2021). Replicability, Robustness, and Reproducibility in Psychological Science. Annual Review of Psychology, 73. https://doi.org/10.1146/annurev-psych-020821-114157
Ordak, M. (2025). The psychological burden of statistical significance: editorial reflections from 2015 to 2025. European Science Editing 51: e164741. https://doi.org/10.3897/ese.2025.e164741
Plackett, B. (2025, July 14). Amid White House claims of a research ‘replication crisis,’ scientists offer solutions. Chemical & Engineering News. Retrieved from https://cen.acs.org/research-integrity/reproducibility/Amid-White-House-claims-research/103/web/2025/06.
Serra-Garcia, M., & Gneezy, U. (2021). Nonreplicable publications are cited more than replicable ones. Science Advances, 7(21), eabd1705. https://www.science.org/doi/10.1126/sciadv.abd1705
Youyou, W., Yang, Y., & Uzzi, B. (2023). A discipline-wide investigation of the replicability of Psychology papers over the past two decades. Proceedings of the National Academy of Sciences, 120(8), e2208863120.

Ronald D. Fritz, PhD, is a retired research statistician whose career spanned 27 years. Before entering the field of statistics, he worked as an engineer and engineering manager in the defense industry. He earned his doctorate in Industrial Engineering, with a minor in Mathematical Statistics, from Clemson University, where he was honored as a Dean’s Scholar.
Dr. Fritz served as a consulting statistician across a broad range of industries, culminating in a 12-year role as a global statistical resource at PepsiCo. During his time at PepsiCo, he led significant research on gluten contamination in oats and its relationship to celiac disease, publishing several articles on the subject.
In retirement, Dr. Fritz developed a deep interest in creation science, sparked by a visit to the Creation Museum in Petersburg, Kentucky. As he delved into the topic, he shared his findings with his pastor, which led to an invitation to speak at their church. This initial presentation opened the door to further speaking engagements at churches throughout the region.
Dr. Fritz has been married for 35 years to his wife, Mitzie. They live in the mountain community of Bee Log, North Carolina, within sight of the church where they were married and now worship. In his free time, Dr. Fritz tends a small chestnut orchard on their property, working to revive what was once a cherished local delicacy. The couple has two adult children.
