How to Teach AI to Reason Scientifically
AI inherits the naturalistic biases
that are deeply embedded in our culture,
but can learn to judge evidence on its merits
AI Thinks Differently When
Evidence Replaces Consensus
by Ron Fritz, PhD
When it comes to our origins, there are two basic possibilities: either reality is the product of intentional design, or it arose through unguided natural processes. Every person, consciously or not, adopts one of these views, and the choice is foundational. It shapes how we interpret the world, whether we see purpose in life, and how we understand its meaning. Given its significance, this question deserves careful and sustained attention.
Yet most of us do not approach it as a blank slate. Our culture and educational institutions have largely made the choice for us. Historically, belief in a creator was the default assumption. By the early twentieth century, however, that default shifted toward naturalistic explanations—a change often attributed to scientific discovery and intellectual progress. Today, naturalistic accounts of origins are deeply embedded in our institutions and culture and are commonly accepted without serious scrutiny.

AI Inherits Cultural Bias
As a result, many of us inherit a naturalistic bias early on, and it tends to persist unless we are willing to invest considerable time engaging with dense and technical scientific debates. In practice, we are taught what to believe before we are taught how to evaluate the competing claims, and those beliefs often remain unchallenged unless new learning actively replaces them.
Artificial intelligence provides a useful parallel. Like humans, AI systems are trained within a particular intellectual framework and reflect that framework in their conclusions. Unless new assumptions are introduced or existing ones explicitly questioned, their reasoning—and their answers—remain largely unchanged.
Ask a mainstream AI system about our origins, and the response mirrors the prevailing cultural consensus:
“The universe began in a hot expansion; life emerged on at least one planet; evolution shaped biology; you are the latest expression of this process.” — ChatGPT
This narrative is often presented as settled science. Yet AI systems, like humans, possess reasoning capabilities that can be engaged to assess whether such conclusions are actually warranted by the evidence. Rather than simply restating consensus views, AI can be asked to examine the underlying data and evaluate how well it supports competing hypotheses.
A Test of AI Teachability
This was explored by asking a set of mainstream AI engines to identify the strongest evidence supporting each origin hypothesis. To enable Bayesian aggregation—a formal statistical method for combining independent lines of evidence—only uncorrelated evidence was used, since overlapping evidence would effectively amount to double counting. Four broad, ‘conditionally independent’ categories then emerged for each view:
Naturalistic Origins — Top Evidence Categories
- Existence of abiotic building blocks
- Biogeographical distribution of life
- Fossil succession and increasing complexity in the geological record
- Genetic code similarity across living organisms
Designed Origins — Top Evidence Categories
- Complexity of biological systems
- Existence of consciousness and morality
- Irreducible complexity in living systems
- Fine-tuning of the universe
Although additional categories could be proposed and the evidential strength of each varies, this analysis treated all categories equally. It is intended as a starting point rather than a final verdict—a preliminary Bayesian evaluation subject to future refinement. The AI systems were then asked to estimate the plausible likelihoods (probabilities) of observing each category of evidence under each hypothesis.
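The aggregation described above can be sketched in a few lines of code. The likelihood values below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the estimates the AI systems actually produced, and the function name is my own.

```python
# Minimal sketch of Bayesian aggregation over conditionally
# independent evidence categories. Each pair holds P(E_i | H1)
# and P(E_i | H2); under conditional independence their ratios
# (Bayes factors) multiply into the posterior odds.

def posterior_probability(likelihood_pairs, prior_h1=0.5):
    """Return P(H1 | all evidence) from per-category likelihoods."""
    odds = prior_h1 / (1.0 - prior_h1)  # prior odds for H1
    for p_e_given_h1, p_e_given_h2 in likelihood_pairs:
        odds *= p_e_given_h1 / p_e_given_h2  # one Bayes factor per category
    return odds / (1.0 + odds)  # convert odds back to a probability

# Four hypothetical evidence categories (placeholder values only):
pairs = [(0.9, 0.3), (0.8, 0.5), (0.7, 0.6), (0.95, 0.2)]
print(round(posterior_probability(pairs), 3))  # → 0.964
```

Starting from even prior odds, as this analysis does by treating the hypotheses symmetrically, the posterior is driven entirely by how well each evidence category fits one hypothesis relative to the other.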

AI Learns to Think
The results were striking. Across the five independent AI platforms tested, Bayesian aggregation consistently favored design over naturalism (see table below). Based on the AI systems’ own probability estimates and reasoning, a designer-based origin fit the evidence better than the naturalistic alternative—and by a substantial margin.
When asked why this conclusion differed from the AI's initial default support for naturalistic origins, the explanation was revealing. ChatGPT clarified that the probabilities used were:
“… meant as illustrative estimates for your exercise. They were not based on empirical measurement or consensus data.”
When pressed further—specifically on whether the probabilities were arbitrary—the response was more precise:
“The probabilities reflected the strength of the evidence relative to each hypothesis, rather than consensus opinion.”
In other words, when consensus assumptions were set aside and the focus shifted to how well the evidence actually fit each competing explanation, a different conclusion emerged.

AI Learns that Evidence Trumps Consensus
This leads to the central point. When AI is required to reason—to move beyond repeating what it has been trained to say and instead evaluate evidence on its own terms—it arrives at a conclusion that diverges from the cultural consensus embedded in its training. That alone should give us pause.
If an AI system, operating without personal bias or existential stake, finds that the totality of evidence aligns more closely with design than with blind naturalism, then perhaps the default assumptions we have inherited deserve serious re-examination. Given the importance of this question—and its power to shape meaning, purpose, and direction in our lives—it may be time for us to do what AI was asked to do here: suspend reflexive deference to consensus, examine the evidence carefully, and reason our way toward a well-grounded conclusion.

Ronald D. Fritz, PhD, is a retired research statistician whose career spanned 27 years. Before entering the field of statistics, he worked as an engineer and engineering manager in the defense industry. He earned his doctorate in Industrial Engineering, with a minor in Mathematical Statistics, from Clemson University, where he was honored as a Dean’s Scholar.
Dr. Fritz served as a consulting statistician across a broad range of industries, culminating in a 12-year role as a global statistical resource at PepsiCo. During his time at PepsiCo, he led significant research on gluten contamination in oats and its relationship to celiac disease, publishing several articles on the subject.
In retirement, Dr. Fritz developed a deep interest in creation science, sparked by a visit to the Creation Museum in Petersburg, Kentucky. As he delved into the topic, he shared his findings with his pastor, which led to an invitation to speak at their church. This initial presentation opened the door to further speaking engagements at churches throughout the region.
Dr. Fritz has been married for 35 years to his wife, Mitzie. They live in the mountain community of Bee Log, North Carolina, within sight of the church where they were married and now worship. In his free time, Dr. Fritz tends a small chestnut orchard on their property, working to revive what was once a cherished local delicacy. The couple has two adult children.



Comments
Dr Fritz … this mirrors quite well my own recent experience, especially with the updated Google Gemini version. It used to take me pages of back and forth to get a Gemini “concession.” Now, reflecting your own 99% result, I find that when the AI is asked to evaluate strictly on evidence and logic (otherwise, forget it, it simply mirrors the consensus and will battle endlessly to “prove” it), it sides with the Creationist position, even on things I myself find problematic (at least in my own mind). For instance, it gave me the geologic column of the fossil record as flood “sorting” WITHOUT prompting, as the probable explanation for why the record looks the way it does. I have kept a whole series of surprising conversations that come to Creationist answers INDEPENDENTLY when told to weigh only data and strict logic. (My results are similar, though not quite so one-sided, with ChatGPT.) I LOVED this article. Thank you for doing it.
John and Ron, plan to attend the CRS conference this year? I’m involved in planning one of the workshops and would like to explore the possibility of a short segment on this.
👍
Hello JSWan,
Thank you for considering a short segment on this topic. I regret I will not be able to attend the conference this year; however, I would be glad to support the effort remotely in any way you think would be helpful.
I’m continuing to develop my research in this area and have recently completed a Bayesian aggregation based on ChatGPT’s top twelve lines of evidence for naturalistic and designed origins, along with its relative probabilities of observing that evidence under each framework. The resulting model currently yields a probability of 0.998 for designed origins versus 0.002 for naturalistic origins. And this assessment accepts the standard evidentiary claims for naturalism at face value. My next phase of work will critically evaluate those claims and challenge ChatGPT regarding its probabilities for them.
If there is an opportunity for me to contribute to a segment—whether through background research, written material, virtual participation, or other support—please feel free to reach out. I can be reached directly through LinkedIn.
Ron