# Darwinian Phylogenetic Tools Are Mathematically Flawed

Many evolutionists use software tools to construct evolutionary trees from genetic data. Two mathematicians have just reported in *Science*^{1} that several popular “tree-building” algorithms can give misleading results:

Markov chain Monte Carlo (MCMC) algorithms play a critical role in the Bayesian approach to phylogenetic inference. We present a theoretical analysis of the rate of convergence of many of the widely used Markov chains. For *N* characters generated from a uniform mixture of two trees, we prove that the Markov chains take an exponentially long (in *N*) number of iterations to converge to the posterior distribution. Nevertheless, the likelihood plots for sample runs of the Markov chains deceivingly suggest that the chains converge rapidly to a unique tree. Our results rely on novel mathematical understanding of the log-likelihood function on the space of phylogenetic trees. The practical implications of our work are that Bayesian MCMC methods can be misleading when the data are generated from a mixture of trees. Thus, in cases of data containing potentially conflicting phylogenetic signals, phylogenetic reconstruction should be performed separately on each signal. (Emphasis added in all quotes.)

Will this workaround cure all problems, though? Only for small data sets, maybe. The more data there are, the less feasible the task becomes:

For small trees one can hope to overcome the slow convergence by using multiple starting states. However, mixtures coming from large trees may contain multiple species subsets where one tree has T1 as an induced subtree and the other has T2. If there are *k* such subsets, then about 15^{k} random starting points will be needed. Thus if there are 10 disagreement subsets, then 15^{10} random starting points will be needed in order to sample from the posterior distribution.
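As a quick check on the figures in the passage above, a short Python sketch can evaluate 15^{k} for a few values of *k*. The function name `starting_points` and the sample values of *k* other than 10 are illustrative choices, not taken from the paper.

```python
def starting_points(k, per_subset=15):
    """Approximate number of random starting points for k disagreement subsets.

    Follows the quoted rule of thumb of about 15**k; per_subset=15 is the
    figure given in the quote, exposed as a parameter only for illustration.
    """
    return per_subset ** k


# Print the count for a few illustrative values of k.
for k in (1, 2, 5, 10):
    print(f"k = {k:2d}: {starting_points(k):,}")

# k = 10 gives 576,650,390,625
```

The count grows by a factor of 15 with every additional disagreement subset, which is why the multiple-starting-point remedy is only offered for small trees.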

That’s over 576 billion. Most tree-building programs try to take shortcuts around the computational hurdles, but these mathematicians are not sure that the heuristic algorithms used are successful in avoiding assumptions that could bias the results. Their paper has proven one way the results can be misleading. Are there others?

In our setting, BMCMC [Bayesian Markov-Chain Monte Carlo] methods fail in a clearly demonstrable manner. We expect that there is a more general class of mixtures where BMCMC methods fail in more subtle ways. These subtle failures may occur for many real-world examples where the Markov chains quickly converge to some distribution other than the desired posterior distribution. Users of BMCMC methods should ideally avoid mixture distributions that are known to produce degenerate behavior in various phylogenetic settings. A good practice is to decompose the data into nonconflicting signals and perform phylogenetic reconstruction separately on each signal. Our work highlights important unresolved questions: how to verify homogeneity of genomic data and what phylogenetic methods can efficiently deal with mixtures.

Thus, they leave some potential gaping loopholes unexplored.
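To make the quoted warning concrete, here is a toy sketch in Python. It is not the authors' construction and involves no phylogenetic trees: it assumes a one-dimensional target that is an equal mixture of two well-separated normal distributions, sampled with a plain random-walk Metropolis chain. The names `log_target` and `metropolis`, the separation of 10, the step size, and the run length are all illustrative assumptions. The point is only that a chain can produce a steady-looking log-density trace while visiting just one of two equally weighted modes.

```python
import math
import random


def log_target(x, sep=10.0):
    """Log-density (up to a constant) of an equal mixture of two
    unit-variance normals centred at -sep and +sep (assumed toy target)."""
    a = -0.5 * (x + sep) ** 2
    b = -0.5 * (x - sep) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(start, steps, step_size=0.5, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


samples = metropolis(start=-10.0, steps=20000)

# Fraction of the run spent near the right-hand mode (the target puts half its mass there).
frac_right = sum(s > 0 for s in samples) / len(samples)

# Average log-density over the second half of the run, a stand-in for a "likelihood plot".
tail = samples[len(samples) // 2:]
mean_logp = sum(log_target(s) for s in tail) / len(tail)

print("fraction of samples in right-hand mode:", frac_right)
print("mean log-density, second half of run:  ", round(mean_logp, 3))
```

Under these assumptions the chain started at the left mode is expected to spend essentially none of its run on the right side, even though half of the target's mass sits there, while the average log-density over the second half of the run looks stable. Restarting from several different points is the small-scale remedy mentioned in the quote.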

^{1}Mossel and Vigoda, “Phylogenetic MCMC Algorithms Are Misleading on Mixtures of Trees,” *Science*, Vol. 309, Issue 5744, pp. 2207-2209, 30 September 2005, DOI: 10.1126/science.1115493.

What this seems to say is that the method might work on closely related organisms, like species of snails, but the more you mix different types of organisms into a tree of common ancestry, the more misleading the results of these popular methods will be. Even with the closely related trees, though, how can one be sure that the answers will not “fail in more subtle ways”? And how do we know that once the smaller trees are assembled, the algorithms won’t mislead horrendously in the final mix?

Most creationists would probably not have qualms about trees of closely related “kinds” of animals, like cats for one, or dogs for another. It is the Darwinian assumption that *everything* is phylogenetically related – cats, pine trees, bacteria, sharks, petunias, turtles, mushrooms, senators – that causes the controversies.

Evolutionists often showcase the printouts from these programs in their scientific papers to lend an air of computational legitimacy to their theories (see the fallacy of statistics). Well, we warned you that evolutionists are bad at math (08/19/2005, 07/25/2002). The only illustration in Darwin’s *Origin of Species* was a phylogenetic tree. Since then, tree-building has become a favorite pastime around the Darwin Temple gamerooms (10/22/2001, 06/13/2003). Impressive as the charts look to the uninformed, they hawk symbolism over substance. This fits Hawkins Theory of Scientific Progress.

After reading this article, and the links to previous ones, how do you feel about that NSF Tree of Life project costing $12 million in tax dollars (10/30/2002)? If you want a better Tree of Life, try God’s: it’s free, it’s honest, and you don’t have to play Monte Carlo to find it.