Minimum Evolution
Minimum evolution is a distance method employed in phylogenetics modeling. It shares with maximum parsimony the aspect of searching for the phylogeny that has the shortest total sum of branch lengths. The theoretical foundations of the minimum evolution (ME) criterion lie in the seminal works of Kidd and Sgaramella-Zonta (1971) and Rzhetsky and Nei (1993). In these frameworks, the molecular sequences from taxa are replaced by a set of measures of their dissimilarity (the so-called "evolutionary distances"), and a fundamental result states that if such distances were unbiased estimates of the ''true evolutionary distances'' between taxa (i.e., the distances one would obtain if all the molecular data from the taxa were available), then the ''true phylogeny'' of the taxa would have an expected length shorter than that of any other possible phylogeny ''T'' compatible with those distances.
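The "evolutionary distances" above are typically estimated from pairwise sequence comparisons under a substitution model. As a minimal sketch of one such estimate, the Python code below computes the classical Jukes-Cantor (JC69) distance between two aligned DNA sequences; the function name and input format are illustrative, not part of any particular ME implementation.

import math

def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor (JC69) evolutionary distance between aligned sequences.

    p is the observed proportion of differing sites; the JC69 correction
    d = -(3/4) * ln(1 - 4p/3) converts it into an estimate of the expected
    number of substitutions per site.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:  # saturation: the JC69 distance is undefined here
        return float("inf")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

print(jc69_distance("ACGTACGT", "ACGTACGA"))  # ~0.137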


Comparison with other methods


Maximum parsimony

It is worth noting a subtle difference between the maximum-parsimony criterion and the ME criterion: while maximum parsimony is based on an abductive heuristic, i.e., the plausibility of the simplest evolutionary hypothesis of taxa with respect to more complex ones, the ME criterion is based on Kidd and Sgaramella-Zonta's conjectures, which were proven true 22 years later by Rzhetsky and Nei. These mathematical results set the ME criterion free from the Occam's razor principle and confer on it a solid theoretical and quantitative basis. Like ME, maximum parsimony becomes an NP-hard problem when trying to find the optimal tree (that is, the one with the fewest total character-state changes). This is why heuristics are often used to select a tree, though they do not guarantee that the selected tree is optimal for the input dataset. The method is often used when very similar sequences are analyzed, as part of the process is locating informative sites in the sequences where a notable number of substitutions can be found. The maximum-parsimony criterion, which uses Hamming distance branch lengths, was shown to be ''statistically inconsistent'' in 1978. This led to an interest in statistically consistent alternatives such as ME.
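Because the criterion relies on Hamming-distance branch lengths, a minimal sketch of that computation is shown below; the function name and inputs are illustrative.

def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

print(hamming_distance("ACGTACGT", "ACCTACGA"))  # 2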


Neighbor joining

Neighbor joining (NJ) may be viewed as a greedy heuristic for the balanced minimum evolution (BME) criterion. Saitou and Nei's 1987 NJ algorithm far predates the BME criterion, which was introduced in 2000; for two decades, researchers used NJ without a firm theoretical basis for why it works. While neighbor joining shares the underlying principle of prioritizing minimal evolutionary change, it is a distance method, in contrast to maximum parsimony, which is a character-based method. Distance methods like neighbor joining are often simpler to implement and more efficient, which has made NJ popular for analyzing especially large datasets where computational speed is critical. Neighbor joining is a relatively fast phylogenetic tree-building method, though its worst-case time complexity is ''O''(''n''³) without heuristic implementations to improve on this. It also accommodates varying rates of evolution across branches, which many other methods do not account for. Neighbor joining is also consistent in the sense that an input distance matrix with little to no error will usually yield an output tree with minimal inaccuracy. However, using simple distance values rather than the full sequence information used in maximum parsimony does entail a loss of information due to the simplification of the problem.
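As a hedged illustration of NJ's greedy step, the sketch below evaluates the standard Q-criterion, Q(i, j) = (n - 2) d(i, j) - Σ_k d(i, k) - Σ_k d(j, k), and returns the pair of taxa minimizing it; the plain nested-list matrix format and function name are illustrative, not any library's API.

def nj_pick_pair(d: list[list[float]]) -> tuple[int, int]:
    """One greedy neighbor-joining step: the pair (i, j) minimizing Q."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    best, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best, best_q = (i, j), q
    return best

# Additive distances on the quartet ((0,1),(2,3))
d = [[0, 5, 9, 9],
     [5, 0, 10, 10],
     [9, 10, 0, 8],
     [9, 10, 8, 0]]
print(nj_pick_pair(d))  # (0, 1) -- tied with (2, 3); the first minimum is kept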


Maximum likelihood

Maximum likelihood contrasts with minimum evolution in that it searches for the tree most likely to have produced the observed data. However, due to the nature of the mathematics involved, it is less accurate with smaller datasets but becomes far less biased as the sample size increases, since its error rate decreases on the order of 1/log(''n''). Minimum evolution is similar but is less accurate with very large datasets. Maximum likelihood is similarly powerful but, overall, much more complicated than UPGMA and other options.


UPGMA

UPGMA is a clustering method. It builds a collection of clusters that are merged step by step until a single cluster containing all taxa is obtained; this merge history is then read backwards to determine the relationships among the groups. It specifically uses an arithmetic mean of the pairwise distances, which makes the clustering more stable. Overall, while it is less powerful than any of the other methods discussed here, it is far simpler to implement. Minimum evolution is more powerful overall but also more complicated to set up, and finding the optimal tree under it is NP-hard.
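The arithmetic-mean update is UPGMA's defining step: when clusters A and B merge, the distance from the merged cluster to any other cluster C is the size-weighted average of d(A, C) and d(B, C). A minimal sketch, with illustrative names, follows.

def upgma_merge_distance(d_ac: float, d_bc: float,
                         size_a: int, size_b: int) -> float:
    """UPGMA distance update: the arithmetic mean over all member pairs,
    i.e. a size-weighted mean of the cluster-to-cluster distances."""
    return (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

# A has 2 members and B has 1; d(A, C) = 4, d(B, C) = 7
print(upgma_merge_distance(4.0, 7.0, 2, 1))  # 5.0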


Statistical consistency

The ME criterion is known to be statistically consistent whenever the branch lengths are estimated via ordinary least squares (OLS) or via linear programming. However, as observed in Rzhetsky and Nei's article, the phylogeny having minimum length under the OLS branch length estimation model may, in some circumstances, have negative branch lengths, which are devoid of biological meaning.

To address this drawback, Pauplin proposed replacing OLS with a particular branch length estimation model known as ''balanced minimum evolution'' (BME). Richard Desper and Olivier Gascuel showed that the BME branch length estimation model ensures both the general statistical consistency of the minimum-length phylogeny and the non-negativity of its branch lengths, whenever the estimated evolutionary distances between taxa satisfy the triangle inequality.

Le Sy Vinh and Arndt von Haeseler have shown, by means of massive and systematic simulation experiments, that the accuracy of the ME criterion under the BME branch length estimation model is by far the highest among distance methods, and is not inferior to that of alternative criteria based, e.g., on maximum likelihood or Bayesian inference. Moreover, as shown by Daniele Catanzaro, Martin Frohn and Raffaele Pesenti, the minimum-length phylogeny under the BME branch length estimation model can be interpreted as the (Pareto-optimal) consensus tree among concurrent minimum-entropy processes encoded by a forest of ''n'' phylogenies rooted on the ''n'' analyzed taxa. This information theory-based interpretation is conjectured to be shared by all distance methods in phylogenetics.

François Denis and Olivier Gascuel proved that the minimum evolution principle is not consistent under weighted least squares (WLS) or generalized least squares (GLS). They showed that in OLS models, where all weights are equal, an algorithm called EDGE_LENGTHS can compute the lengths of two edges, 1''u'' and 2''u'', without using the distances δ_{ij} with ''i'', ''j'' ∉ {1, 2}. This property does not hold in WLS or GLS models, and without it the ME principle is not consistent in those models.
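As a hedged illustration of the property they exploit, consider a cherry of taxa 1 and 2 attached at internal node u. Under OLS, the classical estimate of the edge length (the same formula used by neighbor joining for newly created edges) is

:\hat{l}(1u) = \frac{d_{12}}{2} + \frac{1}{2(n-2)}\left(\sum_{k \neq 1,2} d_{1k} - \sum_{k \neq 1,2} d_{2k}\right)

and symmetrically for \hat{l}(2u). Both expressions involve only d_{12} and the distances d_{1k}, d_{2k}, never a distance δ_{ij} with i, j ∉ {1, 2}; this is exactly the property that fails under WLS and GLS.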


Algorithmic aspects

The "minimum evolution problem" (MEP), in which a minimum-summed-length phylogeny is derived from a set of sequences under the ME criterion, is said to be
NP-hard In computational complexity theory, a computational problem ''H'' is called NP-hard if, for every problem ''L'' which can be solved in non-deterministic polynomial-time, there is a polynomial-time reduction from ''L'' to ''H''. That is, assumi ...
.Rzhetsky, A., and Nei, M. (1993). Theoretical foundation of the minimum-evolution method of phylogenetic inference. ''Molecular Biology and Evolution'', 10(5), 1073–1095.Semple, C., and Steel, M. (2003). ''Phylogenetics''. Oxford University Press. The "balanced minimum evolution problem" (BMEP), which uses the newer BME criterion, is APX-hard. A number of exact algorithms solving BMEP have been described. The best known exact algorithm remains impractical for more than a dozen taxa, even with multiprocessing. There is only one approximation algorithm with proven error bounds, published in 2012. In practical use, BMEP is overwhelmingly implemented by heuristic search. The basic, aforementioned neighbor-joining algorithm implements a greedy version of BME. FastME, the "state-of-the-art", starts with a rough tree then improves it using a set of topological moves such as Nearest Neighbor Interchanges (NNI). Compared to NJ, it is about as fast and more accurate. FastME operates on the Balanced Minimum Evolution principle, which calculates tree length using a weighted linear function of all pairwise distances. The BME score for a given topology is expressed as: :L = \sum_ w_ \cdot d_ where d_ represents the evolutionary distance between taxa i and j, and w_ is a topology-dependent weight that balances each pair’s contribution. This approach enables more accurate reconstructions than greedy algorithms like NJ. The algorithm improves tree topology through local rearrangements, primarily Subtree Prune and Regraft (SPR) and NNI operations. At each step, it checks if a rearranged tree has a lower BME score. If so, the change is retained. This iterative refinement enables FastME to converge toward near-optimal solutions efficiently, even for large datasets. Simplified pseudocode of FastME:
Input: Distance matrix D for n taxa
Output: Tree T minimizing BME score

1. Construct initial tree T using Neighbor Joining
2. Repeat until no improvement:
    a. For each possible SPR or NNI move:
        i. Apply the move to produce new tree T'
        ii. Compute BME score of T'
        iii. If score(T') < score(T), set T = T'
3. Return final tree T
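To make the BME score concrete, the Python sketch below computes Pauplin's balanced weights, w_ij = 2^(1 - p_ij), where p_ij is the number of edges on the path between leaves i and j in an unrooted binary topology; the adjacency-map tree encoding and names are illustrative, not FastME's actual data structures.

from collections import deque

def bme_score(adj: dict, leaves: list, d: dict) -> float:
    """Balanced minimum evolution score of a tree topology.

    adj:    adjacency map of the unrooted binary tree (node -> neighbors)
    leaves: leaf labels, in a fixed order
    d:      pairwise evolutionary distances, keyed as d[(i, j)] with i before j
    Uses Pauplin's weights w_ij = 2**(1 - p_ij), with p_ij the topological
    (edge-count) distance between leaves i and j.
    """
    def edge_counts_from(src):
        # Breadth-first search: number of edges from src to every node
        dist, queue = {src: 0}, deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return dist

    score = 0.0
    for idx, i in enumerate(leaves):
        p = edge_counts_from(i)
        for j in leaves[idx + 1:]:
            score += 2.0 ** (1 - p[j]) * d[(i, j)]
    return score

# Quartet ((A,B),(C,D)) with internal nodes u and v
adj = {"A": ["u"], "B": ["u"], "C": ["v"], "D": ["v"],
       "u": ["A", "B", "v"], "v": ["C", "D", "u"]}
d = {("A", "B"): 5, ("A", "C"): 9, ("A", "D"): 9,
     ("B", "C"): 10, ("B", "D"): 10, ("C", "D"): 8}
print(bme_score(adj, ["A", "B", "C", "D"], d))  # 16.0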
Simulations reported by Desper and Gascuel demonstrate that FastME consistently outperforms NJ in topological accuracy, particularly when evolutionary rates vary or distances deviate from strict additivity. It has also been used successfully on datasets with over 1,000 taxa (Desper and Gascuel 2005). Like most distance-based methods, BME assumes that the input distances are additive. When this assumption does not hold, due to noise, unequal rates, or other violations, the resulting trees may still be close to optimal, but accuracy can be affected. In addition to FastME, metaheuristic methods such as genetic algorithms and simulated annealing have also been used to explore tree topologies under the minimum evolution criterion, particularly for very large datasets where traditional heuristics may struggle (Ali et al. 2009).
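As a rough sketch of how simulated annealing can explore topologies under this criterion, the code below applies the standard Metropolis acceptance rule to candidate trees' scores; score and propose_move stand in for a BME scoring routine and a random NNI/SPR rearrangement, and are assumptions rather than any published implementation.

import math
import random

def anneal(tree, score, propose_move, t0=1.0, cooling=0.95, steps=1000):
    """Simulated annealing over tree topologies (lower score is better).

    Worse candidates are accepted with probability exp(-delta / t), which
    lets the search escape local optima that pure hill climbing cannot.
    """
    current, current_score = tree, score(tree)
    best, best_score = current, current_score
    t = t0
    for _ in range(steps):
        candidate = propose_move(current)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        if delta < 0 or random.random() < math.exp(-delta / t):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
        t *= cooling  # geometric cooling schedule
    return best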


See also

* Least squares inference in phylogeny
* Occam's razor


References

* Ali, R. A., Dehne, F., Rau-Chaplin, A., and Sack, J. R. (2009). A novel metaheuristic for constructing large phylogenetic trees. ''Proceedings of the 2009 ACM Symposium on Applied Computing'', 1080–1084.
* Desper, R., and Gascuel, O. (2005). The minimum evolution distance-based approach to phylogenetic inference. In Gascuel, O. (ed.), ''Mathematics of Evolution and Phylogeny''. Oxford University Press, pp. 1–28.
* Rzhetsky, A., and Nei, M. (1993). Theoretical foundation of the minimum-evolution method of phylogenetic inference. ''Molecular Biology and Evolution'', 10(5), 1073–1095.
* Semple, C., and Steel, M. (2003). ''Phylogenetics''. Oxford University Press.

