Tree rearrangements are deterministic algorithms that search for an optimal phylogenetic tree structure. They can be applied to any set of data that is naturally arranged into a tree, but are most often used in computational phylogenetics, especially in maximum parsimony and maximum likelihood searches of phylogenetic trees, which seek to identify the one tree among many possible trees that best explains the evolutionary history of a particular gene or species.

Basic tree rearrangements

[Figures: nearest neighbor interchange (NNI); subtree pruning and regrafting (SPR); tree bisection and reconnection (TBR)]

The simplest tree rearrangement, known as nearest-neighbor interchange (NNI), exchanges the connectivity of four subtrees within the main tree. Because there are three possible ways of connecting four subtrees, and one is the original connectivity, each interchange creates two new trees. Exhaustively searching the possible nearest neighbors for each possible set of subtrees is the slowest but most thorough way of performing this search.

An alternative, more wide-ranging search, subtree pruning and regrafting (SPR), selects and removes a subtree from the main tree and reinserts it elsewhere on the main tree, creating a new node. Finally, tree bisection and reconnection (TBR) detaches a subtree from the main tree at an interior node and then attempts all possible connections between edges of the two trees thus created. The increasing complexity of the rearrangement technique correlates with increasing computational time required for the search, although not necessarily with better performance.

SPR can be further divided into unrooted SPR (uSPR) and rooted SPR (rSPR). uSPR is applied to unrooted trees: break any edge, then join one end of the broken edge (selected arbitrarily) to any other edge in the tree. rSPR is applied to rooted trees*: break any edge except the edge leading to the root node, then join the end of the broken edge that is furthest from the root to any other edge of the tree.

* In this example the root of the tree is marked by a node of degree one, meaning that all nodes in the tree have either degree 1 or degree 3. An alternative approach, used in Bordewich and Semple, is to consider the root node to have degree 2 and to have a special rule for rSPR.
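The nearest-neighbor interchange described above can be illustrated with a minimal sketch. It assumes an unrooted binary tree stored as an adjacency map from node label to a set of neighbouring labels; the helper names `build`, `swap` and `nni_neighbours` are hypothetical and chosen only for this illustration.

```python
def build(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def swap(adj, u, v, x, y):
    """Copy adj, moving subtree x from u to v and subtree y from v to u."""
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    new[u].remove(x)
    new[u].add(y)
    new[v].remove(y)
    new[v].add(x)
    new[x].remove(u)
    new[x].add(v)
    new[y].remove(v)
    new[y].add(u)
    return new


def nni_neighbours(adj):
    """Return the two alternative trees for every internal edge (u, v)."""
    result = []
    for u in sorted(adj):
        for v in sorted(adj[u]):
            # An internal edge joins two nodes of degree 3; u < v visits it once.
            if u < v and len(adj[u]) == 3 and len(adj[v]) == 3:
                a, b = sorted(adj[u] - {v})  # subtrees hanging off u
                c, d = sorted(adj[v] - {u})  # subtrees hanging off v
                result.append(swap(adj, u, v, b, c))  # (a,c | b,d)
                result.append(swap(adj, u, v, b, d))  # (a,d | b,c)
    return result


# Quartet ((A,B),(C,D)) with internal nodes n1 and n2.
quartet = build([("A", "n1"), ("B", "n1"), ("n1", "n2"), ("C", "n2"), ("D", "n2")])
first, second = nni_neighbours(quartet)
print(sorted(first["n1"]))   # ['A', 'C', 'n2']
print(sorted(second["n1"]))  # ['A', 'D', 'n2']
```

An unrooted binary tree with n leaves has n - 3 internal edges, so this enumeration returns 2(n - 3) neighbouring trees.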
The number of SPR moves between two rooted trees, or of TBR moves between two unrooted trees, can be calculated by producing a maximum agreement forest comprising rooted or unrooted trees, respectively. This problem is NP-hard but fixed-parameter tractable.
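A single rSPR move, as described in this section, can be sketched as follows. The sketch assumes rooted binary trees written as nested tuples with unique string leaf labels, so the edge leading to the root is implicit above the outermost tuple. The function names are hypothetical. It enumerates trees one move away and does not compute the rSPR distance between two trees.

```python
def subtrees(t):
    """Yield every subtree of t, including t itself."""
    yield t
    if isinstance(t, tuple):
        for child in t:
            yield from subtrees(child)


def prune(t, s):
    """Remove subtree s from t and suppress the node left with one child."""
    if t == s:
        return None
    if not isinstance(t, tuple):
        return t
    left, right = (prune(child, s) for child in t)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)


def regraft(t, s, target):
    """Attach s on the edge directly above the subtree `target` of t."""
    if t == target:
        return (t, s)
    if not isinstance(t, tuple):
        return t
    return tuple(regraft(child, s, target) for child in t)


def canonical(t):
    """Order children consistently so that equal topologies compare equal."""
    if not isinstance(t, tuple):
        return t
    return tuple(sorted((canonical(child) for child in t), key=repr))


def rspr_neighbours(t):
    """All distinct rooted trees reachable from t by one rSPR move."""
    found = set()
    for s in subtrees(t):
        if s == t:  # the edge leading to the root is never broken
            continue
        rest = prune(t, s)
        for target in subtrees(rest):
            found.add(canonical(regraft(rest, s, target)))
    found.discard(canonical(t))  # drop moves that recreate the original tree
    return found


three_leaf = (("A", "B"), "C")
neighbours = rspr_neighbours(three_leaf)
print(len(neighbours))  # 2
```

For three leaves there are only three rooted binary topologies, so the two trees returned are exactly the alternatives to the starting tree.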

Tree fusion

The simplest type of tree fusion begins with two trees already identified as near-optimal; thus, they most likely have the majority of their nodes correct but may fail to resolve individual tree "leaves" properly; for example, the separation ((A,B),(C,D)) at a branch tip versus ((A,C),(B,D)) may be unresolved. Tree fusion swaps these two solutions between two otherwise near-optimal trees. Variants of the method use standard genetic algorithms with a defined objective function to swap high-scoring subtrees into main trees that are high-scoring overall.

Sectorial search

An alternative strategy is to detach part of the tree (which can be selected at random or by a more strategic approach) and to perform TBR, SPR or NNI on this sub-tree. The optimized sub-tree can then be placed back on the main tree, hopefully improving the parsimony score (p-score) (Goloboff, Pablo A. (1999). "Analyzing Large Data Sets in Reasonable Times: Solutions for Composite Optima". Cladistics 15 (4): 415–428. doi:10.1006/clad.1999.0122. PMID 34902941).

Tree drifting

To avoid entrapment in local optima, a 'simulated annealing' approach can be used, whereby the algorithm is occasionally permitted to entertain sub-optimal candidate trees, with a probability related to how far they are from the optimum.
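The acceptance step can be sketched with a Metropolis-style rule of the kind used in generic simulated annealing. This is an illustrative assumption, not necessarily the rule used by any particular tree-drifting implementation. The function name `accept`, the `temperature` parameter and the convention that lower scores are better (for example, parsimony steps) are all hypothetical choices for the example.

```python
import math
import random


def accept(current_score, candidate_score, temperature, rng=random):
    """Decide whether to move to a candidate tree; lower scores are better.

    Improvements and ties are always accepted. A worse candidate is accepted
    with probability exp(-delta / temperature), so trees far from the current
    score are accepted rarely and slightly worse trees more often.
    """
    if candidate_score <= current_score:
        return True
    delta = candidate_score - current_score
    return rng.random() < math.exp(-delta / temperature)


class FixedDraw:
    """Stand-in random source that always draws 0.5 (for a repeatable demo)."""

    def random(self):
        return 0.5


draw = FixedDraw()
print(accept(100, 99, 1.0, draw))    # True: the candidate is better
print(accept(100, 101, 10.0, draw))  # True: exp(-0.1) is about 0.905 > 0.5
print(accept(100, 101, 1.0, draw))   # False: exp(-1) is about 0.368 < 0.5
```

Lowering `temperature` over the course of a search makes the rule progressively stricter, which is the usual annealing schedule.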

Tree fusing

Once a range of equally optimal trees has been gathered, it is often possible to find a better tree by combining the "good bits" of separate trees. Sub-groups with an identical composition but different topology can be switched and the resultant trees evaluated.
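The switching of sub-groups can be sketched as follows, again assuming rooted trees written as nested tuples with unique string leaf labels. The function names are hypothetical, and scoring of the resulting trees is left to whatever optimality criterion the search uses.

```python
def leaves(t):
    """Set of leaf labels below t; leaves are strings, internal nodes tuples."""
    if isinstance(t, str):
        return frozenset([t])
    return frozenset().union(*(leaves(child) for child in t))


def canonical(t):
    """Order children consistently so that equal topologies compare equal."""
    if isinstance(t, str):
        return t
    return tuple(sorted((canonical(child) for child in t), key=repr))


def clades(t):
    """Map each internal node's leaf set to the subtree rooted at that node."""
    out = {}
    if isinstance(t, tuple):
        out[leaves(t)] = t
        for child in t:
            out.update(clades(child))
    return out


def replace(t, old, new):
    """Return a copy of t with subtree `old` replaced by `new`."""
    if t == old:
        return new
    if isinstance(t, tuple):
        return tuple(replace(child, old, new) for child in t)
    return t


def fusing_candidates(recipient, donor):
    """Trees made by importing one differently resolved sub-group from donor."""
    r, d = clades(recipient), clades(donor)
    everything = leaves(recipient)
    return [
        replace(recipient, r[group], d[group])
        for group in r.keys() & d.keys()  # same composition in both trees
        if group != everything and canonical(r[group]) != canonical(d[group])
    ]


recipient = ((("A", "B"), ("C", "D")), (("E", "F"), "G"))
donor = ((("A", "C"), ("B", "D")), (("E", "G"), "F"))
candidates = fusing_candidates(recipient, donor)
print(len(candidates))  # 2
```

Here the two trees share the groups {A, B, C, D} and {E, F, G} but resolve each differently, so two candidate trees are produced, one per exchanged group.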
