Phylogenetic Signal

In biology, phylogenetics is the study of the evolutionary history of life using observable characteristics of organisms (or genes), which is known as phylogenetic inference. It infers the relationships among organisms based on empirical data and observed heritable traits of DNA sequences, protein amino acid sequences, and morphology. The result is a phylogenetic tree: a diagram depicting the hypothetical relationships among the organisms, reflecting their inferred evolutionary history.

The tips of a phylogenetic tree represent the observed entities, which can be living taxa or fossils. A phylogenetic diagram can be rooted or unrooted. A rooted tree diagram indicates the hypothetical common ancestor of the taxa represented on the tree. An unrooted tree diagram (a network) makes no assumption about directionality of character state transformation, and does not show the origin or "root" of the taxa in question. In addition to their use for inferring phylogenetic patterns among taxa, phylogenetic analyses are often employed to represent relationships among genes or individual organisms. Such uses have become central to understanding biodiversity, evolution, ecology, and genomes. Phylogenetics is a component of systematics that uses similarities and differences of the characteristics of species to interpret their evolutionary relationships and origins.

In the field of cancer research, phylogenetics can be used to study the clonal evolution of tumors and molecular chronology, predicting and showing how cell populations vary throughout the progression of the disease and during treatment, using whole genome sequencing techniques. Because cancer cells reproduce mitotically, the evolutionary processes behind cancer progression are quite different from those in sexually reproducing species. These differences manifest in several areas: the types of aberrations that occur, the rates of mutation, the high heterogeneity (variability) of tumor cell subclones, and the absence of genetic recombination.

Phylogenetics can also aid in drug design and discovery. Phylogenetics allows scientists to organize species and can show which species are likely to have inherited particular traits that are medically useful, such as producing biologically active compounds (those that have effects on the human body). For example, in drug discovery, venom-producing animals are particularly useful. Venoms from these animals have yielded several important drugs, e.g., ACE inhibitors and Prialt (ziconotide). To find new venoms, scientists turn to phylogenetics to screen for closely related species that may have the same useful traits. A phylogenetic tree can show venomous species of fish together with related fish that may also carry the trait. Using this approach, biologists are able to identify the fish, snake, and lizard species that may be venomous.

In forensic science, phylogenetic tools are useful to assess DNA evidence for court cases. Phylogenetic analysis has been used in criminal trials to exonerate individuals or hold them accountable. HIV forensics uses phylogenetic analysis to track the differences in HIV genes and determine the relatedness of two samples. HIV forensics has limitations: it cannot be the sole proof of transmission between individuals, and a phylogenetic analysis that shows transmission relatedness does not indicate the direction of transmission.


Taxonomy and classification

Taxonomy is the identification, naming, and classification of organisms. The Linnaean classification system, developed in the 1700s by Carolus Linnaeus, is the foundation for modern classification methods. Linnaean classification traditionally relied on the phenotypes or physical characteristics of organisms to group species. With the emergence of biochemistry, classifications of organisms are now often based on DNA sequence data or a combination of DNA and morphology. Many systematists contend that only monophyletic taxa should be recognized as named groups.

The degree to which classification depends on inferred evolutionary history differs depending on the school of taxonomy: phenetics ignores phylogenetic speculation altogether, trying to represent the similarity between organisms instead; cladistics (phylogenetic systematics) tries to reflect phylogeny in its classifications by only recognizing groups based on shared, derived characters (synapomorphies); evolutionary taxonomy tries to take into account both the branching pattern and "degree of difference" to find a compromise between inferred patterns of common ancestry and evolutionary distinctness.


Inference of a phylogenetic tree

The usual methods of phylogenetic inference are computational approaches that implement an optimality criterion, such as parsimony, maximum likelihood (ML), and MCMC-based Bayesian inference. All of these depend upon an implicit or explicit mathematical model describing the relative probabilities of character state transformation within and among the characters observed.

Phenetics, popular in the mid-20th century but now largely obsolete, used distance matrix-based methods to construct trees based on overall similarity in morphology or similar observable traits, which was often assumed to approximate phylogenetic relationships. Neighbor joining is a phenetic method that is often used for building similarity trees for DNA barcodes.

Prior to 1950, phylogenetic inferences were generally presented as narrative scenarios. Such methods were often ambiguous and lacked explicit criteria for evaluating alternative hypotheses.
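
To make the idea of an optimality criterion concrete, the following is a minimal sketch of parsimony scoring for a single character, using Fitch's bottom-up set rule on a fixed, rooted binary tree. The nested-tuple tree encoding, the function name `fitch_score`, and the toy taxa and states are assumptions made for illustration; they do not come from any particular software package.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   -- nested 2-tuples with leaf names (strings) at the tips (illustrative encoding)
    states -- dict mapping each leaf name to its observed character state
    """
    def visit(node):
        # A leaf contributes its observed state and no changes.
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        shared = left_set & right_set
        if shared:
            # The children can agree on a state: no extra change is needed here.
            return shared, left_cost + right_cost
        # The children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical data: one nucleotide site scored on two candidate topologies.
site = {"human": "A", "chimp": "A", "mouse": "G", "rat": "G"}
tree_a = (("human", "chimp"), ("mouse", "rat"))
tree_b = (("human", "mouse"), ("chimp", "rat"))

print(fitch_score(tree_a, site), fitch_score(tree_b, site))
```

Under the parsimony criterion the topology with the lower score is preferred for this site; a real analysis would sum the score over all characters and search over many candidate trees.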


Impacts of taxon sampling

In phylogenetic analysis, taxon sampling selects a small group of exemplar taxa to infer the evolutionary history of a clade. This process is also known as stratified sampling or clade-based sampling. Judicious taxon sampling is important because resources to compare and analyze every species within a diverse clade are limited, as is the computational capacity of phylogenetic software. Poor taxon sampling may result in incorrect phylogenetic inferences. Long branch attraction, in which unrelated branches are incorrectly grouped by shared, homoplastic nucleotide sites, is a theoretical cause of inaccuracy.

There is debate over whether increasing the number of taxa sampled improves phylogenetic accuracy more than increasing the number of genes sampled per taxon. Differences in each method's sampling affect the number of nucleotide sites used in a sequence alignment, which may contribute to the disagreement. For example, phylogenetic trees constructed from a larger number of total nucleotides are generally more accurate, as supported by the bootstrapping replicability of phylogenetic trees under random sampling.

The graphic presented in ''Taxon Sampling, Bioinformatics, and Phylogenomics'' compares the correctness of phylogenetic trees generated using fewer taxa and more sites per taxon (x-axis) with that of trees generated using more taxa and fewer sites per taxon (y-axis). With fewer taxa, more genes are sampled within the taxonomic group; with more taxa added to the sampling group, fewer genes are sampled. Each method has the same total number of nucleotide sites sampled. The dotted line represents 1:1 accuracy between the two sampling methods. Most of the plotted points lie below the dotted line, which indicates a tendency toward increased accuracy when sampling fewer taxa with more sites per taxon.

The research used four different phylogenetic tree construction methods to test the theory: neighbor-joining (NJ), minimum evolution (ME), unweighted maximum parsimony (MP), and maximum likelihood (ML). With the majority of methods, sampling fewer taxa with more sites per taxon demonstrated higher accuracy. Generally, when aligning a roughly equal number of total nucleotide sites, sampling more genes per taxon has higher bootstrapping replicability than sampling more taxa. However, unbalanced datasets within genomic databases make it increasingly difficult to increase the number of genes compared per taxon in uncommonly sampled organisms.
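
Bootstrapping replicability, as mentioned above, rests on resampling alignment columns with replacement and checking how often a grouping is recovered. The sketch below illustrates that resampling step only. As a deliberately simplified stand-in for rebuilding a full tree, it asks how often one taxon remains the nearest neighbour of another by uncorrected p-distance. The function names, the toy alignment, and the nearest-neighbour shortcut are all assumptions for illustration, not the procedure of the cited study.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_neighbour(alignment, focal):
    """Taxon with the smallest p-distance to the focal taxon."""
    others = [t for t in alignment if t != focal]
    return min(others, key=lambda t: p_distance(alignment[focal], alignment[t]))


def bootstrap_support(alignment, focal, partner, replicates=200, seed=0):
    """Fraction of replicates in which `partner` is the nearest neighbour of `focal`."""
    rng = random.Random(seed)
    hits = sum(
        nearest_neighbour(bootstrap_alignment(alignment, rng), focal) == partner
        for _ in range(replicates)
    )
    return hits / replicates


# Hypothetical four-taxon alignment with ten sites per taxon.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TCGAACCTGG",
    "D": "TGGAACCTGC",
}
support = bootstrap_support(toy, "A", "B")
print(f"support for A+B: {support:.2f}")
```

In a real analysis each replicate would be passed to a tree-building method, and support would be counted per clade of the resulting trees rather than per nearest-neighbour pair.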


History


Overview

The term "phylogeny" derives from a German term introduced by Haeckel in 1866, and the Darwinian approach to classification became known as the "phyletic" approach. It can be traced back to Aristotle, who wrote in his ''Posterior Analytics'', "We may assume the superiority ceteris paribus [other things being equal] of the demonstration which derives from fewer postulates or hypotheses."


Ernst Haeckel's recapitulation theory

The modern concept of phylogenetics evolved primarily as a disproof of a previously widely accepted theory. During the late 19th century, Ernst Haeckel's recapitulation theory, or "biogenetic fundamental law", was widely popular. It was often expressed as "ontogeny recapitulates phylogeny", i.e. the development of a single organism during its lifetime, from germ to adult, successively mirrors the adult stages of successive ancestors of the species to which it belongs. This theory has long been rejected. Instead, ontogeny itself evolves: the phylogenetic history of a species cannot be read directly from its ontogeny, as Haeckel thought would be possible, but characters from ontogeny can be (and have been) used as data for phylogenetic analyses; the more closely related two species are, the more apomorphies their embryos share.


Timeline of key points

* 14th century, ''lex parsimoniae'' (parsimony principle), William of Ockham, English philosopher, theologian, and Franciscan friar, although the idea goes back to Aristotle as a precursor concept. Ockham introduced the concept of Occam's razor, the problem-solving principle that recommends searching for explanations constructed with the smallest possible set of elements. Though he did not use these exact words, the principle can be summarized as "Entities must not be multiplied beyond necessity." The principle advocates that when presented with competing hypotheses about the same prediction, one should prefer the one that requires the fewest assumptions.
* 1763, Bayesian probability, Rev. Thomas Bayes, a precursor concept. Bayesian probability began a resurgence in the 1950s, allowing scientists in the computing field to pair traditional Bayesian statistics with other more modern techniques. It is now used as a blanket term for several related interpretations of probability as an amount of epistemic confidence.
* 18th century, Pierre Simon (Marquis de Laplace), perhaps first to use ML (maximum likelihood), precursor concept. His work gave rise to the Laplace distribution, which can be directly linked to least absolute deviations.
* 1809, evolutionary theory, ''Philosophie Zoologique'', Jean-Baptiste de Lamarck, precursor concept, foreshadowed in the 17th and 18th centuries by Voltaire, Descartes, and Leibniz, with Leibniz even proposing evolutionary changes to account for observed gaps, suggesting that many species had become extinct, others had transformed, and different species that share common traits may at one time have been a single race; also foreshadowed by some early Greek philosophers such as Anaximander in the 6th century BC and the atomists of the 5th century BC, who proposed rudimentary theories of evolution.
* 1837, Darwin's notebooks show an evolutionary tree.
* 1840, American geologist Edward Hitchcock published what is considered to be the first paleontological "Tree of Life". Many critiques, modifications, and explanations would follow.
* 1843, distinction between homology and analogy (the latter now referred to as homoplasy), Richard Owen, precursor concept. Homology is the term used to characterize the similarity of features that can be parsimoniously explained by common ancestry. Homoplasy is the term used to describe a feature that has been gained or lost independently in separate lineages over the course of evolution.
* 1858, paleontologist Heinrich Georg Bronn (1800–1862) published a hypothetical tree illustrating the paleontological "arrival" of new, similar species following the extinction of an older species. Bronn did not propose a mechanism responsible for such phenomena. Precursor concept.
* 1858, elaboration of evolutionary theory, Darwin and Wallace, also in ''Origin of Species'' by Darwin the following year, precursor concept.
* 1866, Ernst Haeckel first publishes his phylogeny-based evolutionary tree, precursor concept. Haeckel introduces the now-disproved recapitulation theory. He introduced the term "Cladus" as a taxonomic category just below subphylum.
* 1893, Dollo's Law of Character State Irreversibility, precursor concept. Dollo's Law of Irreversibility states that "an organism never comes back exactly to its previous state due to the indestructible nature of the past, it always retains some trace of the transitional stages through which it has passed."
* 1912, ML (maximum likelihood) recommended, analyzed, and popularized by Ronald Fisher, precursor concept. Fisher is one of the main contributors to the early 20th-century revival of Darwinism, and has been called the "greatest of Darwin's successors" for his contributions to the revision of the theory of evolution and his use of mathematics to combine Mendelian genetics and natural selection in the 20th century "modern synthesis".
* 1921, Tillyard uses the term "phylogenetic" and distinguishes between archaic and specialized characters in his classification system.
* 1940, Lucien Cuénot coined the term "clade": "''terme nouveau de clade'' (''du grec κλάδος, branche'')" [new term clade (from the Greek word ''klados'', meaning branch)]. He used it for evolutionary branching.
* 1947, Bernhard Rensch introduced the term ''Kladogenesis'' in his German book ''Neuere Probleme der Abstammungslehre: Die transspezifische Evolution'', translated into English in 1959 as ''Evolution Above the Species Level'' (still using the same spelling).
* 1949, jackknife resampling, Maurice Quenouille (foreshadowed in '46 by Mahalanobis and extended in '58 by Tukey), precursor concept.
* 1950, Willi Hennig's classic formalization. Hennig is considered the founder of phylogenetic systematics, and published his first works, in German, that year. He also asserted a version of the parsimony principle, stating that the presence of apomorphous characters in different species 'is always reason for suspecting kinship, and that their origin by convergence should not be presumed a priori'. This has been considered a foundational view of phylogenetic inference.
* 1952, William Wagner's ground plan divergence method.
* 1957, Julian Huxley adopted Rensch's terminology as "cladogenesis" with a full definition: "''Cladogenesis'' I have taken over directly from Rensch, to denote all splitting, from subspeciation through adaptive radiation to the divergence of phyla and kingdoms." With it he introduced the word "clades", defining it as: "Cladogenesis results in the formation of delimitable monophyletic units, which may be called clades."
* 1960, Arthur Cain and Geoffrey Ainsworth Harrison coined "cladistic" to mean evolutionary relationship.
* 1963, first attempt to use ML (maximum likelihood) for phylogenetics, Edwards and Cavalli-Sforza.
* 1965
** Camin-Sokal parsimony, first parsimony (optimization) criterion and first computer program/algorithm for cladistic analysis, both by Camin and Sokal.
** Character compatibility method, also called clique analysis, introduced independently by Camin and Sokal (loc. cit.) and E. O. Wilson.
* 1966
** English translation of Hennig.
** "Cladistics" and "cladogram" coined (Webster's, loc. cit.).
* 1969
** Dynamic and successive weighting, James Farris.
** Wagner parsimony, Kluge and Farris.
** CI (consistency index), Kluge and Farris.
** Introduction of pairwise compatibility for clique analysis, Le Quesne.
* 1970, Wagner parsimony generalized by Farris.
* 1971
** First successful application of ML (maximum likelihood) to phylogenetics (for protein sequences), Neyman.
** Fitch parsimony, Walter M. Fitch. These gave rise to the most basic ideas of maximum parsimony. Fitch is known for his work on reconstructing phylogenetic trees from protein and DNA sequences. His definition of orthologous sequences has been referenced in many research publications.
** NNI (nearest neighbour interchange), first branch-swapping search strategy, developed independently by Robinson and Moore et al.
** ME (minimum evolution), Kidd and Sgaramella-Zonta (it is unclear if this is the pairwise distance method or related to ML, as Edwards and Cavalli-Sforza call ML "minimum evolution").
* 1972, Adams consensus, Adams.
* 1976, prefix system for ranks, Farris.
* 1977, Dollo parsimony, Farris.
* 1979
** Nelson consensus, Nelson.
** MAST (maximum agreement subtree), also called GAS (greatest agreement subtree), a consensus method, Gordon.
** Bootstrap, Bradley Efron, precursor concept.
* 1980, PHYLIP, first software package for phylogenetic analysis, Joseph Felsenstein. A free computational phylogenetics package of programs for inferring evolutionary trees (phylogenies). One of its programs, "drawgram", draws rooted trees. The figure below shows the evolution of phylogenetic trees over time.
* 1981
** Majority consensus, Margush and McMorris.
** Strict consensus, Sokal and Rohlf.
** First computationally efficient ML (maximum likelihood) algorithm, Felsenstein. Felsenstein created the Felsenstein Maximum Likelihood method, used for the inference of phylogeny, which evaluates a hypothesis about evolutionary history in terms of the probability that the proposed model and the hypothesized history would give rise to the observed data set.
* 1982
** PHYSIS, Mickevich and Farris.
** Branch and bound, Hendy and Penny.
* 1985
** First cladistic analysis of eukaryotes based on combined phenotypic and genotypic evidence, Diana Lipscomb.
** First issue of ''Cladistics''.
** First phylogenetic application of bootstrap, Felsenstein.
** First phylogenetic application of jackknife, Scott Lanyon.
* 1986, MacClade, Maddison and Maddison.
* 1987, neighbor-joining method, Saitou and Nei.
* 1988
** Hennig86 (version 1.5), Farris.
** Bremer support (decay index), Bremer.
* 1989
** RI (retention index), RCI (rescaled consistency index), Farris.
** HER (homoplasy excess ratio), Archie.
* 1990
** Combinable components (semi-strict) consensus, Bremer.
** SPR (subtree pruning and regrafting), TBR (tree bisection and reconnection), Swofford and Olsen.
* 1991
** DDI (data decisiveness index), Goloboff.
** First cladistic analysis of eukaryotes based only on phenotypic evidence, Lipscomb.
* 1993, implied weighting, Goloboff.
* 1994, reduced consensus: RCC (reduced cladistic consensus) for rooted trees, Wilkinson.
* 1995, reduced consensus: RPC (reduced partition consensus) for unrooted trees, Wilkinson.
* 1996, first working methods for BI (Bayesian inference), independently developed by Li, Mau, and Rannala and Yang, all using MCMC (Markov chain Monte Carlo).
* 1998, TNT (Tree Analysis Using New Technology), Goloboff, Farris, and Nixon.
* 1999, Winclada, Nixon.
* 2003, symmetrical resampling, Goloboff.
* 2004, 2005, similarity metric (using an approximation to Kolmogorov complexity) or NCD (normalized compression distance), Li et al., Cilibrasi and Vitanyi.


Uses of phylogenetic analysis


Pharmacology

One use of phylogenetic analysis involves the pharmacological examination of closely related groups of organisms. Advances in cladistic analysis through faster computer programs and improved molecular techniques have increased the precision of phylogenetic determination, allowing for the identification of species with pharmacological potential. Historically, phylogenetic screens for pharmacological purposes were used in a basic manner, such as studying the Apocynaceae family of plants, which includes alkaloid-producing species like ''Catharanthus'', known for producing vincristine, an antileukemia drug. Modern techniques now enable researchers to study close relatives of a species to uncover either a higher abundance of important bioactive compounds (e.g., species of ''Taxus'' for taxol) or natural variants of known pharmaceuticals (e.g., species of ''Catharanthus'' for different forms of vincristine or vinblastine).


Biodiversity

Phylogenetic analysis has also been applied to biodiversity studies of fungi. Phylogenetic analysis helps in understanding the evolutionary history of various groups of organisms, identifying relationships between different species, and predicting future evolutionary changes. Emerging imaging systems and new analysis techniques allow for the discovery of more genetic relationships in biodiverse fields, which can aid conservation efforts by identifying rare species that could benefit ecosystems globally.


Infectious disease epidemiology

Whole-genome sequence data from outbreaks or epidemics of infectious diseases can provide important insights into transmission dynamics and inform public health strategies. Traditionally, studies have combined genomic and epidemiological data to reconstruct transmission events. More recent research has explored deducing transmission patterns from genomic data alone using phylodynamics, which analyzes the properties of pathogen phylogenies. Phylodynamics uses theoretical models to compare predicted branch lengths with the actual branch lengths in phylogenies in order to infer transmission patterns. In addition, coalescent theory, which describes probability distributions on trees based on population size, has been adapted for epidemiological purposes. Another source of information within phylogenies that has been explored is "tree shape." These approaches, while computationally intensive, have the potential to provide valuable insights into pathogen transmission dynamics.

The structure of the host contact network significantly affects the dynamics of outbreaks, and management strategies rely on understanding these transmission patterns. Pathogen genomes spreading through different contact network structures, such as chains, homogeneous networks, or networks with super-spreaders, accumulate mutations in distinct patterns, resulting in noticeable differences in the shape of phylogenetic trees, as illustrated in Fig. 1. Researchers have analyzed the structural characteristics of phylogenetic trees generated from simulated bacterial genome evolution across multiple types of contact networks. By examining simple topological properties of these trees, they can classify them as reflecting chain-like, homogeneous, or super-spreading dynamics, revealing transmission patterns. These properties form the basis of a computational classifier used to analyze real-world outbreaks. Computational predictions of transmission dynamics for each outbreak often align with known epidemiological data.

Different transmission networks result in quantitatively different tree shapes. To determine whether tree shapes capture information about underlying disease transmission patterns, researchers simulated the evolution of a bacterial genome over three types of outbreak contact networks: homogeneous, super-spreading, and chain-like. They summarized the resulting phylogenies with five metrics describing tree shape. Figures 2 and 3 illustrate the distributions of these metrics across the three types of outbreaks, revealing clear differences in tree topology depending on the underlying host contact network. Super-spreader networks give rise to phylogenies with higher Colless imbalance, longer ladder patterns, lower Δw, and deeper trees than those from homogeneous contact networks. Trees from chain-like networks are less variable, deeper, more imbalanced, and narrower than those from other networks.

Scatter plots can be used to visualize the relationship between two variables in pathogen transmission analysis, such as the number of infected individuals and the time since infection. These plots can help identify trends and patterns, such as whether the spread of the pathogen is increasing or decreasing over time, and can highlight potential transmission routes or super-spreader events.
Box plots, which display the range, median, quartiles, and potential outliers of a dataset, can also be valuable for analyzing pathogen transmission data, helping to identify important features of the data distribution. They may be used to quickly identify differences or similarities in the transmission data.
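Two of the tree-shape metrics named above, Colless imbalance and tree depth, can be illustrated with a minimal sketch. It assumes rooted, strictly binary trees written as nested Python tuples with string leaves; that representation, the function names, and the two toy trees are hypothetical choices made for illustration and are not taken from the studies described here. Colless imbalance is computed as the sum, over all internal nodes, of the absolute difference between the numbers of leaves on the two sides of the node.

```python
# Toy representation (an assumption for this sketch): a leaf is a string,
# an internal node is a 2-tuple (left_subtree, right_subtree).

def leaf_count(tree):
    """Number of leaves below (and including) this node."""
    if isinstance(tree, str):
        return 1
    left, right = tree
    return leaf_count(left) + leaf_count(right)


def colless(tree):
    """Sum over internal nodes of |leaves on left - leaves on right|."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    here = abs(leaf_count(left) - leaf_count(right))
    return here + colless(left) + colless(right)


def depth(tree):
    """Largest number of edges between the root and any leaf."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + max(depth(left), depth(right))


# Hypothetical four-leaf trees: one fully balanced, one ladder-like.
balanced = (("A", "B"), ("C", "D"))
ladder = ((("A", "B"), "C"), "D")

for name, tree in [("balanced", balanced), ("ladder", ladder)]:
    print(name, "Colless:", colless(tree), "depth:", depth(tree))
```

On these toy inputs the ladder-like tree scores higher on both metrics than the balanced one, which is the direction of difference described above for chain-like and super-spreading networks. The sketch recomputes leaf counts at every node, which is fine for small trees but would be worth caching for large phylogenies.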


Disciplines other than biology

Phylogenetic tools and representations (trees and networks) can also be applied to philology, the study of the evolution of oral languages and written texts and manuscripts, such as in the field of quantitative comparative linguistics.

Computational phylogenetics can be used to investigate a language as an evolutionary system. The evolution of human language closely corresponds with human biological evolution, which allows phylogenetic methods to be applied. The concept of a "tree" serves as an efficient way to represent relationships between languages and language splits. It also serves as a way of testing hypotheses about the connections and ages of language families. For example, relationships among languages can be shown by using cognates as characters. The phylogenetic tree of Indo-European languages shows the relationships between several of the languages in a timeline, as well as the similarity between words and word order.

There are three types of criticism of using phylogenetics in philology. The first argues that languages and species are different entities, so the same methods cannot be used to study both. The second concerns how phylogenetic methods are applied to linguistic data. The third concerns the types of data used to construct the trees.

Bayesian phylogenetic methods, which are sensitive to how treelike the data are, allow for the reconstruction of relationships among languages, locally and globally. The two main reasons for using Bayesian phylogenetics are that (1) diverse scenarios can be included in calculations and (2) the output is a sample of trees rather than a single tree claimed to be true.

The same process can be applied to texts and manuscripts. In paleography, the study of historical writings and manuscripts, texts were replicated by scribes who copied from their source, and alterations (i.e., 'mutations') occurred when the scribe did not copy the source precisely.

Phylogenetics has been applied to archaeological artefacts such as early hominin hand-axes, late Palaeolithic figurines, Neolithic stone arrowheads, Bronze Age ceramics, and historical-period houses. Bayesian methods have also been employed by archaeologists in an attempt to quantify uncertainty in the tree topology and divergence times of stone projectile point shapes in the European Final Palaeolithic and earliest Mesolithic.
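The idea of using cognates as characters, mentioned above for languages, can be sketched by coding each language as a vector of 0/1 values, one per cognate set, and comparing the vectors. The example below is a minimal illustration under stated assumptions: the language labels and the presence/absence data are invented, and the simple distance calculation is a stand-in chosen for brevity, not the Bayesian tree inference described above.

```python
from itertools import combinations

# Hypothetical data: 1 means the language has a word belonging to that
# cognate set, 0 means it does not. Five invented cognate sets.
cognates = {
    "lang_a": [1, 1, 0, 1, 0],
    "lang_b": [1, 1, 0, 0, 0],
    "lang_c": [0, 0, 1, 1, 1],
}


def hamming_fraction(x, y):
    """Fraction of characters at which two equal-length vectors differ."""
    if len(x) != len(y):
        raise ValueError("character vectors must have the same length")
    return sum(a != b for a, b in zip(x, y)) / len(x)


# Pairwise distances between all languages, keyed by sorted name pair.
distances = {
    (p, q): hamming_fraction(cognates[p], cognates[q])
    for p, q in combinations(sorted(cognates), 2)
}

# The pair sharing the most cognate states is the closest pair.
closest = min(distances, key=distances.get)

for pair, d in distances.items():
    print(pair, d)
print("closest pair:", closest)
```

A distance matrix of this kind could feed a clustering or tree-building step; a character-based or Bayesian analysis would instead work on the 0/1 matrix directly with an explicit model of cognate gain and loss.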


See also

* Angiosperm Phylogeny Group
* Bauplan
* Bioinformatics
* Biomathematics
* Coalescent theory
* Cytonuclear discordance
* EDGE of Existence programme
* Evolutionary taxonomy
* Language family
* Maximum parsimony
* Microbial phylogenetics
* Molecular evolution
* Molecular phylogeny
* Ontogeny
* PhyloCode
* Phylodynamics
* Phylogenesis
* Phylogenetic comparative methods
* Phylogenetic network
* Phylogenetic nomenclature
* Phylogenetic tree viewers
* Phylogenetics software
* Phylogenomics
* Phylogeny (psychoanalysis)
* Phylogeography
* Systematics

