Distance Matrices In Phylogeny
   HOME
*





Distance Matrices In Phylogeny
Distance matrices are used in phylogeny as non-parametric distance methods and were originally applied to phenetic data using a matrix of pairwise distances. These distances are then reconciled to produce a tree (a phylogram, with informative branch lengths). The distance matrix can come from a number of different sources, including measured distance (for example from immunological studies) or morphometric analysis, various pairwise distance formulae (such as euclidean distance) applied to discrete morphological characters, or genetic distance from sequence, restriction fragment, or allozyme data. For phylogenetic character data, raw distance values can be calculated by simply counting the number of pairwise differences in character states (Hamming distance). Distance-matrix methods Distance-matrix methods of phylogenetic analysis explicitly rely on a measure of "genetic distance" between the sequences being classified, and therefore they require an MSA (multiple sequence alig ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Non-parametric
Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). Nonparametric statistics is based on either being distribution-free or having a specified distribution but with the distribution's parameters unspecified. Nonparametric statistics includes both descriptive statistics and statistical inference. Nonparametric tests are often used when the assumptions of parametric tests are violated. Definitions The term "nonparametric statistics" has been imprecisely defined in the following two ways, among others: Applications and purpose Non-parametric methods are widely used for studying populations that take on a ranked order (such as movie reviews receiving one to four stars). The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of meas ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

WPGMA
WPGMA (Weighted Pair Group Method with Arithmetic Mean) is a simple agglomerative (bottom-up) hierarchical clustering method, generally attributed to Sokal and Michener. The WPGMA method is similar to its ''unweighted'' variant, the UPGMA method. Algorithm The WPGMA algorithm constructs a rooted tree (dendrogram) that reflects the structure present in a pairwise distance matrix (or a similarity matrix). At each step, the nearest two clusters, say i and j, are combined into a higher-level cluster i \cup j. Then, its distance to another cluster k is simply the arithmetic mean of the average distances between members of k and i and k and j : d_ = \frac The WPGMA algorithm produces rooted dendrograms and requires a constant-rate assumption: it produces an ultrametric tree in which the distances from the root to every branch tip are equal. This ultrametricity assumption is called the molecular clock when the tips involve DNA, RNA and protein data. Working example This work ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Conservation (genetics)
In evolutionary biology, conserved sequences are identical or similar Sequence (biology), sequences in nucleic acids (DNA sequence, DNA and RNA) or peptide sequence, proteins across species (homology (biology)#Orthology, orthologous sequences), or within a genome (homology (biology)#Paralogy, paralogous sequences), or between donor and receptor taxa (Sequence homology#Xenology, xenologous sequences). Conservation indicates that a sequence has been maintained by natural selection. A highly conserved sequence is one that has remained relatively unchanged far back up the phylogenetic tree, and hence far back in geological time. Examples of highly conserved sequences include the Ribosomal RNA, RNA components of ribosomes present in all domain (biology), domains of life, the homeobox sequences widespread amongst Eukaryotes, and the tmRNA in Bacteria. The study of sequence conservation overlaps with the fields of genomics, proteomics, evolutionary biology, phylogenetics, bioinformatics ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Signal Noise
In electronics, noise is an unwanted disturbance in an electrical signal. Noise generated by electronic devices varies greatly as it is produced by several different effects. In particular, noise is inherent in physics, and central to thermodynamics. Any conductor with electrical resistance will generate thermal noise inherently. The final elimination of thermal noise in electronics can only be achieved cryogenically, and even then quantum noise would remain inherent. Electronic noise is a common component of noise in signal processing. In communication systems, noise is an error or undesired random disturbance of a useful information signal in a communication channel. The noise is a summation of unwanted or disturbing energy from natural and sometimes man-made sources. Noise is, however, typically distinguished from interference, for example in the signal-to-noise ratio (SNR), signal-to-interference ratio (SIR) and signal-to-noise plus interference ratio (SNIR) measu ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Experimental Control
A scientific control is an experiment or observation designed to minimize the effects of variables other than the independent variable (i.e. confounding variables). This increases the reliability of the results, often through a comparison between control measurements and the other measurements. Scientific controls are a part of the scientific method. Controlled experiments Controls eliminate alternate explanations of experimental results, especially experimental errors and experimenter bias. Many controls are specific to the type of experiment being performed, as in the molecular markers used in SDS-PAGE experiments, and may simply have the purpose of ensuring that the equipment is working properly. The selection and use of proper controls to ensure that experimental results are valid (for example, absence of confounding variables) can be very difficult. Control measurements may also be used for other purposes: for example, a measurement of a microphone's background noise in ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Outgroup (cladistics)
In cladistics or phylogenetics, an outgroup is a more distantly related group of organisms that serves as a reference group when determining the evolutionary relationships of the ingroup, the set of organisms under study, and is distinct from sociological outgroups. The outgroup is used as a point of comparison for the ingroup and specifically allows for the phylogeny to be rooted. Because the polarity (direction) of character change can be determined only on a rooted phylogeny, the choice of outgroup is essential for understanding the evolution of traits along a phylogeny. History Although the concept of outgroups has been in use from the earliest days of cladistics, the term "outgroup" is thought to have been coined in the early 1970s at the American Museum of Natural History. Prior to the advent of the term, various other terms were used by evolutionary biologists, including "exgroup", "related group", and "outside groups". Choice of outgroup The chosen outgroup is hypothes ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Heuristic
A heuristic (; ), or heuristic technique, is any approach to problem solving or self-discovery that employs a practical method that is not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching an immediate, short-term goal or approximation. Where finding an optimal solution is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution. Heuristics can be mental shortcuts that ease the cognitive load of making a decision. Examples that employ heuristics include using trial and error, a rule of thumb or an educated guess. Heuristics are the strategies derived from previous experiences with similar problems. These strategies depend on using readily accessible, though loosely applicable, information to control problem solving in human beings, machines and abstract issues. When an individual applies a heuristic in practice, it generally performs as expected. However it can alternatively cre ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

NP-complete
In computational complexity theory, a problem is NP-complete when: # it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions. # the problem can be used to simulate every other problem for which we can verify quickly that a solution is correct. In this sense, NP-complete problems are the hardest of the problems to which solutions can be verified quickly. If we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified. The name "NP-complete" is short for "nondeterministic polynomial-time complete". In this name, "nondeterministic" refers to nondeterministic Turing machines, a way of mathematically formalizing the idea of a brute-force search algorithm. Polynomial time refers to an amount of time that is considered "quick" for a de ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Jukes–Cantor Model
In biology, a substitution model, also called models of DNA sequence evolution, are Markov models that describe changes over evolutionary time. These models describe evolutionary changes in macromolecules (e.g., DNA sequences) represented as sequence of symbols (A, C, G, and T in the case of DNA). Substitution models are used to calculate the likelihood of phylogenetic trees using multiple sequence alignment data. Thus, substitution models are central to maximum likelihood estimation of phylogeny as well as Bayesian inference in phylogeny. Estimates of evolutionary distances (numbers of substitutions that have occurred since a pair of sequences diverged from a common ancestor) are typically calculated using substitution models (evolutionary distances are used input for distance methods such as neighbor joining). Substitution models are also central to phylogenetic invariants because they are necessary to predict site pattern frequencies given a tree topology. Substitution models a ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Substitution Matrix
In bioinformatics and evolutionary biology, a substitution matrix describes the frequency at which a character in a nucleotide sequence or a protein sequence changes to other character states over evolutionary time. The information is often in the form of log odds of finding two specific character states aligned and depends on the assumed number of evolutionary changes or sequence dissimilarity between compared sequences. It is an application of a stochastic matrix. Substitution matrices are usually seen in the context of amino acid or DNA sequence alignments, where they are used to calculate similarity scores between the aligned sequences. Background In the process of evolution, from one generation to the next the amino acid sequences of an organism's proteins are gradually altered through the action of DNA mutations. For example, the sequence ALEIRYLRD could mutate into the sequence ALEINYLRD in one step, and possibly AQEINYQRD over a longer period of evolutionary t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Mutation
In biology, a mutation is an alteration in the nucleic acid sequence of the genome of an organism, virus, or extrachromosomal DNA. Viral genomes contain either DNA or RNA. Mutations result from errors during DNA or viral replication, mitosis, or meiosis or other types of damage to DNA (such as pyrimidine dimers caused by exposure to ultraviolet radiation), which then may undergo error-prone repair (especially microhomology-mediated end joining), cause an error during other forms of repair, or cause an error during replication (translesion synthesis). Mutations may also result from insertion or deletion of segments of DNA due to mobile genetic elements. Mutations may or may not produce detectable changes in the observable characteristics (phenotype) of an organism. Mutations play a part in both normal and abnormal biological processes including: evolution, cancer, and the development of the immune system, including junctional diversity. Mutation is the ultimate source o ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Expected Value
In probability theory, the expected value (also called expectation, expectancy, mathematical expectation, mean, average, or first moment) is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of a large number of independently selected outcomes of a random variable. The expected value of a random variable with a finite number of outcomes is a weighted average of all possible outcomes. In the case of a continuum of possible outcomes, the expectation is defined by integration. In the axiomatic foundation for probability provided by measure theory, the expectation is given by Lebesgue integration. The expected value of a random variable is often denoted by , , or , with also often stylized as or \mathbb. History The idea of the expected value originated in the middle of the 17th century from the study of the so-called problem of points, which seeks to divide the stakes ''in a fair way'' between two players, who have to end th ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]