Error Threshold (evolution)

In evolutionary biology and population genetics, the error threshold (or critical mutation rate) is a limit on the number of base pairs a self-replicating molecule may have before mutation destroys the information in subsequent generations of the molecule. The error threshold is crucial to understanding "Eigen's paradox".

The error threshold is a concept in the origins of life (abiogenesis), in particular of very early life, before the advent of DNA. It is postulated that the first self-replicating molecules might have been small ribozyme-like RNA molecules. These molecules consist of strings of base pairs or "digits", and their order is a code that directs how the molecule interacts with its environment. All replication is subject to mutation error. During replication, each digit has a certain probability of being replaced by some other digit, which changes the way the molecule interacts with its environment and may increase or decrease its fitness, or ability to reproduce, in that environment.
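The per-digit copying error described above can be sketched in a few lines of Python. This is only an illustration: it assumes binary digits (0 and 1) instead of four bases, an independent error probability `mu` for every digit, and the helper names `replicate` and `copy_fidelity` are invented here.

```python
import random


def replicate(sequence, mu, rng):
    """Copy a binary sequence, flipping each digit independently with probability mu."""
    return [1 - digit if rng.random() < mu else digit for digit in sequence]


def copy_fidelity(length, mu):
    """Probability that all `length` digits are copied without a single error."""
    return (1.0 - mu) ** length


rng = random.Random(0)           # fixed seed so the run is repeatable
parent = [0] * 100               # hypothetical 100-digit molecule
child = replicate(parent, 0.01, rng)
errors = sum(p != c for p, c in zip(parent, child))
print("copy errors in this replication:", errors)
print("chance of a perfect copy:", copy_fidelity(len(parent), 0.01))
```

Under these assumptions the chance of a perfect copy falls geometrically with length, which is why longer molecules are harder to maintain at a fixed per-digit error rate.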


Fitness landscape

It was noted by Manfred Eigen in his 1971 paper (Eigen 1971) that this mutation process places a limit on the number of digits a molecule may have. If a molecule exceeds this critical size, the effect of the mutations becomes overwhelming and a runaway mutation process will destroy the information in subsequent generations of the molecule. The error threshold is also controlled by the "fitness landscape" for the molecules. The fitness landscape is characterized by the two concepts of height (=fitness) and distance (=number of mutations). Similar molecules are "close" to each other, and molecules that are fitter than others, and so more likely to reproduce, are "higher" in the landscape.

If a particular sequence and its neighbors have a high fitness, they will form a quasispecies and will be able to support longer sequence lengths than a fit sequence with few fit neighbors, or a less fit neighborhood of sequences. Also, it was noted by Wilke (Wilke 2005) that the error threshold concept does not apply in portions of the landscape where there are lethal mutations, in which the induced mutation yields zero fitness and prohibits the molecule from reproducing.


Eigen's paradox

Eigen's paradox is one of the most intractable puzzles in the study of the origins of life. It is thought that the error threshold concept described above limits the size of self-replicating molecules to perhaps a few hundred digits, yet almost all life on Earth requires much longer molecules to encode its genetic information. This problem is handled in living cells by enzymes that repair mutations, allowing the encoding molecules to reach sizes on the order of millions of base pairs. These large molecules must, of course, encode the very enzymes that repair them, and herein lies Eigen's paradox, first put forth by Manfred Eigen in his 1971 paper (Eigen 1971). Simply stated, Eigen's paradox amounts to the following:

* Without error correction enzymes, the maximum size of a replicating molecule is about 100 base pairs.
* For a replicating molecule to encode error correction enzymes, it must be substantially larger than 100 bases.

This is a chicken-or-egg kind of paradox, with an even more difficult solution. Which came first, the large genome or the error correction enzymes? A number of solutions to this paradox have been proposed:

* Stochastic corrector model (Szathmáry & Maynard Smith, 1995). In this proposed solution, a number of primitive molecules of, say, two different types are associated with each other in some way, perhaps by a capsule or "cell wall". If their reproductive success is enhanced by having, say, equal numbers in each cell, and reproduction occurs by division in which the molecules of each type are randomly distributed among the "children", then selection will promote such equal representation in the cells, even though one of the molecules may have a selective advantage over the other.
* Relaxed error threshold (Kun et al., 2005). Studies of actual ribozymes indicate that the mutation rate can be substantially less than first expected, on the order of 0.001 per base pair per replication. This may allow sequence lengths on the order of 7,000 to 8,000 base pairs, sufficient to incorporate rudimentary error correction enzymes.
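How the maximum length depends on the mutation rate can be made explicit with a short derivation. It is only a sketch, and it borrows two relations from the single-peak model in the section "A simple mathematical model": the whole-molecule copy fidelity Q=(1-\mu)^L and the infinite-length threshold Q_c=1/a. The symbol L_max is introduced here for illustration and does not appear in the article.

```latex
\begin{align*}
Q = (1-\mu)^{L} &> Q_c = \frac{1}{a} \\
L \,\ln(1-\mu) &> -\ln a \\
L &< L_{\max} = \frac{\ln a}{-\ln(1-\mu)} \approx \frac{\ln a}{\mu}
  \qquad (\mu \ll 1)
\end{align*}
```

Under these assumptions the tolerable length grows only logarithmically with the selective advantage a but inversely with the per-digit mutation rate \mu, so a tenfold lower \mu permits a roughly tenfold longer molecule. This is the sense in which a lower measured mutation rate relaxes the threshold.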


A simple mathematical model

Consider a 3-digit molecule [A,B,C], where A, B, and C can take on the values 0 and 1. There are eight such sequences ([000], [001], [010], [011], [100], [101], [110], and [111]). Let's say that the [000] molecule is the most fit; upon each replication it produces an average of a copies, where a>1. This molecule is called the "master sequence". The other seven sequences are less fit; they each produce only 1 copy per replication. Each of the three digits is replicated with a mutation rate of \mu. In other words, at every replication of a digit of a sequence, there is a probability \mu that it will be erroneous; 0 will be replaced by 1 or vice versa. Let's ignore double mutations and the death of molecules (the population will grow infinitely), and divide the eight molecules into four classes depending on their Hamming distance from the master sequence:

* Hamming distance 0: [000]
* Hamming distance 1: [001], [010], [100]
* Hamming distance 2: [011], [101], [110]
* Hamming distance 3: [111]

Note that the number of sequences for distance d is just the binomial coefficient \tbinom{L}{d} for L=3, and that each sequence can be visualized as the vertex of an L=3 dimensional cube, with each edge of the cube specifying a single-digit mutation that changes the Hamming distance by ±1. It can be seen that, for example, one third of the mutations of the [001] molecules will produce [000] molecules, while the other two thirds will produce the class 2 molecules [011] and [101]. We can now write the expression for the child populations n'_i of class i in terms of the parent populations n_j:

:n'_i=\sum_{j=0}^{3} w_{ji} n_j

where w_{ji} is the average number of class i children produced per class j parent. The matrix \mathbf{w}, which incorporates natural selection and mutation according to the quasispecies model, is given by:

:\mathbf{w}= \begin{bmatrix} a\cdot Q&3a\cdot\mu&0&0\\ \mu&Q&2\mu&0\\ 0&2\mu&Q&\mu\\ 0&0&3\mu&Q \end{bmatrix}

where Q=(1-\mu)^L is the probability that an entire molecule will be replicated successfully. Because row j of \mathbf{w} lists the children of class j, the equilibrium population numbers for each class are given by the dominant eigenvector of the transpose of \mathbf{w}. For example, if the mutation rate \mu is zero, we will have Q=1, and the equilibrium concentrations will be [n_0,n_1,n_2,n_3]=[1,0,0,0]. The master sequence, being the fittest, will be the only one to survive. If we have a replication fidelity of Q=0.95 and genetic advantage of a=1.05, then the equilibrium concentrations will be roughly [0.33,0.38,0.24,0.06]. It can be seen that the master sequence is not as dominant; nevertheless, sequences with low Hamming distance are in the majority. If we have a replication fidelity of Q approaching 0, then the equilibrium concentrations will be roughly [0.125,0.375,0.375,0.125]. This is a population with a roughly equal number of each of the 8 sequences. (If we had a perfectly equal population of all sequences, we would have populations of [1,3,3,1]/8.)

If we now go to the case where the number of base pairs is large, say L=100, we obtain behavior that resembles a phase transition. The plot below on the left shows a series of equilibrium concentrations divided by the binomial coefficient \tbinom{100}{d}. (This division gives the population of an individual sequence at that distance, and yields a flat line for an equal distribution.) The selective advantage of the master sequence is set at a=1.05. The horizontal axis is the Hamming distance d. The various curves are for various total mutation rates (1-Q). It is seen that for low values of the total mutation rate, the population consists of a quasispecies gathered in the neighborhood of the master sequence. Above a total mutation rate of about 1-Q=0.05, the distribution quickly spreads out to populate all sequences equally. The plot below on the right shows the fractional population of the master sequence as a function of the total mutation rate. Again it is seen that below a critical mutation rate of about 1-Q=0.05, the master sequence contains most of the population, while above this rate, it contains only about 2^{-L}\approx 10^{-30} of the total population.

It can be seen that there is a sharp transition at a value of 1-Q close to 0.05. For mutation rates above this value, the population of the master sequence drops to practically zero. Below this value, it dominates. In the limit as L approaches infinity, the system does in fact have a phase transition at a critical value of Q: Q_c=1/a (for a=1.05, 1-Q_c\approx 0.048). One could think of the overall mutation rate (1-Q) as a sort of "temperature", which "melts" the fidelity of the molecular sequences above the critical "temperature" of 1-Q_c. For faithful replication to occur, the information must be "frozen" into the genome.
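The equilibrium concentrations quoted above can be checked numerically. The following Python sketch is one possible way to do so, not the method used in the original work. It builds the matrix with rows indexed by parent class (so `w[j][i]` is the number of class i children per class j parent, as in the printed matrix), keeps the article's simplification of ignoring double mutations, and finds the equilibrium by power iteration. The function names `build_w` and `equilibrium` and the iteration count are assumptions made for this example.

```python
def build_w(L, a, mu):
    """Return w as a list of rows; w[j][i] is the mean number of class-i
    children per class-j parent (row = parent class)."""
    Q = (1.0 - mu) ** L                  # probability of an error-free copy
    w = [[0.0] * (L + 1) for _ in range(L + 1)]
    for j in range(L + 1):
        fitness = a if j == 0 else 1.0   # only the master sequence gets a
        w[j][j] = fitness * Q
        if j > 0:
            w[j][j - 1] = fitness * j * mu         # one wrong digit flips back
        if j < L:
            w[j][j + 1] = fitness * (L - j) * mu   # one correct digit flips
    return w


def equilibrium(w, iterations=5000):
    """Power iteration on n'_i = sum_j w[j][i] * n_j, renormalised each step."""
    size = len(w)
    n = [1.0 / size] * size              # start from equal class populations
    for _ in range(iterations):
        nxt = [0.0] * size
        for i in range(size):
            # w is tridiagonal, so only parents j = i-1, i, i+1 contribute
            for j in range(max(0, i - 1), min(size, i + 2)):
                nxt[i] += w[j][i] * n[j]
        total = sum(nxt)
        n = [x / total for x in nxt]
    return n


L, a = 3, 1.05
mu = 1.0 - 0.95 ** (1.0 / L)             # per-digit rate that gives Q = 0.95
fractions = equilibrium(build_w(L, a, mu))
print([round(x, 2) for x in fractions])
```

For L=3, a=1.05 and Q=0.95 the printed fractions should come out close to the concentrations quoted in the text. Raising L to 100 and scanning 1-Q is one way to explore the transition, although convergence near the threshold may need many more iterations than the default used here.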


See also

* Error catastrophe
* Extinction vortex
* Genetic entropy
* Genetic erosion
* Muller's ratchet

