The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA or RNA, or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation. It is sometimes called a gene clock or an evolutionary clock.

Early discovery and genetic equidistance

The notion of the existence of a so-called "molecular clock" was first attributed to Émile Zuckerkandl and Linus Pauling who, in 1962, noticed that the number of amino acid differences in hemoglobin between different lineages changes roughly linearly with time, as estimated from fossil evidence. They generalized this observation to assert that the rate of evolutionary change of any specified protein was approximately constant over time and over different lineages (known as the molecular clock hypothesis).

The genetic equidistance phenomenon was first noted in 1963 by Emanuel Margoliash, who wrote: "It appears that the number of residue differences between cytochrome c of any two species is mostly conditioned by the time elapsed since the lines of evolution leading to these two species originally diverged. If this is correct, the cytochrome c of all mammals should be equally different from the cytochrome c of all birds. Since fish diverges from the main stem of vertebrate evolution earlier than either birds or mammals, the cytochrome c of both mammals and birds should be equally different from the cytochrome c of fish. Similarly, all vertebrate cytochrome c should be equally different from the yeast protein." For example, the difference between the cytochrome c of a carp and that of a frog, turtle, chicken, rabbit, and horse is a nearly constant 13% to 14%. Similarly, the difference between the cytochrome c of a bacterium and that of yeast, wheat, moth, tuna, pigeon, and horse ranges from 64% to 69%. Together with the work of Émile Zuckerkandl and Linus Pauling, the genetic equidistance result led directly to the formal postulation of the molecular clock hypothesis in the early 1960s.

Similarly, Vincent Sarich and Allan Wilson in 1967 demonstrated that molecular differences among modern primates in albumin proteins were consistent with approximately constant rates of change in all the lineages they assessed. The basic logic of their analysis involved recognizing that if one species lineage had evolved more quickly than a sister species lineage since their common ancestor, then the molecular differences between an outgroup (more distantly related) species and the faster-evolving species should be larger (since more molecular changes would have accumulated on that lineage) than the molecular differences between the outgroup species and the slower-evolving species. This method is known as the relative rate test. Sarich and Wilson's paper reported, for example, that human (Homo sapiens) and chimpanzee (Pan troglodytes) albumin immunological cross-reactions suggested they were about equally different from Ceboidea (New World monkey) species (within experimental error). This meant that they had both accumulated approximately equal changes in albumin since their shared common ancestor. This pattern was also found for all the primate comparisons they tested. When calibrated with the few well-documented fossil branch points (such as no primate fossils of modern aspect found before the K-T boundary), this led Sarich and Wilson to argue that the human-chimp divergence probably occurred only ~4–6 million years ago.
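The logic of the relative rate test can be sketched in a few lines of Python. This is a minimal illustration and not Sarich and Wilson's actual procedure: the function names, the tolerance, and the distance values are all hypothetical, and the branch-length formula is the standard three-point decomposition of additive pairwise distances.

```python
def branch_lengths(d_ab: float, d_a_out: float, d_b_out: float) -> tuple[float, float]:
    """Split pairwise distances into the amounts of change on lineages A and B.

    d_ab    -- distance between the two ingroup species A and B
    d_a_out -- distance between A and the outgroup
    d_b_out -- distance between B and the outgroup
    Returns (change on A's lineage, change on B's lineage) since their
    common ancestor, assuming distances are additive along the tree.
    """
    change_a = (d_ab + d_a_out - d_b_out) / 2.0
    change_b = (d_ab + d_b_out - d_a_out) / 2.0
    return change_a, change_b


def rates_look_equal(d_a_out: float, d_b_out: float, tolerance: float) -> bool:
    """Relative rate test: A and B pass if they are about equally distant
    from the outgroup (within the stated tolerance)."""
    return abs(d_a_out - d_b_out) <= tolerance


if __name__ == "__main__":
    # Hypothetical distances in arbitrary units, chosen only for illustration.
    d_ab, d_a_out, d_b_out = 0.10, 0.42, 0.40
    print(branch_lengths(d_ab, d_a_out, d_b_out))
    print(rates_look_equal(d_a_out, d_b_out, tolerance=0.05))
```

If the faster-evolving lineage were A, its distance to the outgroup would exceed B's by more than the tolerance and the test would flag unequal rates; the tolerance stands in for the experimental error mentioned above.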

Relationship with neutral theory

The observation of a clock-like rate of molecular change was originally purely phenomenological. Later, Motoo Kimura developed the neutral theory of molecular evolution, which predicted a molecular clock. Let there be N individuals, and to keep this calculation simple, let the individuals be haploid (i.e. have one copy of each gene). Let the rate of neutral mutations (i.e. mutations with no effect on fitness) in a new individual be μ. The probability that any one new neutral mutation will become fixed in the population is then 1/N, since each copy of the gene is as good as any other. Every generation, each individual can have new mutations, so there are μN new neutral mutations in the population as a whole. That means that each generation, μ new neutral mutations will become fixed. If most changes seen during molecular evolution are neutral, then fixations in a population will accumulate at a clock-rate that is equal to the rate of neutral mutations in an individual.
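The argument above can be written as a short derivation. The symbol k for the per-generation fixation (substitution) rate is introduced here for convenience and does not appear in the text; the haploid population size N and the neutral mutation rate μ are as defined there.

```latex
\begin{align}
  \text{new neutral mutations arising per generation} &= N\mu \\
  \Pr(\text{a given new neutral mutation is eventually fixed}) &= \frac{1}{N} \\
  k &= N\mu \times \frac{1}{N} = \mu
\end{align}
```

The population size cancels, which is why, under strict neutrality, the expected substitution rate does not depend on N.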

Calibration

To be used for estimating divergence times, a molecular clock must first be "calibrated", because molecular data alone contain no information on absolute time. For viral phylogenetics and ancient DNA studies (two areas of evolutionary biology where it is possible to sample sequences over an evolutionary timescale), the dates of the intermediate samples can be used to calibrate the molecular clock. However, most phylogenies require that the molecular clock be calibrated using independent evidence about dates, such as the fossil record. There are two general methods for calibrating the molecular clock using fossils: node calibration and tip calibration.
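As a rough illustration of what calibration accomplishes, the sketch below assumes a strict clock with a single constant rate, under which the genetic distance between two lineages grows as twice the rate times the time since they split (both branches accumulate changes). The function names and all numbers are hypothetical, and real analyses rely on the statistical methods described in the following sections.

```python
def calibrate_rate(distance: float, divergence_time_myr: float) -> float:
    """Rate per site per Myr per lineage, from one pair of taxa whose
    divergence time is known independently (for example from a fossil)."""
    return distance / (2.0 * divergence_time_myr)


def estimate_divergence_time(distance: float, rate: float) -> float:
    """Divergence time in Myr for another pair, given a calibrated rate."""
    return distance / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical calibration pair: 0.20 substitutions per site, split dated to 50 Myr.
    rate = calibrate_rate(distance=0.20, divergence_time_myr=50.0)
    # Hypothetical undated pair: 0.05 substitutions per site.
    print(rate, estimate_divergence_time(distance=0.05, rate=rate))
```

With these made-up inputs the calibrated rate is 0.002 substitutions per site per Myr per lineage, and the undated pair comes out at about 12.5 Myr. Without the dated pair, only the ratio of the two distances would be known.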

Node calibration

Sometimes referred to as node dating, node calibration is a method for time-scaling phylogenetic trees by specifying time constraints for one or more nodes in the tree. Early methods of clock calibration only used a single fossil constraint (e.g. non-parametric rate smoothing), but newer methods (BEAST and r8s) allow for the use of multiple fossils to calibrate molecular clocks. The oldest fossil of a clade is used to constrain the minimum possible age for the node representing the most recent common ancestor of the clade. However, due to incomplete fossil preservation and other factors, clades are typically older than their oldest fossils. To account for this, nodes are allowed to be older than the minimum constraint in node calibration analyses. Determining how much older the node is allowed to be is challenging, however. There are a number of strategies for deriving the maximum bound for the age of a clade, including those based on birth-death models, fossil stratigraphic distribution analyses, or taphonomic controls. Alternatively, instead of a maximum and a minimum, a probability density can be used to represent the uncertainty about the age of the clade. These calibration densities can take the shape of standard probability densities (e.g. normal, lognormal, exponential, gamma) that express the probability of the true age of divergence. Determining the shape and parameters of the probability distribution is not trivial, but there are methods that use not only the oldest fossil but a larger sample of the fossil record of clades to estimate calibration densities empirically. Studies have shown that increasing the number of fossil constraints increases the accuracy of divergence time estimation.
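A calibration density of this kind can be sketched as a lognormal distribution shifted so that it starts at the minimum age given by the oldest fossil. The parameter names (`min_age`, `mu`, `sigma`) and the example values are assumptions made for illustration and are not tied to the parameterization of any particular dating program.

```python
import math


def offset_lognormal_pdf(age: float, min_age: float, mu: float, sigma: float) -> float:
    """Density of a node age modeled as min_age + LogNormal(mu, sigma).

    Ages at or below min_age get zero density, so the oldest fossil acts as
    a hard minimum while older ages remain possible.
    """
    excess = age - min_age
    if excess <= 0.0:
        return 0.0
    z = (math.log(excess) - mu) / sigma
    return math.exp(-0.5 * z * z) / (excess * sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Hypothetical calibration: oldest fossil at 66 Myr; with mu = 1.0 the
    # median node age is about 2.7 Myr older than the fossil.
    for age in (60.0, 67.0, 70.0, 80.0):
        print(age, offset_lognormal_pdf(age, min_age=66.0, mu=1.0, sigma=0.5))
```

A larger `sigma` spreads the density further back in time, which corresponds to weaker confidence that the oldest fossil lies close to the true age of the clade.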

Tip calibration

Sometimes referred to as tip dating, tip calibration is a method of molecular clock calibration in which fossils are treated as taxa and placed on the tips of the tree. This is achieved by creating a matrix that includes a molecular dataset for the extant taxa along with a morphological dataset for both the extinct and the extant taxa. Unlike node calibration, this method reconstructs the tree topology and places the fossils simultaneously. Molecular and morphological models work together simultaneously, allowing morphology to inform the placement of fossils. Tip calibration makes use of all relevant fossil taxa during clock calibration, rather than relying on only the oldest fossil of each clade. This method does not rely on the interpretation of negative evidence to infer maximum clade ages.

Expansion calibration

Demographic changes in populations can be detected as fluctuations in historical coalescent effective population size from a sample of extant genetic variation in the population using coalescent theory. Ancient population expansions that are well documented and dated in the geological record can be used to calibrate a rate of molecular evolution in a manner similar to node calibration. However, instead of calibrating from the known age of a node, expansion calibration uses a two-epoch model of constant population size followed by population growth, with the time of transition between epochs being the parameter of interest for calibration. Expansion calibration works at shorter, intraspecific timescales in comparison to node calibration, because expansions can only be detected after the most recent common ancestor of the species in question. Expansion dating has been used to show that molecular clock rates can be inflated at short timescales (< 1 Myr) due to incomplete fixation of alleles, as discussed below.

Total evidence dating

This approach to tip calibration goes a step further by simultaneously estimating fossil placement, topology, and the evolutionary timescale. In this method, the age of a fossil can inform its phylogenetic position in addition to morphology. By allowing all aspects of tree reconstruction to occur simultaneously, the risk of biased results is decreased. This approach has been improved upon by pairing it with different models. One current method of molecular clock calibration is total evidence dating paired with the fossilized birth-death (FBD) model and a model of morphological evolution. The FBD model is novel in that it allows for "sampled ancestors", which are fossil taxa that are the direct ancestor of a living taxon or lineage. This allows fossils to be placed on a branch above an extant organism, rather than being confined to the tips.

Methods

Bayesian methods can provide more appropriate estimates of divergence times, especially if large datasets, such as those yielded by phylogenomics, are employed.

Non-constant rate of molecular clock

Sometimes only a single divergence date can be estimated from fossils, with all other dates inferred from that. Other sets of species have abundant fossils available, allowing the hypothesis of constant divergence rates to be tested. DNA sequences experiencing low levels of negative selection showed divergence rates of 0.7–0.8% per Myr in bacteria, mammals, invertebrates, and plants. In the same study, genomic regions experiencing very high negative or purifying selection (encoding rRNA) were considerably slower (1% per 50 Myr).

In addition to such variation in rate with genomic position, since the early 1990s variation among taxa has proven fertile ground for research too, even over comparatively short periods of evolutionary time (for example mockingbirds). Tube-nosed seabirds have molecular clocks that on average run at half the speed of many other birds, possibly due to long generation times, and many turtles have a molecular clock running at one-eighth the speed it does in small mammals, or even slower. Effects of small population size are also likely to confound molecular clock analyses. Researchers such as Francisco J. Ayala have more fundamentally challenged the molecular clock hypothesis. According to Ayala's 1999 study, five factors combine to limit the application of molecular clock models:

* Changing generation times (if the rate of new mutations depends at least partly on the number of generations rather than the number of years)
* Population size (genetic drift is stronger in small populations, and so more mutations are effectively neutral)
* Species-specific differences (due to differing metabolism, ecology, evolutionary history, ...)
* Change in function of the protein studied (can be avoided in closely related species by utilizing non-coding DNA sequences or emphasizing silent mutations)
* Changes in the intensity of natural selection

Molecular clock users have developed workaround solutions using a number of statistical approaches, including maximum likelihood techniques and later Bayesian modeling. In particular, models that take into account rate variation across lineages have been proposed in order to obtain better estimates of divergence times. These models are called relaxed molecular clocks because they represent an intermediate position between the 'strict' molecular clock hypothesis and Joseph Felsenstein's many-rates model. They are made possible through MCMC techniques that explore a weighted range of tree topologies and simultaneously estimate parameters of the chosen substitution model. It must be remembered that divergence dates inferred using a molecular clock are based on statistical inference and not on direct evidence.

The molecular clock runs into particular challenges at very short and very long timescales. At long timescales, the problem is saturation. When enough time has passed, many sites have undergone more than one change, but it is impossible to detect more than one. This means that the observed number of changes is no longer linear with time, but instead flattens out. Even at intermediate genetic distances, with phylogenetic data still sufficient to estimate topology, signal for the overall scale of the tree can be weak under complex likelihood models, leading to highly uncertain molecular clock estimates. At very short timescales, many differences between samples do not represent fixation of different sequences in the different populations. Instead, they represent alternative alleles that were both present as part of a polymorphism in the common ancestor. The inclusion of differences that have not yet become fixed leads to a potentially dramatic inflation of the apparent rate of the molecular clock at very short timescales.
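The saturation effect at long timescales can be illustrated with the Jukes-Cantor (JC69) substitution model, which is swapped in here purely as a simple example and is not mentioned in the text above. Under that model, the expected proportion of sites that differ between two sequences approaches 3/4 as the true number of substitutions per site grows, so the observed differences flatten out. Function names and the example distances are hypothetical.

```python
import math


def expected_p_distance(true_distance: float) -> float:
    """Expected proportion of differing sites under JC69 for a true distance
    given in substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * true_distance / 3.0))


def jc69_corrected_distance(p_distance: float) -> float:
    """Invert the relationship above to estimate substitutions per site.

    Raises ValueError once the observed proportion reaches the saturation
    ceiling of 0.75, where the correction is undefined.
    """
    if not 0.0 <= p_distance < 0.75:
        raise ValueError("p-distance must be in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p_distance / 3.0)


if __name__ == "__main__":
    # Observed differences grow almost linearly at first, then level off.
    for d in (0.05, 0.5, 1.0, 2.0, 5.0):
        p = expected_p_distance(d)
        print(f"true={d:.2f}  observed={p:.3f}  corrected={jc69_corrected_distance(p):.2f}")
```

Near the ceiling, small sampling errors in the observed proportion translate into very large changes in the corrected distance, which is one way to see why clock estimates become unreliable at long timescales.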

Uses

The molecular clock technique is an important tool in molecular systematics, macroevolution, and phylogenetic comparative methods. Estimation of the dates of phylogenetic events, including those not documented by fossils, such as the divergences between living taxa, has allowed the study of macroevolutionary processes in organisms that had limited fossil records. Phylogenetic comparative methods rely heavily on calibrated phylogenies. In applications over deep time scales, the limitations of the molecular clock hypothesis (above) must be considered; such estimates may be off by 50% or more.

External links

Allan Wilson and the molecular clock

* https://web.archive.org/web/20061107013958/http://www.fossilrecord.net/dateaclade/index.html Date-a-Clade service for the molecular tree of life