In molecular biology (specifically protein biosynthesis), a stop codon (or termination codon) is a codon (nucleotide triplet within messenger RNA) that signals the termination of the translation process of the current protein. Most codons in messenger RNA correspond to the addition of an amino acid to a growing polypeptide chain, which may ultimately become a protein; stop codons signal the termination of this process by binding release factors, which cause the ribosomal subunits to dissociate, releasing the amino acid chain.
While start codons need nearby sequences or initiation factors to start translation, a stop codon alone is sufficient to initiate termination.
Properties
Standard codons
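In the standard genetic code, there are three different termination codons: UAG (''amber''), UAA (''ochre''), and UGA (''opal'' or ''umber''). In DNA these correspond to TAG, TAA, and TGA.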
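As a minimal sketch of how a reading frame ends at a termination codon, the following Python function scans an mRNA string codon by codon and reports the first in-frame stop. The function name `first_stop` and the `start` parameter are illustrative assumptions, not part of any standard library; the codon names are the historical ones used in this article.

```python
# Stop codons of the standard genetic code (mRNA alphabet),
# mapped to their historical names.
STOP_CODONS = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def first_stop(mrna, start=0):
    """Return (index, codon) of the first in-frame stop codon, or None.

    `start` is the index of the first base of the reading frame
    (hypothetical parameter, chosen for this illustration).
    """
    # Step through the sequence three bases at a time, ignoring any
    # incomplete codon at the very end.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None


if __name__ == "__main__":
    hit = first_stop("AUGGCUUAAGGG")
    if hit is not None:
        index, codon = hit
        print(index, codon, STOP_CODONS[codon])
```

Only triplets that fall on the chosen reading frame are considered here; stop triplets in the other two frames are ignored.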
Alternative stop codons
There are variations on the standard genetic code, and alternative stop codons have been found in the mitochondrial genomes of vertebrates, ''Scenedesmus obliquus'', and ''Thraustochytrium''.
Reassigned stop codons
The nuclear genetic code is flexible as illustrated by variant genetic codes that reassign standard stop codons to amino acids.
Translation
In 1986, convincing evidence was provided that selenocysteine (Sec) was incorporated co-translationally. Moreover, the codon partially directing its incorporation in the polypeptide chain was identified as UGA, also known as the opal termination codon. Different mechanisms for overriding the termination function of this codon have been identified in prokaryotes and in eukaryotes. A particular difference between these kingdoms is that cis elements seem restricted to the neighborhood of the UGA codon in prokaryotes, while in eukaryotes this restriction is not present. Instead, such locations seem disfavored, albeit not prohibited.
In 2003, a landmark paper described the identification of all known selenoproteins in humans: 25 in total. Similar analyses have been run for other organisms.
The UAG codon can translate into pyrrolysine (Pyl) in a similar manner.
Genomic distribution
Distribution of stop codons within the genome of an organism is non-random and can correlate with GC-content.
For example, the ''E. coli'' K-12 genome contains 2705 TAA (63%), 1257 TGA (29%), and 326 TAG (8%) stop codons (GC content 50.8%). The abundance of release factor 1 and release factor 2, which recognize the stop codons, is also strongly correlated with the abundance of the stop codons.
A large-scale study of bacteria with a broad range of GC-contents shows that the frequency of occurrence of TAA is negatively correlated with GC-content and the frequency of occurrence of TGA is positively correlated with GC-content, whereas the frequency of occurrence of the TAG stop codon, which is often the least used stop codon in a genome, is not influenced by GC-content.
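The usage figures quoted above can be reproduced with a short tally. The sketch below, written under the assumption that each coding sequence is supplied as a DNA string ending in its stop codon, counts terminal stop codons, converts counts to percentages, and computes GC-content. The function names are hypothetical; only the ''E. coli'' K-12 counts come from the text.

```python
from collections import Counter

# Stop codons written in the DNA alphabet, as in the genome counts above.
DNA_STOPS = ("TAA", "TGA", "TAG")


def stop_codon_usage(coding_sequences):
    """Count the terminal stop codon of each coding sequence.

    Sequences that do not end in a recognised stop codon are skipped.
    """
    counts = Counter()
    for cds in coding_sequences:
        last = cds[-3:].upper()
        if last in DNA_STOPS:
            counts[last] += 1
    return counts


def as_percentages(counts):
    """Convert raw counts to percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 100 * n / total for codon, n in counts.items()}


def gc_content(seq):
    """Percentage of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    # Counts for E. coli K-12 as quoted in the text.
    k12 = {"TAA": 2705, "TGA": 1257, "TAG": 326}
    for codon, pct in as_percentages(k12).items():
        print(codon, round(pct))
```

Correlating these percentages with `gc_content` across many genomes would be the next step in an analysis of the kind described; that step is left out here because it depends on the data set used.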
Recognition
Recognition of stop codons in bacteria has been associated with the so-called 'tripeptide anticodon', a highly conserved amino acid motif in RF1 (PxT) and RF2 (SPF). Although this is supported by structural studies, the tripeptide anticodon hypothesis has been shown to be an oversimplification.
Nomenclature
Stop codons were historically given many different names, as they each corresponded to a distinct class of mutants that all behaved in a similar manner. These mutants were first isolated within bacteriophages (T4 and lambda), viruses that infect the bacteria ''Escherichia coli''. Mutations in viral genes weakened their infectious ability, sometimes creating viruses that were able to infect and grow within only certain varieties of ''E. coli''.
''amber'' mutations (UAG)
They were the first set of nonsense mutations to be discovered, isolated by Richard H. Epstein and Charles Steinberg and named after their friend and graduate Caltech student Harris Bernstein, whose last name means "amber" in German (''cf.'' Bernstein).
Viruses with amber mutations are characterized by their ability to infect only certain strains of bacteria, known as amber suppressors. These bacteria carry their own mutation that allows a recovery of function in the mutant viruses. For example, a mutation in the tRNA that recognizes the amber stop codon allows translation to "read through" the codon and produce a full-length protein, thereby recovering the normal form of the protein and "suppressing" the amber mutation.
Thus, amber mutants are an entire class of virus mutants that can grow in bacteria that contain amber suppressor mutations. Similar suppressors are known for ochre and opal stop codons as well.
tRNA molecules carrying unnatural amino acids have been designed to recognize the amber stop codon in bacterial RNA. This technology allows for incorporation of orthogonal amino acids (such as p-azidophenylalanine) at specific locations of the target protein.
''ochre'' mutations (UAA)
It was the second stop codon mutation to be discovered. Reminiscent of the usual yellow-orange-brown color associated with amber, this second stop codon was given the name of "ochre", an orange-reddish-brown mineral pigment.
Ochre mutant viruses had a property similar to amber mutants in that they recovered infectious ability within certain suppressor strains of bacteria. The set of ochre suppressors was distinct from amber suppressors, so ochre mutants were inferred to correspond to a different nucleotide triplet. Through a series of mutation experiments comparing these mutants with each other and other known amino acid codons, Sydney Brenner concluded that the amber and ochre mutations corresponded to the nucleotide triplets "UAG" and "UAA".
''opal'' or ''umber'' mutations (UGA)
The third and last stop codon in the standard genetic code was discovered soon after, and corresponds to the nucleotide triplet "UGA".
To continue matching with the theme of colored minerals, the third nonsense codon came to be known as "opal", which is a type of silica showing a variety of colors.
Nonsense mutations that created this premature stop codon were later called opal mutations or umber mutations.
Mutations and disease
Nonsense
Nonsense mutations are changes in DNA sequence that introduce a premature stop codon, causing any resulting protein to be abnormally shortened. This often causes a loss of function in the protein, as critical parts of the amino acid chain are no longer assembled. Because of this terminology, stop codons have also been referred to as nonsense codons.
Nonstop
A nonstop mutation, also called a stop-loss variant, is a point mutation that occurs within a stop codon. Nonstop mutations cause the continued translation of an mRNA strand into what should be an untranslated region. Most polypeptides resulting from a gene with a nonstop mutation lose their function because of their extreme length and the impact on normal folding. Nonstop mutations differ from nonsense mutations in that they do not create a stop codon but, instead, delete one. Nonstop mutations also differ from missense mutations, which are point mutations where a single nucleotide is changed to cause replacement by a different amino acid. Nonstop mutations have been linked with many inherited diseases including endocrine disorders, eye disease, and neurodevelopmental disorders.
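The distinction between nonsense and nonstop mutations can be stated as a small decision rule on the reference and altered codon. The Python sketch below is an illustration under stated assumptions: codons are given in the mRNA alphabet, the change is a single-base substitution, and the function name `classify_stop_change` is hypothetical. Separating missense from synonymous changes would require a full codon table, so both fall under "other" here.

```python
# Stop codons of the standard genetic code (mRNA alphabet).
STOPS = {"UAA", "UAG", "UGA"}


def classify_stop_change(ref_codon, alt_codon):
    """Classify a single-base substitution by its effect on stop codons.

    Returns "nonsense" if a sense codon becomes a stop codon,
    "nonstop" if a stop codon becomes a sense codon, and "other"
    for everything else (missense, synonymous, or stop-to-stop).
    """
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if differences != 1:
        raise ValueError("expected exactly one substituted base")

    ref_is_stop = ref_codon in STOPS
    alt_is_stop = alt_codon in STOPS
    if not ref_is_stop and alt_is_stop:
        return "nonsense"   # premature stop codon created
    if ref_is_stop and not alt_is_stop:
        return "nonstop"    # stop codon lost, translation continues
    return "other"


if __name__ == "__main__":
    print(classify_stop_change("UAC", "UAA"))  # sense codon to stop
    print(classify_stop_change("UAA", "CAA"))  # stop to sense codon
```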
Hidden stops
Hidden stops are non-stop codons that would be read as stop codons if they were frameshifted +1 or −1. These prematurely terminate translation if the corresponding frame-shift (such as due to a ribosomal RNA slip) occurs before the hidden stop. It is hypothesised that this decreases resource wastage on nonfunctional proteins and the production of potential cytotoxins. Researchers at Louisiana State University propose the ''ambush hypothesis'', that hidden stops are selected for. Codons that can form hidden stops are used in genomes more frequently compared to synonymous codons that would otherwise code for the same amino acid. Unstable rRNA in an organism correlates with a higher frequency of hidden stops.
However, this hypothesis could not be validated with a larger data set.
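To make the definition of a hidden stop concrete, the sketch below lists every stop triplet that lies off the reading frame of an mRNA string, grouped by the frameshift (+1 or −1) that would expose it. This is an illustrative reading of the definition above, not the method used in the cited studies; the function name and the `frame_start` parameter are assumptions.

```python
# Stop codons of the standard genetic code (mRNA alphabet).
STOPS = {"UAA", "UAG", "UGA"}


def hidden_stops(mrna, frame_start=0):
    """Return positions of off-frame stop triplets, keyed by frameshift.

    A triplet starting one base after an in-frame codon boundary is
    exposed by a +1 shift; one starting two bases after (equivalently,
    one base before the next boundary) is exposed by a -1 shift.
    """
    found = {+1: [], -1: []}
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] not in STOPS:
            continue
        offset = (i - frame_start) % 3
        if offset == 1:
            found[+1].append(i)
        elif offset == 2:
            found[-1].append(i)
        # offset == 0 is an ordinary in-frame stop codon, not a hidden one.
    return found


if __name__ == "__main__":
    # In-frame codons are AUG, CUG, AGC; "UGA" at index 4 is hidden.
    print(hidden_stops("AUGCUGAGC"))
```

Comparing such counts between synonymous codon choices is the kind of measurement the ambush hypothesis rests on, although the statistics involved are beyond this sketch.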
Stop codons and hidden stops together are collectively referred to as stop-signals. Researchers at the University of Memphis found that the ratios of the stop-signals on the three reading frames of a genome (referred to as the translation stop-signals ratio, or TSSR) of genetically related bacteria, despite their great differences in gene content, are much alike. This nearly identical genomic-TSSR value of genetically related bacteria may suggest that bacterial genome expansion is limited by the stop-signals bias unique to that bacterial species.
Translational readthrough
Stop codon suppression or translational readthrough occurs when, in translation, a stop codon is interpreted as a sense codon, that is, when a (standard) amino acid is 'encoded' by the stop codon. Mutated tRNAs can be the cause of readthrough, as can certain nucleotide motifs close to the stop codon. Translational readthrough is very common in viruses and bacteria, and has also been found as a gene regulatory principle in humans, yeasts, bacteria and ''Drosophila''.
This kind of endogenous translational readthrough constitutes a variation of the genetic code, because a stop codon codes for an amino acid. In the case of human malate dehydrogenase, the stop codon is read through with a frequency of about 4%.
The amino acid inserted at the stop codon depends on the identity of the stop codon itself: Gln, Tyr, and Lys have been found for the UAA and UAG codons, while Cys, Trp, and Arg for the UGA codon have been identified by mass spectrometry.
The extent of readthrough in mammals is widely variable, and readthrough can broadly diversify the proteome and affect cancer progression.
Use as a watermark
In 2010, when Craig Venter unveiled the first fully functioning, reproducing cell controlled by synthetic DNA, he described how his team used frequent stop codons to create watermarks in RNA and DNA to help confirm that the results were indeed synthetic (and not contaminated or otherwise), using them to encode authors' names and website addresses.
See also
* Genetic code
* Start codon
* Terminator gene