Degeneracy or redundancy of codons is the redundancy of the genetic code, exhibited as the multiplicity of three-nucleotide codon combinations that specify an amino acid. The degeneracy of the genetic code is what accounts for the existence of synonymous mutations.
Background
Degeneracy of the genetic code was identified by Lagerkvist. For instance, codons GAA and GAG both specify glutamic acid and thus exhibit redundancy, but neither specifies any other amino acid, so neither is ambiguous.
The codons encoding one amino acid may differ in any of their three positions; most often, however, the difference is in the third position. For instance, the amino acid glutamic acid is specified by GAA and GAG codons (difference in the third position); the amino acid leucine is specified by UUA, UUG, CUU, CUC, CUA, CUG codons (difference in the first or third position); and the amino acid serine is specified by UCA, UCG, UCC, UCU, AGU, AGC codons (difference in the first, second, or third position).
Degeneracy results because there are more codons than encodable amino acids. For example, if there were two bases per codon, then only 16 amino acids could be coded for (4² = 16). Because at least 21 codes are required (20 amino acids plus stop), two bases are not enough; the next codon length, three bases, gives 4³ = 64 possible codons, so some degeneracy must exist.
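The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the function name min_codon_length and its parameters are made up for this example.

```python
def min_codon_length(n_meanings, n_bases=4):
    """Smallest codon length k such that n_bases**k >= n_meanings."""
    k = 1
    while n_bases ** k < n_meanings:
        k += 1
    return k

# 20 amino acids plus one stop signal, as in the text.
needed = 21
k = min_codon_length(needed)

# Codons left over once every meaning has one codon of its own;
# any positive surplus forces some meanings to share several codons.
surplus = 4 ** k - needed

print(k, 4 ** k, surplus)
```

With four bases, the loop rejects lengths one and two (4 and 16 codons) and stops at three, leaving 64 - 21 = 43 codons that must duplicate an existing meaning.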
Implications
These properties of the genetic code make it more fault-tolerant for point mutations. For example, in theory, fourfold degenerate codons can tolerate any point mutation at the third position, although codon usage bias restricts this in practice in many organisms; twofold degenerate codons can absorb one of the three possible point mutations at the third position as a silent mutation rather than a missense or nonsense mutation. Since transition mutations (purine to purine or pyrimidine to pyrimidine mutations) are more likely than transversion (purine to pyrimidine or vice versa) mutations, the equivalence of purines or that of pyrimidines at twofold degenerate sites adds a further fault-tolerance.
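As a rough illustration of the fault tolerance described above, the following Python sketch classifies every third-position point mutation of a codon as silent, missense, or nonsense, and as a transition or transversion. It assumes the standard genetic code written in RNA letters; the names CODON_TABLE, is_transition, and third_position_effects are chosen for this example only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}

def is_transition(a, b):
    """True if a -> b stays within purines or within pyrimidines (a != b)."""
    return (a in PURINES) == (b in PURINES)

def third_position_effects(codon):
    """Map each third-position mutant of `codon` to (effect, mutation kind)."""
    original = CODON_TABLE[codon]
    results = {}
    for base in BASES:
        if base == codon[2]:
            continue  # not a mutation
        mutant = codon[:2] + base
        encoded = CODON_TABLE[mutant]
        if encoded == original:
            effect = "silent"
        elif encoded == "*":
            effect = "nonsense"
        else:
            effect = "missense"
        kind = "transition" if is_transition(codon[2], base) else "transversion"
        results[mutant] = (effect, kind)
    return results

for codon in ("GGU", "GAA", "UAU"):
    print(codon, third_position_effects(codon))
```

For the fourfold degenerate glycine codon GGU all three mutants are silent. For the twofold degenerate codons GAA (glutamic acid) and UAU (tyrosine), only the transition is silent, while the two transversions are missense or nonsense.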
A practical consequence of redundancy is that some errors in a coding sequence cause only a silent mutation, or an error that would not affect the protein because hydrophilicity or hydrophobicity is maintained by equivalent substitution of amino acids. For example, a codon of NUN (where N = any nucleotide) tends to code for hydrophobic amino acids, NCN yields amino acid residues that are small in size and moderate in hydropathy, and NAN encodes average-size hydrophilic residues. These tendencies may result from the shared ancestry of the aminoacyl tRNA synthetases related to these codons.
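The grouping by middle nucleotide can be made concrete with a short Python sketch that lists which amino acids each NXN codon family encodes. It assumes the standard genetic code; CODON_TABLE and by_second are names introduced here for illustration, and the code makes no claim about hydropathy values, which would need a separate scale.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# For each middle base, collect the amino acids (one-letter symbols)
# encoded by any codon with that base in the second position.
by_second = {
    base: sorted({aa for codon, aa in CODON_TABLE.items()
                  if codon[1] == base and aa != "*"})
    for base in BASES
}

for base, amino_acids in by_second.items():
    print(f"N{base}N:", " ".join(amino_acids))
```

The NUN family comes out as F, I, L, M, V (phenylalanine, isoleucine, leucine, methionine, valine) and the NCN family as A, P, S, T (alanine, proline, serine, threonine), consistent with the tendencies described above.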
These variable codes for amino acids are allowed because of modified bases in the first base of the anticodon of the tRNA; the base pair formed is called a wobble base pair. Wobble pairing involves the modified base inosine as well as the non-Watson-Crick U-G base pair.
Terminology
A position of a codon is said to be an n-fold degenerate site if only n of the four possible nucleotides (A, C, G, U) at this position specify the same amino acid. A nucleotide substitution at a fourfold degenerate site is always a synonymous nucleotide substitution, whereas at a twofold degenerate site, substitutions that change a purine to a pyrimidine, or vice versa, are non-synonymous transversion substitutions.
A position of a codon is said to be a non-degenerate site if any mutation at this position results in amino acid substitution. There is only one threefold degenerate site, where three of the four possible nucleotides specify the same amino acid and the fourth specifies a different one. This is the third position of an isoleucine codon: AUU, AUC, and AUA all encode isoleucine, but AUG encodes methionine. In computation, this position is often treated as a twofold degenerate site.
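The definition of an n-fold degenerate site translates directly into code. The Python sketch below assumes the standard genetic code and counts the current nucleotide among the n, so a non-degenerate site scores 1. The names CODON_TABLE and site_degeneracy are hypothetical, chosen for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def site_degeneracy(codon, pos):
    """Number of nucleotides at `pos` (0, 1, or 2) giving the same amino acid.

    The codon's own nucleotide is counted, so the result ranges from
    1 (non-degenerate) to 4 (fourfold degenerate).
    """
    original = CODON_TABLE[codon]
    return sum(
        CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == original
        for base in BASES
    )

# Collect every threefold degenerate site among the sense codons.
threefold = {
    (codon, pos)
    for codon, aa in CODON_TABLE.items() if aa != "*"
    for pos in range(3)
    if site_degeneracy(codon, pos) == 3
}

print(site_degeneracy("GGU", 2))  # fourfold: glycine, any third base
print(site_degeneracy("GAA", 2))  # twofold: GAA and GAG, glutamic acid
print(site_degeneracy("GAA", 1))  # non-degenerate
print(sorted(threefold))
```

Under these assumptions the only threefold degenerate sites found are the third positions of AUU, AUC, and AUA, matching the isoleucine case described above.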
There are three amino acids encoded by six different codons: serine, leucine, and arginine. Only two amino acids are specified by a single codon each. One of these is the amino acid methionine, specified by the codon AUG, which also specifies the start of translation; the other is tryptophan, specified by the codon UGG.
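These counts can be reproduced by tallying codons per amino acid. The sketch below assumes the standard genetic code and uses one-letter amino acid symbols; CODON_TABLE, counts, six_codons, and one_codon are names made up for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid, ignoring the three stop codons.
counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

six_codons = sorted(aa for aa, n in counts.items() if n == 6)
one_codon = sorted(aa for aa, n in counts.items() if n == 1)

print("six codons:", six_codons)  # L = leucine, R = arginine, S = serine
print("one codon:", one_codon)    # M = methionine, W = tryptophan
```

The tally covers 61 sense codons spread over 20 amino acids, so most amino acids necessarily have more than one codon.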
See also
* Neutral theory of molecular evolution