Unnatural Amino Acids

In biochemistry, non-coded or non-proteinogenic amino acids are distinct from the 22 proteinogenic amino acids (21 in eukaryotes, plus formylmethionine in eukaryotes with prokaryote-derived organelles such as mitochondria), which are naturally encoded in the genome of organisms for the assembly of proteins. However, over 140 non-proteinogenic amino acids occur naturally in proteins, and thousands more may occur in nature or be synthesized in the laboratory. Chemically synthesized amino acids can be called unnatural amino acids. Unnatural amino acids can be synthetically prepared from their native analogs via modifications such as amine alkylation, side chain substitution, structural bond extension, cyclization, and isosteric replacements within the amino acid backbone.

Many non-proteinogenic amino acids are important:
* as intermediates in biosynthesis,
* in the post-translational formation of proteins,
* in a physiological role (e.g. components of bacterial cell walls, neurotransmitters and toxins),
* as natural or man-made pharmacological compounds,
* as compounds present in meteorites or used in prebiotic experiments (such as the Miller–Urey experiment).


Definition by negation

Technically, any organic compound with an amine (–NH2) and a carboxylic acid (–COOH) functional group is an amino acid. The proteinogenic amino acids are a small subset of this group that possess a central carbon atom (α- or 2-) bearing an amino group, a carboxyl group, a side chain and an α-hydrogen in the L (levo) configuration. The exceptions are glycine, which is achiral, and proline, whose amine group is a secondary amine and which is consequently often referred to as an imino acid for traditional reasons, although it is not an imine.

The genetic code encodes 20 standard amino acids for incorporation into proteins during translation. However, there are two extra proteinogenic amino acids: selenocysteine and pyrrolysine. These non-standard amino acids do not have a dedicated codon, but are added in place of a stop codon when a specific sequence is present: the UGA codon and a SECIS element for selenocysteine, and the UAG codon and a PYLIS downstream sequence for pyrrolysine. All other amino acids are termed "non-proteinogenic".

Figures:
* Selenocysteine. This amino acid contains a selenol group on its β-carbon.
* Pyrrolysine. This amino acid is formed by joining a carboxylated pyrroline ring to the ε-amino group of lysine.

There are various groups of amino acids:
* 20 standard amino acids
* 22 proteinogenic amino acids
* over 80 amino acids created abiotically in high concentrations
* about 900 produced by natural pathways
* over 118 engineered amino acids that have been placed into protein

These groups overlap, but are not identical. All 22 proteinogenic amino acids are biosynthesised by organisms and some, but not all, of them are also abiotic (found in prebiotic experiments and meteorites). Some natural amino acids, such as norleucine, are misincorporated translationally into proteins due to infidelity of the protein-synthesis process. Many amino acids, such as ornithine, are metabolic intermediates that are produced biosynthetically but not incorporated translationally into proteins. Post-translational modification of amino acid residues in proteins leads to the formation of many proteinaceous, but non-proteinogenic, amino acids. Other amino acids are found solely in abiotic mixes (e.g. α-methylnorvaline). Over 30 unnatural amino acids have been inserted translationally into protein in engineered systems, yet are not biosynthetic.
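The stop-codon recoding described above can be sketched as a toy translation loop. This is a deliberately simplified model, not a description of the real machinery: the codon table is a tiny subset of the standard code, and the `secis` and `pylis` flags are hypothetical stand-ins for the presence of a SECIS element or a PYLIS downstream sequence in the message.

```python
# Toy model of stop-codon recoding. Only a handful of RNA codons are included;
# the flags `secis` and `pylis` are illustrative assumptions that stand in for
# the sequence context (SECIS element, PYLIS sequence) mentioned in the text.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "UGA": "Stop",
    "UAG": "Stop",
    "UAA": "Stop",
}


def translate(mrna, secis=False, pylis=False):
    """Translate an mRNA string codon by codon, recoding UGA/UAG when flagged."""
    peptide = []
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and secis:
            residue = "Sec"   # selenocysteine inserted instead of stopping
        elif codon == "UAG" and pylis:
            residue = "Pyl"   # pyrrolysine inserted instead of stopping
        else:
            residue = CODON_TABLE[codon]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide
```

Without either flag, `translate("AUGUGAUUUUAA")` stops at UGA; with `secis=True` the same message reads through UGA as selenocysteine and terminates at the later UAA codon.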


Nomenclature

The IUPAC numbering system differentiates the carbons in an organic molecule by sequentially assigning a number to each carbon, including those forming a carboxylic group. In addition, the carbons along the side chain of amino acids can be labelled with Greek letters. In this scheme the α-carbon is the central chiral carbon bearing a carboxyl group, a side chain and, in α-amino acids, an amino group; the carbon in carboxylic groups is not counted. (Consequently, the IUPAC names of many non-proteinogenic α-amino acids start with ''2-amino-'' and end in ''-ic acid''.)
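The relationship between the two labelling schemes can be sketched in a few lines, assuming the carboxyl carbon is numbered 1 so that the α-carbon is carbon 2. The function names are illustrative, and only the first six Greek letters are covered.

```python
# Greek-letter positions start at the carbon next to the carboxyl group,
# so each letter is offset by 2 from its IUPAC locant (carboxyl carbon = 1).
GREEK = ["α", "β", "γ", "δ", "ε", "ζ"]


def greek_to_locant(letter):
    """Return the IUPAC locant for a Greek-letter position (α -> 2)."""
    return GREEK.index(letter) + 2


def locant_to_greek(locant):
    """Return the Greek-letter position for an IUPAC locant (2 -> α)."""
    if not 2 <= locant < len(GREEK) + 2:
        raise ValueError("locant outside the range covered by this sketch")
    return GREEK[locant - 2]
```

For example, `greek_to_locant("β")` gives 3 and `greek_to_locant("δ")` gives 5, matching the alternative name 5-aminolevulinic acid for δ-aminolevulinic acid.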


Natural non-L-α-amino acids

Most natural amino acids are α-amino acids in the L configuration, but some exceptions exist.


Non-alpha

Some non-α-amino acids exist in organisms. In these structures, the amine group is displaced further from the carboxylic acid end of the amino acid molecule. Thus a β-amino acid has the amine group bonded to the second carbon away, and a γ-amino acid has it on the third. Examples include β-alanine, GABA, and δ-aminolevulinic acid.

Figures:
* β-Alanine: an amino acid produced by aspartate 1-decarboxylase and a precursor to coenzyme A and the peptides carnosine and anserine.
* γ-Aminobutyric acid (GABA): a neurotransmitter in animals.
* δ-Aminolevulinic acid: an intermediate in tetrapyrrole biosynthesis (haem, chlorophyll, cobalamin etc.).
* 4-Aminobenzoic acid (PABA): an intermediate in folate biosynthesis.

The reason why α-amino acids are used in proteins has been linked to their frequency in meteorites and prebiotic experiments. An initial speculation on the deleterious properties of β-amino acids in terms of secondary structure turned out to be incorrect.


D-amino acids

Some amino acids have the opposite absolute chirality; these compounds are not available from the normal ribosomal translation and transcription machinery. Most bacterial cell walls are formed by peptidoglycan, a polymer composed of amino sugars crosslinked with short oligopeptides bridged between each other. The oligopeptide is non-ribosomally synthesised and contains several peculiarities, including D-amino acids, generally D-alanine and D-glutamate. A further peculiarity is that the former is racemised by a PLP-binding enzyme (encoded by ''alr'' or the homologue ''dadX''), whereas the latter is racemised by a cofactor-independent enzyme (''murI''). Some variants exist: in ''Thermotoga'' spp. D-lysine is present, and in certain vancomycin-resistant bacteria D-serine is present (''vanT'' gene).


Without a hydrogen on the α-carbon

All proteinogenic amino acids have at least one hydrogen on the α-carbon. Glycine has two hydrogens, and all others have one hydrogen and one side chain. Replacement of the remaining hydrogen with a larger substituent, such as a methyl group, distorts the protein backbone. In some fungi α-aminoisobutyric acid is produced as a precursor to peptides, some of which exhibit antibiotic properties. This compound is similar to alanine, but possesses an additional methyl group on the α-carbon instead of a hydrogen. It is therefore achiral. Another compound similar to alanine without an α-hydrogen is dehydroalanine, which possesses a methylene side chain. It is one of several naturally occurring dehydroamino acids.

Figures: alanine; aminoisobutyric acid; dehydroalanine.


Twin amino acid stereocentres

A subset of L-α-amino acids are ambiguous as to which of two ends is the α-carbon. In proteins a cysteine residue can form a disulfide bond with another cysteine residue, thus crosslinking the protein. Two crosslinked cysteines form a cystine molecule. Cysteine and methionine are generally produced by direct sulfurylation, but in some species they can be produced by transsulfuration, where the activated homoserine or serine is fused to a cysteine or homocysteine, forming cystathionine. A similar compound is lanthionine, which can be seen as two alanine molecules joined via a thioether bond and is found in various organisms. Similarly, djenkolic acid, a plant toxin from jengkol beans, is composed of two cysteines connected by a methylene group. Diaminopimelic acid is used both as a bridge in peptidoglycan and as a precursor to lysine (via its decarboxylation).

Figures: cystine; cystathionine; lanthionine; djenkolic acid; diaminopimelic acid.


Prebiotic amino acids and alternative biochemistries

In meteorites and in prebiotic experiments (e.g. the Miller–Urey experiment), many more amino acids than the twenty standard amino acids are found, several of them at higher concentrations than the standard ones. It has been conjectured that if amino acid-based life were to arise in parallel elsewhere in the universe, no more than 75% of the amino acids would be in common. The most notable anomaly is the lack of aminobutyric acid.


Straight side chain

The genetic code has been described as a frozen accident, and the reason why there is only one standard amino acid with a straight chain (alanine) could simply be redundancy with valine, leucine and isoleucine. However, straight-chained amino acids are reported to form much more stable alpha helices.

Figures:
* glycine (hydrogen side-chain)
* alanine (methyl side-chain)
* homoalanine, or α-aminobutyric acid (ethyl side-chain)
* norvaline (''n''-propyl side-chain)
* norleucine (''n''-butyl side-chain)
* homonorleucine (''n''-pentyl side-chain)
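As a rough illustration of the straight-chain series from glycine to homonorleucine, the sketch below derives a molecular formula from the number of side-chain carbons. It assumes the neutral form H2N-CH(R)-COOH, where R is hydrogen or an unbranched alkyl group; the table and function name are illustrative only.

```python
# Straight side-chain series, keyed by the number of carbons in the side chain.
SERIES = {
    0: "glycine",
    1: "alanine",
    2: "homoalanine",
    3: "norvaline",
    4: "norleucine",
    5: "homonorleucine",
}


def molecular_formula(side_chain_carbons):
    """Formula of H2N-CH(R)-COOH with R = H or an unbranched alkyl of n carbons."""
    n = side_chain_carbons
    if n < 0:
        raise ValueError("side-chain carbon count must be non-negative")
    carbons = n + 2        # alpha carbon + carboxyl carbon + side chain
    hydrogens = 2 * n + 5  # NH2 (2) + alpha-H (1) + COOH (1) + R (2n + 1)
    return f"C{carbons}H{hydrogens}NO2"


for n, name in SERIES.items():
    print(f"{name}: {molecular_formula(n)}")
```

Each step along the series adds one CH2 unit, which is why the formulas differ by one carbon and two hydrogens.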


Chalcogen

Serine, homoserine, ''O''-methylhomoserine and ''O''-ethylhomoserine possess hydroxymethyl, hydroxyethyl, methoxyethyl and ethoxyethyl side chains respectively, whereas cysteine, homocysteine, methionine and ethionine possess the sulfur equivalents. The selenium equivalents are selenocysteine, selenohomocysteine, selenomethionine and selenoethionine. Amino acids with the next chalcogen down are also found in nature: in the absence of sulfur, several species such as ''Aspergillus fumigatus'', ''Aspergillus terreus'' and ''Penicillium chrysogenum'' are able to produce tellurocysteine and telluromethionine and incorporate them into protein.


Expanded genetic code


Roles

In cells, especially autotrophs, several non-proteinogenic amino acids are found as metabolic intermediates. However, despite the catalytic flexibility of PLP-binding enzymes, many amino acids are synthesised as keto acids (such as 4-methyl-2-oxopentanoate to leucine) and aminated in the last step, thus keeping the number of non-proteinogenic amino acid intermediates fairly low. Ornithine and citrulline occur in the urea cycle, part of amino acid catabolism (see below). In addition to primary metabolism, several non-proteinogenic amino acids are precursors or final products in secondary metabolism, where they are used to make small compounds or non-ribosomal peptides (such as some toxins).


Post-translationally incorporated into protein

Despite not being encoded by the genetic code as proteinogenic amino acids, some non-standard amino acids are nevertheless found in proteins. These are formed by post-translational modification of the side chains of standard amino acids present in the target protein. These modifications are often essential for the function or regulation of a protein; for example, in γ-carboxyglutamate the carboxylation of glutamate allows for better binding of calcium cations, and in hydroxyproline the hydroxylation of proline is critical for maintaining connective tissues. Another example is the formation of hypusine in the translation initiation factor EIF5A, through modification of a lysine residue. Such modifications can also determine the localization of the protein; for example, the addition of long hydrophobic groups can cause a protein to bind to a phospholipid membrane.

Figures:
* Carboxyglutamic acid. Whereas glutamic acid possesses one γ-carboxyl group, carboxyglutamic acid possesses two.
* Hydroxyproline. This imino acid differs from proline by a hydroxyl group on carbon 4.
* Hypusine. This amino acid is obtained by adding a 4-aminobutyl moiety (obtained from spermidine) to the ε-amino group of a lysine.
* Pyroglutamic acid.

There is some preliminary evidence that aminomalonic acid may be present, possibly by misincorporation, in protein.


Toxic analogues

Several non-proteinogenic amino acids, such as thialysine, are toxic due to their ability to mimic certain properties of proteinogenic amino acids. Some non-proteinogenic amino acids are neurotoxic because they mimic amino acids used as neurotransmitters (that is, not for protein biosynthesis), including quisqualic acid, canavanine and azetidine-2-carboxylic acid. Cephalosporin C has an α-aminoadipic acid (homoglutamate) backbone that is amidated with a cephalosporin moiety. Penicillamine is a therapeutic amino acid whose mode of action is unknown.

Figures: thialysine; quisqualic acid; canavanine; azetidine-2-carboxylic acid; cephalosporin C; penicillamine.

Naturally occurring cyanotoxins can also include non-proteinogenic amino acids. Microcystin and nodularin, for example, are both derived from ADDA, a β-amino acid.


Not amino acids

Taurine is an amino sulfonic acid and not an amino carboxylic acid. However, it is occasionally considered as such, because the amounts required to suppress the auxotrophy in certain organisms (such as cats) are closer to those of "essential amino acids" (amino acid auxotrophy) than of vitamins (cofactor auxotrophy). The osmolytes sarcosine and glycine betaine are derived from amino acids, but have a secondary and a quaternary amine respectively.

