The 5′ untranslated region (also known as 5′ UTR, leader sequence, transcript leader, or leader RNA) is the region of a messenger RNA (mRNA) that is directly upstream from the initiation codon. This region is important for the regulation of translation of a transcript, by mechanisms that differ among viruses, prokaryotes and eukaryotes. Although it is called untranslated, the 5′ UTR or a portion of it is sometimes translated into a protein product, which can then regulate the translation of the main coding sequence of the mRNA. In many organisms, however, the 5′ UTR is completely untranslated and instead forms a complex secondary structure that regulates translation. The 5′ UTR has been found to interact with proteins relating to metabolism. In addition, this region has been implicated in transcription regulation, as with the sex-lethal gene in ''Drosophila''. Regulatory elements within 5′ UTRs have also been linked to mRNA export.


General structure


Length

The 5′ UTR begins at the transcription start site and ends one nucleotide (nt) before the initiation sequence (usually AUG) of the coding region. In prokaryotes, the 5′ UTR tends to be 3–10 nucleotides long, while in eukaryotes it tends to be anywhere from 100 to several thousand nucleotides long. For example, the ''ste11'' transcript in ''Schizosaccharomyces pombe'' has a 2273-nucleotide 5′ UTR, while the ''lac'' operon in ''Escherichia coli'' has only seven nucleotides in its 5′ UTR. The difference in size is likely due to the complexity of the eukaryotic regulation that the 5′ UTR carries, as well as the larger pre-initiation complex that must form to begin translation. The 5′ UTR can also be completely missing, as in leaderless mRNAs. Such mRNAs occur naturally in all three domains of life, and ribosomes of all three domains accept and translate them. Humans have many pressure-related genes under a 2–3 nucleotide leader. Mammals also have other types of ultra-short leaders, such as the TISU sequence.
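
The definition above (the 5′ UTR runs from the transcription start site to one nucleotide before the start codon) can be illustrated with a minimal Python sketch. The function names, the 0-based `cds_start` convention, and the toy sequences are assumptions made for illustration only; they are not real transcripts or part of any library.

```python
def five_prime_utr(transcript: str, cds_start: int) -> str:
    """Return the 5' UTR of a transcript.

    `transcript` is the mRNA read 5'->3' from the transcription start site;
    `cds_start` is the 0-based index of the first base of the annotated
    start codon (an assumed convention, not a standard API).
    """
    if not 0 <= cds_start <= len(transcript) - 3:
        raise ValueError("cds_start must point at a complete codon inside the transcript")
    # Everything before the start codon is the 5' UTR.
    return transcript[:cds_start]


def is_leaderless(transcript: str, cds_start: int) -> bool:
    """A leaderless mRNA has no 5' UTR: the start codon sits at the 5' end."""
    return len(five_prime_utr(transcript, cds_start)) == 0


# Toy sequences (hypothetical, for illustration only).
toy = "GGAGACC" + "AUGGCU"               # 7-nt leader followed by a start codon
print(len(five_prime_utr(toy, 7)))        # 7
print(is_leaderless("AUGGCU", 0))         # True
```

The annotated start position is passed in explicitly, because the first AUG in a transcript is not necessarily the start codon of the main coding sequence.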


Elements

The elements of eukaryotic and prokaryotic 5′ UTRs differ greatly. The prokaryotic 5′ UTR contains a ribosome binding site (RBS), also known as the Shine–Dalgarno sequence (AGGAGGU), which is usually 3–10 nucleotides upstream from the initiation codon. In contrast, the eukaryotic 5′ UTR contains the Kozak consensus sequence (ACCAUGG), which contains the initiation codon. The eukaryotic 5′ UTR also contains ''cis''-acting regulatory elements called upstream open reading frames (uORFs), upstream AUGs (uAUGs) and termination codons, which have a great impact on the regulation of translation (see below). Unlike in prokaryotes, 5′ UTRs in eukaryotes can harbor introns. In humans, ~35% of all genes harbor introns within the 5′ UTR.
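
As a rough illustration of the two motifs named above, the following Python sketch measures the spacing between a Shine–Dalgarno match and the start codon, and counts how many positions around a start codon agree with the Kozak consensus. The function names, the exact-match search, and the toy sequences are assumptions for illustration; real ribosome binding sites are often only partial matches to the consensus, so this is not a realistic site predictor.

```python
from typing import Optional

SD_CONSENSUS = "AGGAGGU"     # Shine-Dalgarno consensus as given in the text
KOZAK_CONSENSUS = "ACCAUGG"  # Kozak consensus as given in the text (contains AUG)


def shine_dalgarno_spacing(mrna: str, cds_start: int,
                           motif: str = SD_CONSENSUS) -> Optional[int]:
    """Number of nucleotides between the closest upstream exact motif match
    and the start codon, or None if the motif is absent from the 5' UTR."""
    upstream = mrna[:cds_start]
    pos = upstream.rfind(motif)          # closest match to the start codon
    if pos == -1:
        return None
    return cds_start - (pos + len(motif))


def kozak_matches(mrna: str, cds_start: int,
                  consensus: str = KOZAK_CONSENSUS) -> Optional[int]:
    """Count positions agreeing with the consensus in the 7-nt window that
    spans 3 nt upstream of the AUG through 1 nt after it."""
    if cds_start < 3 or cds_start + 4 > len(mrna):
        return None                      # window does not fit in the sequence
    window = mrna[cds_start - 3:cds_start + 4]
    return sum(a == b for a, b in zip(window, consensus))


# Toy sequences (hypothetical, for illustration only).
print(shine_dalgarno_spacing("UUAGGAGGUAACCUAUGGCU", 14))  # 5
print(kozak_matches("GCCACCAUGGCG", 6))                    # 7
```

A spacing of 5 falls inside the 3–10 nucleotide range described above; a count of 7 means every position of the window matches the consensus.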


Secondary structure

As the 5′ UTR has high GC content, secondary structures often occur within it. Hairpin loops are one such secondary structure that can be located within the 5′ UTR. These secondary structures also impact the regulation of translation.
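
The link between GC content and hairpin formation can be sketched in a few lines of Python. This is a deliberately naive illustration under stated assumptions: it considers only Watson–Crick pairs (no G–U wobble), ignores thermodynamics entirely, and the function names, default stem and loop sizes, and the toy sequence are all hypothetical.

```python
from typing import Optional, Tuple

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick pairs only


def gc_content(rna: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not rna:
        return 0.0
    return sum(base in "GC" for base in rna) / len(rna)


def reverse_complement(rna: str) -> str:
    """Sequence that would base-pair with `rna`, read 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_hairpin(rna: str, stem: int = 4,
                 min_loop: int = 3) -> Optional[Tuple[int, int]]:
    """Return (i, j) for the first pair of segments rna[i:i+stem] and
    rna[j:j+stem] that are perfectly complementary and separated by at
    least `min_loop` unpaired bases, or None if there is no such pair."""
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        partner = reverse_complement(rna[i:i + stem])
        j = rna.find(partner, i + stem + min_loop)
        if j != -1:
            return i, j
    return None


# Toy sequence (hypothetical): GGGC ... GCCC can pair around an AAAA loop.
toy = "GGGCAAAAGCCC"
print(round(gc_content(toy), 2))  # 0.67
print(find_hairpin(toy))          # (0, 8)
```

A dedicated RNA folding tool would be needed for any real prediction; the sketch only shows why G- and C-rich leaders readily contain self-complementary stretches.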


Role in translational regulation


Prokaryotes

In bacteria, the initiation of translation occurs when IF3, along with the 30S ribosomal subunit, binds to the Shine–Dalgarno (SD) sequence of the 5′ UTR. This then recruits many other proteins, as well as the 50S ribosomal subunit, which allows translation to begin. Each of these steps regulates the initiation of translation. Initiation in Archaea is less understood. SD sequences are much rarer, and the initiation factors have more in common with eukaryotic ones. There is no homolog of bacterial IF3. Some mRNAs are leaderless. In both domains, genes without Shine–Dalgarno sequences are also translated, in a manner that is less understood. A requirement seems to be a lack of secondary structure near the initiation codon.


Eukaryotes


Pre-initiation complex regulation

The regulation of translation in eukaryotes is more complex than in prokaryotes. Initially, the eIF4F complex is recruited to the 5′ cap, which in turn recruits the ribosomal complex to the 5′ UTR. Both eIF4E and eIF4G bind the 5′ UTR, which limits the rate at which translational initiation can occur. However, this is not the only regulatory step of translation that involves the 5′ UTR. RNA-binding proteins sometimes serve to prevent the pre-initiation complex from forming. An example is regulation of the ''msl-2'' gene. The protein SXL attaches to an intron segment located within the 5′ UTR segment of the primary transcript, which leads to the inclusion of the intron after processing. The retained sequence allows the recruitment of proteins that bind simultaneously to both the 5′ and 3′ UTR, preventing translation proteins from assembling. However, SXL can also repress translation of RNAs that do not contain a poly(A) tail or, more generally, a 3′ UTR.


Closed-loop regulation

Another important regulator of translation is the interaction between the 3′ UTR and the 5′ UTR. The closed-loop structure inhibits translation. This has been observed in ''Xenopus laevis'', in which eIF4E bound to the 5′ cap interacts with Maskin bound to CPEB on the 3′ UTR, creating translationally inactive transcripts. This translational inhibition is lifted once CPEB is phosphorylated, which displaces the Maskin binding site and allows polymerization of the poly(A) tail, which in turn can recruit the translational machinery by means of PABP. This mechanism, however, has come under considerable scrutiny.


Ferritin regulation

Iron levels in cells are maintained by translational regulation of many proteins involved in iron storage and metabolism. The 5′ UTR can form a hairpin loop secondary structure (known as the iron response element or IRE) that is recognized by iron-regulatory proteins (IRP1 and IRP2). When iron levels are low, the ORF of the target mRNA is blocked by steric hindrance from the binding of IRP1 and IRP2 to the IRE. When iron levels are high, the two iron-regulatory proteins do not bind as strongly, which allows expression of proteins that have a role in controlling iron concentration. This function gained some interest after it was revealed that the translation of amyloid precursor protein may be disrupted by a single-nucleotide polymorphism in the IRE found in the 5′ UTR of its mRNA, leading to a spontaneous increase in the risk of Alzheimer's disease.


uORFs and reinitiation

Another form of translational regulation in eukaryotes comes from unique elements in the 5′ UTR called upstream open reading frames (uORFs). These elements are fairly common, occurring in 35–49% of all human genes. A uORF is a coding sequence located in the 5′ UTR, upstream of the main coding sequence's initiation site. Each uORF contains its own initiation codon, known as an upstream AUG (uAUG). This codon can be found by scanning ribosomes and then translated to create a product, which can regulate the translation of the main protein-coding sequence or of other uORFs that may exist on the same transcript. The translation of the protein within the main ORF after a uORF sequence has been translated is known as reinitiation. The process of reinitiation is known to reduce the translation of the ORF protein. The degree of control depends on the distance between the uORF and the first codon in the main ORF. Reinitiation has been found to increase with a longer distance between the uAUG and the start codon of the main ORF, which indicates that the ribosome needs to reacquire translation factors before it can carry out translation of the main protein. For example, ''ATF4'' regulation is performed by two uORFs further upstream, named uORF1 and uORF2, which contain three amino acids and fifty-nine amino acids, respectively. The location of uORF2 overlaps with the ''ATF4'' ORF. Under normal conditions, uORF1 is translated, and translation of uORF2 then occurs only after the eIF2 ternary complex (eIF2-TC) has been reacquired. Translation of uORF2 requires that the ribosomes pass by the ''ATF4'' ORF, whose start codon is located within uORF2. This leads to its repression. However, under stress conditions, the 40S ribosomal subunit bypasses uORF2 because of a decrease in the concentration of eIF2-TC, which means the ribosome does not reacquire a ternary complex in time to translate uORF2. Instead, ''ATF4'' is translated.
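
The structural description above (a uAUG in the 5′ UTR, followed in frame by a stop codon that may fall before or after the main start codon) can be sketched as a simple scan. This is a minimal Python illustration under stated assumptions: the function name, the dictionary fields, and the toy transcripts are hypothetical, only AUG is treated as a uORF start, and nothing about translation efficiency or reinitiation is modeled.

```python
from typing import Dict, List

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, cds_start: int) -> List[Dict[str, object]]:
    """List uORFs whose uAUG lies entirely within the 5' UTR.

    `cds_start` is the 0-based index of the main ORF's start codon
    (an assumed convention). Each uORF is reported with its start index,
    the index just past its stop codon, its number of sense codons, and
    whether it extends past the main start codon (overlapping the main ORF,
    as described for uORF2 of ATF4).
    """
    uorfs = []
    for start in range(max(cds_start - 2, 0)):   # full AUG must fit in the UTR
        if mrna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon, in frame with the uAUG, until a stop codon.
        for pos in range(start + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                uorfs.append({
                    "start": start,
                    "end": pos + 3,
                    "codons": (pos - start) // 3,
                    "overlaps_main_orf": pos + 3 > cds_start,
                })
                break
    return uorfs


# Toy transcripts (hypothetical, for illustration only).
contained = "GGAUGGCUUAACCACC" + "AUGGCGUGA"   # uORF ends inside the 5' UTR
overlapping = "CAUGGA" + "AUGCCUGAGCUAA"       # uORF runs past the main AUG
print(find_uorfs(contained, 16))
print(find_uorfs(overlapping, 6))
```

In the first toy transcript the uORF terminates within the leader, leaving room for reinitiation downstream; in the second it terminates after the main start codon, the arrangement the text describes for uORF2 and the ''ATF4'' ORF.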


Other mechanisms

In addition to reinitiation, uORFs contribute to translation initiation based on:
* The nucleotides of a uORF may code for a codon that leads to a highly structured mRNA, causing the ribosome to stall.
* cis- and trans-regulation of translation of the main protein-coding sequence.
* Interactions with IRES sites.


Internal ribosome entry sites and viruses

Viral (as well as some eukaryotic) 5′ UTRs contain internal ribosome entry sites (IRES), which provide a cap-independent method of translational activation. Instead of building up a complex at the 5′ cap, the IRES allows for direct binding of the ribosomal complexes to the transcript to begin translation. The IRES enables the viral transcript to be translated more efficiently because no pre-initiation complex is needed, allowing the virus to replicate quickly.


Role in transcriptional regulation


''msl-2'' transcript

Transcription of ''msl-2'' is regulated by multiple binding sites for fly ''Sxl'' in the 5′ UTR. In particular, these poly-uracil sites are located close to a small intron that is spliced in males but kept in females through splicing inhibition. This splicing inhibition is maintained by ''Sxl''. When present, ''Sxl'' represses the translation of ''msl-2'' by increasing translation from a start codon located in a uORF in the 5′ UTR (see above for more information on uORFs). Also, ''Sxl'' outcompetes TIA-1 for a poly(U) region and prevents recruitment of snRNP (a step in alternative splicing) to the 5′ splice site.


See also

* Three prime untranslated region
* uORF
* Iron-responsive element-binding protein
* Iron response element
* Trans-splicing
* UTRdb

