The 5′ untranslated region (also known as 5′ UTR, leader sequence, transcript leader, or leader RNA) is the region of a messenger RNA (mRNA) that is directly upstream from the initiation codon. This region is important for the regulation of translation of a transcript by differing mechanisms in viruses, prokaryotes and eukaryotes. While called untranslated, the 5′ UTR or a portion of it is sometimes translated into a protein product. This product can then regulate the translation of the main coding sequence of the mRNA. In many organisms, however, the 5′ UTR is completely untranslated, instead forming a complex secondary structure to regulate translation.
The 5′ UTR has been found to interact with proteins relating to metabolism, and sequences within the 5′ UTR can themselves be translated. In addition, this region has been involved in transcription regulation, such as that of the sex-lethal gene in ''Drosophila''.
Regulatory elements within 5′ UTRs have also been linked to mRNA export.
General structure
Length
The 5′ UTR begins at the transcription start site and ends one nucleotide (nt) before the initiation sequence (usually AUG) of the coding region. In prokaryotes, the 5′ UTR tends to be 3–10 nucleotides long, while in eukaryotes it tends to be anywhere from 100 to several thousand nucleotides long. For example, the ''ste11'' transcript in ''Schizosaccharomyces pombe'' has a 2273-nucleotide 5′ UTR, while the ''lac'' operon in ''Escherichia coli'' has only seven nucleotides in its 5′ UTR.
The differing sizes are likely due to the complexity of the eukaryotic regulation carried by the 5′ UTR, as well as the larger pre-initiation complex that must form to begin translation.
The 5′ UTR can also be completely missing, in the case of leaderless mRNAs. Such mRNAs occur naturally in all three domains of life, and ribosomes of all three domains accept and translate them. Humans have many pressure-related genes under a 2–3 nucleotide leader. Mammals also have other types of ultra-short leaders, like the TISU sequence.
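The definition above (everything from the transcription start site up to one nucleotide before the initiation codon) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0-based cds_start convention, and the toy sequence are assumptions made for the example, not part of any standard library or real gene annotation.

```python
def five_prime_utr(mrna: str, cds_start: int) -> str:
    """Return the leader: every nucleotide before the initiation codon.

    `cds_start` is the 0-based index of the first nucleotide of the
    initiation codon (an assumed convention for this sketch).
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    if not 0 <= cds_start <= len(mrna) - 3:
        raise ValueError("cds_start must leave room for a full initiation codon")
    return mrna[:cds_start]


def is_leaderless(mrna: str, cds_start: int) -> bool:
    """A leaderless mRNA has no nucleotides upstream of its initiation codon."""
    return len(five_prime_utr(mrna, cds_start)) == 0


# Made-up transcript with a seven-nucleotide leader (not a real gene sequence).
toy_mrna = "GGAAUUC" + "AUGACCAUGAUU"
print(five_prime_utr(toy_mrna, 7))       # GGAAUUC
print(len(five_prime_utr(toy_mrna, 7)))  # 7
print(is_leaderless("AUGGCUUAA", 0))     # True
```

In practice the position of the initiation codon would come from an annotation rather than being supplied by hand.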
Elements
The elements of eukaryotic and prokaryotic 5′ UTRs differ greatly. The prokaryotic 5′ UTR contains a ribosome binding site (RBS), also known as the Shine–Dalgarno sequence (AGGAGGU), which is usually 3–10 base pairs upstream from the initiation codon.
In contrast, the eukaryotic 5′ UTR contains the Kozak consensus sequence (ACCAUGG), which contains the initiation codon.
The eukaryotic 5′ UTR also contains ''cis''-acting regulatory elements called upstream open reading frames (uORFs), upstream AUGs (uAUGs), and termination codons, which have a great impact on the regulation of translation (see below). Unlike those of prokaryotes, eukaryotic 5′ UTRs can harbor introns. In humans, ~35% of all genes harbor introns within the 5′ UTR.
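As a rough illustration of the two sequence elements described above, the following Python sketch looks for an exact Shine–Dalgarno motif 3–10 nucleotides upstream of the initiation codon and for an exact Kozak consensus around it. It is a simplification under stated assumptions: real sites are degenerate and are usually scored rather than matched exactly, the spacing is measured here from the 3′ end of the motif to the first nucleotide of the start codon, and all function names and test sequences are hypothetical.

```python
SD_MOTIF = "AGGAGGU"  # Shine–Dalgarno sequence as given in the text
KOZAK = "ACCAUGG"     # Kozak consensus as given in the text (includes AUG)


def shine_dalgarno_gap(mrna, cds_start, motif=SD_MOTIF, min_gap=3, max_gap=10):
    """Return the spacer length between an exact motif match and the start
    codon, or None if no match lies within the allowed window."""
    leader = mrna[:cds_start]
    for gap in range(min_gap, max_gap + 1):
        end = len(leader) - gap        # index just past the candidate motif
        start = end - len(motif)
        if start >= 0 and leader[start:end] == motif:
            return gap
    return None


def has_kozak_context(mrna, cds_start, consensus=KOZAK):
    """True if the start codon sits in an exact copy of the consensus
    (three nucleotides upstream, AUG, one nucleotide downstream)."""
    return cds_start >= 3 and mrna[cds_start - 3:cds_start + 4] == consensus


# Made-up sequences for illustration only.
prok = "GG" + "AGGAGGU" + "AACAG" + "AUGGCU"   # start codon at index 14
print(shine_dalgarno_gap(prok, 14))            # 5
print(has_kozak_context("GCCACCAUGGCG", 6))    # True
print(has_kozak_context("GGGUUUAUGCCC", 6))    # False
```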
Secondary structure
As the 5′ UTR has high GC content, secondary structures often occur within it. Hairpin loops are one such secondary structure that can be located within the 5′ UTR. These secondary structures also impact the regulation of translation.
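The link between GC content and hairpins can be made concrete with a deliberately naive Python sketch. It computes the GC fraction of a leader and lists positions where a stem of perfect Watson–Crick pairs could close a short loop. This is an assumption-laden toy: it ignores G–U wobble pairs, mismatches, and thermodynamic stability, so it is not a substitute for a real RNA folding program, and the function names and the example sequence are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def gc_content(rna):
    """Fraction of G and C nucleotides in the sequence (0.0 for empty input)."""
    rna = rna.upper()
    return sum(base in "GC" for base in rna) / len(rna) if rna else 0.0


def reverse_complement(rna):
    """Sequence that would pair with `rna` in an antiparallel stem."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_hairpins(rna, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) for every perfect stem of `stem`
    base pairs enclosing a loop of min_loop..max_loop nucleotides."""
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = rna[j:j + stem]
            if len(right) == stem and right == reverse_complement(left):
                hits.append((i, loop))
    return hits


# Made-up GC-rich leader: a 5-bp stem (GGCGC / GCGCC) around a UUCG loop.
leader = "GGCGCUUCGGCGCC"
print(find_hairpins(leader, stem=5))  # [(0, 4)]
```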
Role in translational regulation
Prokaryotes
In bacteria, the initiation of translation occurs when IF-3, along with the 30S ribosomal subunit, binds to the Shine–Dalgarno (SD) sequence of the 5′ UTR.
This then recruits many other proteins, such as the 50S ribosomal subunit, which allows for translation to begin. Each of these steps regulates the initiation of translation.
Initiation in Archaea is less understood. SD sequences are much rarer, and the initiation factors have more in common with eukaryotic ones. There is no homolog of bacterial IF-3.
Some mRNAs are leaderless.
In both domains, genes without Shine–Dalgarno sequences are also translated in a less understood manner. A requirement seems to be a lack of secondary structure near the initiation codon.
Eukaryotes
Pre-initiation complex regulation
The regulation of translation in eukaryotes is more complex than in prokaryotes. Initially, the eIF4F complex is recruited to the 5′ cap, which in turn recruits the ribosomal complex to the 5′ UTR. Both eIF4E and eIF4G bind the 5′ UTR, which limits the rate at which translational initiation can occur. However, this is not the only regulatory step of translation that involves the 5′ UTR.
RNA-binding proteins sometimes serve to prevent the pre-initiation complex from forming. An example is regulation of the ''msl-2'' gene. The protein SXL attaches to an intron segment located within the 5′ UTR of the primary transcript, which leads to the inclusion of the intron after processing. This sequence allows the recruitment of proteins that bind simultaneously to both the 5′ and 3′ UTR, not allowing translation proteins to assemble. However, it has also been noted that SXL can repress translation of RNAs that do not contain a poly(A) tail, or more generally, a 3′ UTR.
Closed-loop regulation
Another important regulator of translation is the interaction between the 3′ UTR and the 5′ UTR.
The closed-loop structure inhibits translation. This has been observed in ''Xenopus laevis'', in which eIF4E bound to the 5′ cap interacts with Maskin bound to CPEB on the 3′ UTR, creating translationally inactive transcripts. This translational inhibition is lifted once CPEB is phosphorylated, displacing the Maskin binding site and allowing for the polymerization of the poly(A) tail, which can recruit the translational machinery by means of PABP. This mechanism, however, has been under great scrutiny.
Ferritin regulation
Iron levels in cells are maintained by translational regulation of many proteins involved in iron storage and metabolism. The 5′ UTR can form a hairpin loop secondary structure (known as the iron response element or IRE) that is recognized by iron-regulatory proteins (IRP1 and IRP2). When iron levels are low, the ORF of the target mRNA is blocked as a result of steric hindrance from the binding of IRP1 and IRP2 to the IRE. When iron levels are high, the two iron-regulatory proteins do not bind as strongly, allowing expression of proteins that have a role in iron concentration control. This function has gained some interest after it was revealed that the translation of amyloid precursor protein may be disrupted due to a single-nucleotide polymorphism in the IRE found in the 5′ UTR of its mRNA, leading to a spontaneous increased risk of Alzheimer's disease.
uORFs and reinitiation
Another form of translational regulation in eukaryotes comes from unique elements on the 5′ UTR called upstream open reading frames (uORFs). These elements are fairly common, occurring in 35–49% of all human genes. A uORF is a coding sequence located in the 5′ UTR, upstream of the main coding sequence's initiation site. These uORFs contain their own initiation codon, known as an upstream AUG (uAUG). This codon can be scanned for by ribosomes and then translated to create a product, which can regulate the translation of the main protein coding sequence or other uORFs that may exist on the same transcript.
The translation of the protein within the main ORF after a uORF sequence has been translated is known as reinitiation. The process of reinitiation is known to reduce the translation of the ORF protein. Control of protein regulation is determined by the distance between the uORF and the first codon in the main ORF. Reinitiation increases with longer distance between the uAUG of a uORF and the start codon of the main ORF, which indicates that the ribosome needs to reacquire translation factors before it can carry out translation of the main protein.
For example, ''ATF4'' regulation is performed by two uORFs further upstream, named uORF1 and uORF2, which encode three amino acids and fifty-nine amino acids, respectively. The location of uORF2 overlaps with the ''ATF4'' ORF. During normal conditions, uORF1 is translated, and then translation of uORF2 occurs only after eIF2-TC has been reacquired. Translation of uORF2 requires that the ribosomes pass by the ''ATF4'' ORF, whose start codon is located within uORF2. This leads to its repression. However, during stress conditions, the 40S ribosome will bypass uORF2 because of a decrease in concentration of eIF2-TC, which means the ribosome does not acquire one in time to translate uORF2. Instead, ''ATF4'' is translated.
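The notions of a uAUG, a uORF, and a uORF that overlaps the main ORF (as uORF2 does for ''ATF4'') can be illustrated with a small Python sketch. It scans the leader for AUG codons, reads each one in frame until the first stop codon, and reports whether the uORF ends before or after the main start codon. This is a sketch under simplifying assumptions: only AUG is treated as an initiation codon, the three standard stop codons are used, the function name is hypothetical, and the sequences are invented for illustration rather than taken from ''ATF4''.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna, cds_start):
    """Return (uaug_index, stop_end_index, overlaps_main_orf) for every AUG
    that starts inside the 5' UTR and reaches an in-frame stop codon.

    Indices are 0-based; `stop_end_index` is the index just past the stop.
    """
    uorfs = []
    for uaug in range(max(cds_start - 2, 0)):
        if mrna[uaug:uaug + 3] != "AUG":
            continue
        # Read codon by codon in the uAUG's frame until a stop is found.
        for pos in range(uaug + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                stop_end = pos + 3
                uorfs.append((uaug, stop_end, stop_end > cds_start))
                break
    return uorfs


# Made-up transcript: a short uORF that ends 5 nt before the main AUG (index 16).
separate = "GG" + "AUGGCUUAA" + "CCCCC" + "AUGAAAUAG"
print(find_uorfs(separate, 16))     # [(2, 11, False)]

# Made-up transcript: the uORF starts upstream of the main AUG (index 4),
# is in a different frame, and terminates inside the main ORF.
overlapping = "AUGCAUGGCUAAGCUUGA"
print(find_uorfs(overlapping, 4))   # [(0, 12, True)]
```

For a non-overlapping uORF, the distance available for reacquiring initiation factors is simply cds_start minus the stop end index (5 nucleotides in the first example).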
Other mechanisms
In addition to reinitiation, uORFs contribute to translation initiation based on:
* The nucleotides of a uORF may code for a codon that leads to a highly structured mRNA, causing the ribosome to stall.
* ''cis''- and ''trans''-regulation of translation of the main protein coding sequence.
* Interactions with IRES sites.
Internal ribosome entry sites and viruses
Viral (as well as some eukaryotic) 5′ UTRs contain internal ribosome entry sites (IRES), which provide a cap-independent method of translational activation. Instead of building up a complex at the 5′ cap, the IRES allows for direct binding of the ribosomal complexes to the transcript to begin translation.
The IRES enables the viral transcript to be translated more efficiently because no pre-initiation complex is needed, allowing the virus to replicate quickly.
Role in transcriptional regulation
''msl-2'' transcript
Transcription of the ''msl-2'' transcript is regulated by multiple binding sites for fly ''Sxl'' at the 5′ UTR.
In particular, these poly-uracil sites are located close to a small intron that is spliced in males, but kept in females through splicing inhibition. This splicing inhibition is maintained by ''Sxl''.
When present, ''Sxl'' will repress the translation of ''msl-2'' by increasing translation of a start codon located in a uORF in the 5′ UTR (see above for more information on uORFs). Also, ''Sxl'' outcompetes TIA-1 for a poly(U) region and prevents recruitment of snRNP (a step in alternative splicing) to the 5′ splice site.
See also
* Three prime untranslated region
* UORF
* Iron-responsive element-binding protein
* Iron response element
* Trans-splicing
* UTRdb