The 5′ untranslated region (also known as 5′ UTR, leader sequence, transcript leader, or leader RNA) is the region of a messenger RNA (mRNA) that is directly upstream from the initiation codon. This region is important for the regulation of translation of a transcript by differing mechanisms in viruses, prokaryotes and eukaryotes. While called untranslated, the 5′ UTR or a portion of it is sometimes translated into a protein product. This product can then regulate the translation of the main coding sequence of the mRNA. In many organisms, however, the 5′ UTR is completely untranslated, instead forming a complex secondary structure to regulate translation. The 5′ UTR has been found to interact with proteins relating to metabolism. In addition, this region has been implicated in transcription regulation, as with the ''sex-lethal'' gene in ''Drosophila''. Regulatory elements within 5′ UTRs have also been linked to mRNA export.


General structure


Length

The 5′ UTR begins at the transcription start site and ends one nucleotide (nt) before the initiation sequence (usually AUG) of the coding region. In prokaryotes, the 5′ UTR tends to be 3–10 nucleotides long, while in eukaryotes it tends to be anywhere from 100 to several thousand nucleotides long. For example, the ''ste11'' transcript in ''Schizosaccharomyces pombe'' has a 2273-nucleotide 5′ UTR, while the ''lac'' operon in ''Escherichia coli'' has only seven nucleotides in its 5′ UTR. The differing sizes are likely due to the complexity of the eukaryotic regulation that the 5′ UTR carries, as well as the larger pre-initiation complex that must form to begin translation. The 5′ UTR can also be completely missing, in the case of leaderless mRNAs. Ribosomes of all three domains of life accept and translate such mRNAs, and such sequences are naturally found in all three domains. Humans have many pressure-related genes under a 2–3 nucleotide leader. Mammals also have other types of ultra-short leaders, such as the TISU sequence.
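
The boundary rule described above (the 5′ UTR runs from the transcription start site to the nucleotide just before the initiation codon) can be illustrated with a minimal Python sketch. The function names, the 0-based `cds_start` index, and the toy sequence are assumptions made for illustration; they are not taken from any annotation tool or real gene.

```python
def five_prime_utr(transcript: str, cds_start: int) -> str:
    """Return the 5' UTR of an mRNA.

    Assumptions (hypothetical inputs): `transcript` is the mature mRNA read
    from the transcription start site, and `cds_start` is the 0-based index
    of the first nucleotide of the initiation codon.
    """
    if not 0 <= cds_start <= len(transcript) - 3:
        raise ValueError("cds_start must point at a full codon inside the transcript")
    # Everything before the first nucleotide of the start codon is the leader.
    return transcript[:cds_start]


def is_leaderless(transcript: str, cds_start: int) -> bool:
    """A leaderless mRNA has no nucleotides ahead of its start codon."""
    return len(five_prime_utr(transcript, cds_start)) == 0


# Toy sequence (not a real gene): a 7-nt leader followed by AUG.
toy = "GGAAACAAUGACCUAA"
utr = five_prime_utr(toy, toy.index("AUG"))
print(utr, len(utr))
```

In practice the position of the start codon comes from genome annotation rather than from searching for the first AUG, since upstream AUGs can precede the true initiation site.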


Elements

The elements of eukaryotic and prokaryotic 5′ UTRs differ greatly. The prokaryotic 5′ UTR contains a ribosome binding site (RBS), also known as the Shine–Dalgarno sequence (AGGAGGU), which is usually 3–10 nucleotides upstream from the initiation codon. In contrast, the eukaryotic 5′ UTR contains the Kozak consensus sequence (ACCAUGG), which contains the initiation codon. The eukaryotic 5′ UTR also contains ''cis''-acting regulatory elements called upstream open reading frames (uORFs), upstream AUGs (uAUGs) and termination codons, which have a great impact on the regulation of translation (see below). Unlike in prokaryotes, 5′ UTRs in eukaryotes can harbor introns. In humans, ~35% of all genes harbor introns within the 5′ UTR.
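
As a rough illustration of how the two consensus elements named above relate to the start codon, the following Python sketch measures the spacing of an exact Shine–Dalgarno match and scores the context of a start codon against the Kozak consensus. The function names and the toy sequence are hypothetical. Exact string matching is a simplifying assumption: real sites are degenerate, and dedicated tools typically use weight matrices or rRNA complementarity instead.

```python
SHINE_DALGARNO = "AGGAGGU"  # consensus quoted in the text
KOZAK = "ACCAUGG"           # consensus quoted in the text; AUG sits at offset 3


def sd_spacing(mrna: str, start: int):
    """Nucleotides between the end of the nearest exact Shine-Dalgarno match
    and the start codon at `start`, or None if there is no upstream match."""
    pos = mrna.rfind(SHINE_DALGARNO, 0, start)  # match must end before `start`
    if pos == -1:
        return None
    return start - (pos + len(SHINE_DALGARNO))


def kozak_matches(mrna: str, start: int) -> int:
    """Number of positions (0 to 7) at which the context of the start codon
    agrees with the Kozak consensus."""
    if start < 3 or start + 4 > len(mrna):
        return 0  # not enough flanking sequence to compare
    window = mrna[start - 3:start + 4]
    return sum(a == b for a, b in zip(window, KOZAK))


# Toy sequence (not a real gene) with a start codon at index 16.
toy = "GGAGGAGGUAACAGCUAUGGCU"
print(sd_spacing(toy, 16), kozak_matches(toy, 16))
```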


Secondary structure

As the 5′ UTR has high GC content, secondary structures often occur within it. Hairpin loops are one such secondary structure that can be located within the 5′ UTR. These secondary structures also impact the regulation of translation.
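
The link between GC content and hairpin formation can be sketched in a few lines of Python. This is a deliberately naive illustration under stated assumptions: it only looks for perfect Watson–Crick stems of a fixed length, ignores G–U wobble pairs and folding energies, and uses hypothetical function names and a toy sequence. Actual structure prediction relies on energy-minimization methods.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def gc_content(rna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    if not rna:
        return 0.0
    return sum(base in "GC" for base in rna) / len(rna)


def reverse_complement(rna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna: str, stem: int = 4, min_loop: int = 3, max_loop: int = 8):
    """Return (start_index, loop_length) for every perfect stem of length
    `stem` whose two arms are separated by min_loop..max_loop nucleotides."""
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        target = reverse_complement(rna[i:i + stem])  # what the 3' arm must be
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(rna):
                break
            if rna[j:j + stem] == target:
                hits.append((i, loop))
    return hits


# Toy sequence: stem GGGC / GCCC around a 4-nt loop.
toy = "AAGGGCUUCGGCCCAA"
print(gc_content(toy), find_hairpins(toy))
```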


Role in translational regulation


Prokaryotes

In bacteria, the initiation of translation occurs when IF3, along with the 30S ribosomal subunit, binds to the Shine–Dalgarno (SD) sequence of the 5′ UTR. This then recruits many other components, such as the 50S ribosomal subunit, which allows translation to begin. Each of these steps regulates the initiation of translation. Initiation in Archaea is less understood. SD sequences are much rarer, and the initiation factors have more in common with eukaryotic ones. There is no homolog of bacterial IF3. Some mRNAs are leaderless. In both domains, genes without Shine–Dalgarno sequences are also translated in a less understood manner. A requirement seems to be a lack of secondary structure near the initiation codon.


Eukaryotes


Pre-initiation complex regulation

The regulation of translation in eukaryotes is more complex than in prokaryotes. Initially, the eIF4F complex is recruited to the 5′ cap, which in turn recruits the ribosomal complex to the 5′ UTR. Both eIF4E and eIF4G bind the 5′ UTR, which limits the rate at which translational initiation can occur. However, this is not the only regulatory step of translation that involves the 5′ UTR. RNA-binding proteins sometimes serve to prevent the pre-initiation complex from forming. An example is regulation of the ''msl-2'' gene. The protein SXL attaches to an intron segment located within the 5′ UTR segment of the primary transcript, which leads to the inclusion of the intron after processing. This sequence allows the recruitment of proteins that bind simultaneously to both the 5′ and 3′ UTR, not allowing translation proteins to assemble. However, it has also been noted that SXL can repress translation of RNAs that do not contain a poly(A) tail or, more generally, a 3′ UTR.


Closed-loop regulation

Another important regulator of translation is the interaction between the 3′ UTR and the 5′ UTR. The closed-loop structure inhibits translation. This has been observed in ''Xenopus laevis'', in which eIF4E bound to the 5′ cap interacts with Maskin bound to CPEB on the 3′ UTR, creating translationally inactive transcripts. This translational inhibition is lifted once CPEB is phosphorylated, displacing the Maskin binding site and allowing for the polymerization of the poly(A) tail, which can recruit the translational machinery by means of PABP. However, this mechanism has been under great scrutiny.


Ferritin regulation

Iron levels in cells are maintained by translational regulation of many proteins involved in iron storage and metabolism. The 5′ UTR has the ability to form a hairpin loop secondary structure (known as the iron response element or IRE) that is recognized by iron-regulatory proteins (IRP1 and IRP2). When iron levels are low, the ORF of the target mRNA is blocked as a result of steric hindrance from the binding of IRP1 and IRP2 to the IRE. When iron levels are high, the two iron-regulatory proteins do not bind as strongly, allowing expression of proteins that have a role in iron concentration control. This function has gained some interest after it was revealed that the translation of amyloid precursor protein may be disrupted by a single-nucleotide polymorphism in the IRE found in the 5′ UTR of its mRNA, leading to a spontaneous increased risk of Alzheimer's disease.


uORFs and reinitiation

Another form of translational regulation in eukaryotes comes from elements in the 5′ UTR called upstream open reading frames (uORFs). These elements are fairly common, occurring in 35–49% of all human genes. A uORF is a coding sequence located in the 5′ UTR, upstream of the main coding sequence's initiation site. Each uORF contains its own initiation codon, known as an upstream AUG (uAUG). This codon can be scanned for by ribosomes and then translated to create a product, which can regulate the translation of the main protein coding sequence or of other uORFs that may exist on the same transcript. The translation of the protein within the main ORF after a uORF sequence has been translated is known as reinitiation. The process of reinitiation is known to reduce the translation of the ORF protein. The extent of this regulation depends on the distance between the uORF and the first codon in the main ORF. Reinitiation has been found to increase with greater distance between the uAUG and the start codon of the main ORF, which indicates that the ribosome needs to reacquire translation factors before it can carry out translation of the main protein. For example, ''ATF4'' regulation is performed by two uORFs further upstream, named uORF1 and uORF2, which contain three amino acids and fifty-nine amino acids, respectively. The location of uORF2 overlaps with the ''ATF4'' ORF. During normal conditions, uORF1 is translated, and then translation of uORF2 occurs only after eIF2-TC has been reacquired. Translation of uORF2 requires that the ribosomes pass by the ''ATF4'' ORF, whose start codon is located within uORF2. This leads to its repression. However, during stress conditions, the 40S ribosomal subunit will bypass uORF2 because of a decrease in the concentration of eIF2-TC, which means the ribosome does not acquire one in time to translate uORF2. Instead, ''ATF4'' is translated.
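
The definition of a uORF given above (an upstream AUG followed by an in-frame stop codon, possibly running past the main start codon) lends itself to a short Python sketch. The function name, the tuple layout, and the toy sequence are assumptions for illustration. The toy mimics the arrangement described for ''ATF4'', with a short uORF followed by a second uORF that overlaps the main ORF, but it is not the ''ATF4'' sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, main_start: int):
    """Return (uaug_index, stop_index, overlaps_main_orf) for every AUG that
    lies upstream of `main_start` and has an in-frame stop codon.

    `main_start` is the 0-based index of the main ORF's start codon
    (a hypothetical input that would normally come from annotation).
    """
    uorfs = []
    pos = mrna.find("AUG")
    while pos != -1 and pos < main_start:
        # Walk codon by codon in the reading frame set by this uAUG.
        for i in range(pos + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                # The uORF overlaps the main ORF if it ends past main_start.
                uorfs.append((pos, i, i + 3 > main_start))
                break
        pos = mrna.find("AUG", pos + 1)
    return uorfs


# Toy transcript: uORF1 at index 2, uORF2 at index 14, main AUG at index 18.
toy = "CCAUGGCUUAACCCAUGCAUGCCUGACUAA"
for uaug, stop, overlaps in find_uorfs(toy, 18):
    print(uaug, stop, overlaps)
```

The sketch only locates uORFs. Whether a given uORF is actually translated, and how strongly it affects reinitiation, depends on factors such as eIF2-TC availability that sequence scanning alone cannot capture.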


Other mechanisms

In addition to reinitiation, uORFs contribute to translation initiation based on:
* The nucleotides of a uORF may code for a codon that leads to a highly structured mRNA, causing the ribosome to stall.
* ''cis''- and ''trans''-regulation of translation of the main protein coding sequence.
* Interactions with IRES sites.


Internal ribosome entry sites and viruses

Viral (as well as some eukaryotic) 5′ UTRs contain internal ribosome entry sites (IRES), which provide a cap-independent method of translational activation. Instead of building up a complex at the 5′ cap, the IRES allows for direct binding of the ribosomal complexes to the transcript to begin translation. The IRES enables the viral transcript to be translated more efficiently because no pre-initiation complex is needed, allowing the virus to replicate quickly.


Role in transcriptional regulation


''msl-2'' transcript

Transcription of ''msl-2'' is regulated by multiple binding sites for fly ''Sxl'' at the 5′ UTR. In particular, these poly-uracil sites are located close to a small intron that is spliced in males, but kept in females through splicing inhibition. This splicing inhibition is maintained by ''Sxl''. When present, ''Sxl'' will repress the translation of ''msl-2'' by increasing translation of a start codon located in a uORF in the 5′ UTR (see above for more information on uORFs). Also, ''Sxl'' outcompetes TIA-1 for a poly(U) region and prevents snRNP recruitment (a step in alternative splicing) to the 5′ splice site.


See also

* Three prime untranslated region
* uORF
* Iron-responsive element-binding protein
* Iron response element
* Trans-splicing
* UTRdb

