The 5′ untranslated region (also known as 5′ UTR, leader sequence, transcript leader, or leader RNA) is the region of a messenger RNA (mRNA) that is directly upstream from the initiation codon. This region is important for the regulation of translation of a transcript by differing mechanisms in viruses, prokaryotes and eukaryotes. While called untranslated, the 5′ UTR or a portion of it is sometimes translated into a protein product. This product can then regulate the translation of the main coding sequence of the mRNA. In many organisms, however, the 5′ UTR is completely untranslated, instead forming a complex secondary structure to regulate translation. The 5′ UTR has been found to interact with proteins relating to metabolism. In addition, this region has been implicated in transcription regulation, as with the sex-lethal gene in ''Drosophila''. Regulatory elements within 5′ UTRs have also been linked to mRNA export.


General structure


Length

The 5′ UTR begins at the transcription start site and ends one nucleotide (nt) before the initiation sequence (usually AUG) of the coding region. In prokaryotes, the 5′ UTR tends to be 3–10 nucleotides long, while in eukaryotes it tends to be anywhere from 100 to several thousand nucleotides long. For example, the ''ste11'' transcript in ''Schizosaccharomyces pombe'' has a 2273-nucleotide 5′ UTR, while the ''lac'' operon in ''Escherichia coli'' has only seven nucleotides in its 5′ UTR. The differing sizes are likely due to the complexity of the eukaryotic regulation which the 5′ UTR holds, as well as the larger pre-initiation complex that must form to begin translation.

The 5′ UTR can also be completely missing, in the case of leaderless mRNAs. Such mRNAs occur naturally in all three domains of life, and ribosomes from all three domains accept and translate them. In humans, many pressure-related genes have a 2–3 nucleotide leader. Mammals also have other types of ultra-short leaders, such as the TISU sequence.
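The boundary definition above (the 5′ UTR runs from the transcription start site to one nucleotide before the initiation codon) can be illustrated with a minimal Python sketch. The function name `five_prime_utr` and the toy sequence are hypothetical, chosen only for illustration. The sketch assumes the transcript is given as an RNA string starting at the transcription start site and that the position of the main start codon is already known.

```python
def five_prime_utr(mrna: str, cds_start: int) -> str:
    """Return the 5' UTR of an mRNA.

    mrna      -- transcript as an RNA string, beginning at the transcription start site
    cds_start -- 0-based index of the first nucleotide of the main start codon
    """
    if not 0 <= cds_start <= len(mrna) - 3:
        raise ValueError("cds_start must leave room for a start codon")
    # Everything before the start codon is the leader; an empty string
    # corresponds to a leaderless mRNA.
    return mrna[:cds_start]


# Toy transcript (hypothetical sequence): a 7-nt leader followed by AUG.
toy_mrna = "GGAAUUCAUGGCUUAA"
leader = five_prime_utr(toy_mrna, toy_mrna.find("AUG"))
print(leader, len(leader))
```

Passing `cds_start=0` returns an empty leader, which matches the leaderless case described above.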


Elements

The elements of eukaryotic and prokaryotic 5′ UTRs differ greatly. The prokaryotic 5′ UTR contains a ribosome binding site (RBS), also known as the Shine–Dalgarno sequence (AGGAGGU), which is usually 3–10 base pairs upstream from the initiation codon. In contrast, the eukaryotic 5′ UTR contains the Kozak consensus sequence (ACCAUGG), which contains the initiation codon. The eukaryotic 5′ UTR also contains ''cis''-acting regulatory elements called upstream open reading frames (uORFs), along with upstream AUGs (uAUGs) and termination codons, which have a great impact on the regulation of translation (see below). Unlike in prokaryotes, 5′ UTRs in eukaryotes can harbor introns. In humans, ~35% of all genes harbor introns within the 5′ UTR.
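As a rough illustration of the two motifs named above, the following Python sketch looks for an exact Shine–Dalgarno consensus close to the start codon and checks whether a start codon sits in an exact Kozak context. The function names, the `max_gap` parameter, and the test sequences are hypothetical. Exact string matching is a simplifying assumption, since real transcripts often carry only partial matches to either consensus.

```python
SD_CONSENSUS = "AGGAGGU"     # Shine-Dalgarno consensus as given in the text
KOZAK_CONSENSUS = "ACCAUGG"  # Kozak consensus as given in the text


def find_sd(mrna: str, start_codon_index: int, max_gap: int = 10):
    """Locate the exact SD consensus nearest to the start codon.

    Returns (position, gap), where gap is the number of nucleotides between
    the end of the motif and the start codon, or None if no motif ends
    within max_gap nucleotides of the start codon.
    """
    upstream = mrna[:start_codon_index]
    pos = upstream.rfind(SD_CONSENSUS)  # rightmost match = closest to the AUG
    if pos == -1:
        return None
    gap = start_codon_index - (pos + len(SD_CONSENSUS))
    return (pos, gap) if gap <= max_gap else None


def has_kozak(mrna: str, start_codon_index: int) -> bool:
    """True if the start codon is embedded in the exact ACCAUGG context."""
    if start_codon_index < 3:
        return False  # not enough upstream sequence for the ACC prefix
    return mrna[start_codon_index - 3:start_codon_index + 4] == KOZAK_CONSENSUS


# Hypothetical toy sequences.
print(find_sd("UUAGGAGGUAACUUAUGAAA", 14))
print(has_kozak("GCCACCAUGGCG", 6))
```

A position weight matrix or a mismatch-tolerant search would be a more realistic choice for genome-scale work; the exact match here only keeps the example short.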


Secondary structure

As the 5′ UTR has high GC content, secondary structures often occur within it. Hairpin loops are one such secondary structure that can be located within the 5′ UTR. These secondary structures also affect the regulation of translation.
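The link between GC content and hairpin formation can be made concrete with a small Python sketch. It computes the GC fraction of a sequence and tests whether two segments could pair as the two arms of a perfect hairpin stem. The function names and sequences are hypothetical, and the pairing rule is an assumption that considers only Watson–Crick pairs (no G–U wobble, no mismatches, no energy model).

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in an RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(PAIR[base] for base in reversed(rna.upper()))


def forms_perfect_stem(arm5: str, arm3: str) -> bool:
    """True if the 3' arm is the exact reverse complement of the 5' arm."""
    return arm3.upper() == reverse_complement(arm5)


# Hypothetical hairpin: 5' arm GGCAC, loop UUCG, 3' arm GUGCC.
hairpin = "GGCAC" + "UUCG" + "GUGCC"
print(gc_content(hairpin))
print(forms_perfect_stem("GGCAC", "GUGCC"))
```

Predicting real secondary structure requires thermodynamic folding tools; this sketch only shows why GC-rich, self-complementary stretches are candidates for stems.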


Role in translational regulation


Prokaryotes

In bacteria, the initiation of translation occurs when IF3, along with the 30S ribosomal subunit, binds to the Shine–Dalgarno (SD) sequence of the 5′ UTR. This then recruits many other components, such as the 50S ribosomal subunit, which allows translation to begin. Each of these steps regulates the initiation of translation.

Initiation in Archaea is less well understood. SD sequences are much rarer, and the initiation factors have more in common with eukaryotic ones. There is no homolog of bacterial IF3. Some mRNAs are leaderless.

In both domains, genes without Shine–Dalgarno sequences are also translated, in a less well understood manner. A requirement seems to be a lack of secondary structure near the initiation codon.


Eukaryotes


Pre-initiation complex regulation

The regulation of translation in eukaryotes is more complex than in prokaryotes. Initially, the eIF4F complex is recruited to the 5′ cap, which in turn recruits the ribosomal complex to the 5′ UTR. Both eIF4E and eIF4G bind the 5′ UTR, which limits the rate at which translational initiation can occur. However, this is not the only regulatory step of translation that involves the 5′ UTR.

RNA-binding proteins sometimes serve to prevent the pre-initiation complex from forming. An example is regulation of the ''msl-2'' gene. The protein SXL attaches to an intron segment located within the 5′ UTR segment of the primary transcript, which leads to the inclusion of the intron after processing. This sequence allows the recruitment of proteins that bind simultaneously to both the 5′ and 3′ UTR, preventing translation proteins from assembling. However, SXL can also repress translation of RNAs that do not contain a poly(A) tail or, more generally, a 3′ UTR.


Closed-loop regulation

Another important regulator of translation is the interaction between the 3′ UTR and the 5′ UTR. The closed-loop structure inhibits translation. This has been observed in ''Xenopus laevis'', in which eIF4E bound to the 5′ cap interacts with Maskin bound to CPEB on the 3′ UTR, creating translationally inactive transcripts. This translational inhibition is lifted once CPEB is phosphorylated, displacing the Maskin binding site and allowing for the polymerization of the poly(A) tail, which can recruit the translational machinery by means of PABP. However, this mechanism has been under great scrutiny.


Ferritin regulation

Iron levels in cells are maintained by translational regulation of many proteins involved in iron storage and metabolism. The 5′ UTR has the ability to form a hairpin loop secondary structure (known as the iron response element or IRE) that is recognized by iron-regulatory proteins (IRP1 and IRP2). When iron levels are low, the ORF of the target mRNA is blocked as a result of steric hindrance from the binding of IRP1 and IRP2 to the IRE. When iron is high, the two iron-regulatory proteins do not bind as strongly, allowing expression of proteins that have a role in iron concentration control. This function has gained some interest after it was revealed that the translation of amyloid precursor protein may be disrupted due to a single-nucleotide polymorphism in the IRE found in the 5′ UTR of its mRNA, leading to a spontaneous increased risk of Alzheimer's disease.


uORFs and reinitiation

Another form of translational regulation in eukaryotes comes from unique elements on the 5′ UTR called upstream open reading frames (uORFs). These elements are fairly common, occurring in 35–49% of all human genes. A uORF is a coding sequence located in the 5′ UTR, upstream of the main coding sequence's initiation site. Each uORF contains its own initiation codon, known as an upstream AUG (uAUG). This codon can be scanned for by ribosomes and then translated to create a product, which can regulate the translation of the main protein coding sequence or other uORFs that may exist on the same transcript.

The translation of the protein within the main ORF after a uORF sequence has been translated is known as reinitiation. The process of reinitiation is known to reduce the translation of the ORF protein. Control of protein regulation is determined by the distance between the uORF and the first codon in the main ORF. Reinitiation has been found to increase with a longer distance between the uAUG and the start codon of the main ORF, which indicates that the ribosome needs to reacquire translation factors before it can carry out translation of the main protein.

For example, ''ATF4'' regulation is performed by two uORFs further upstream, named uORF1 and uORF2, which contain three amino acids and fifty-nine amino acids, respectively. The location of uORF2 overlaps with the ''ATF4'' ORF. During normal conditions, uORF1 is translated, and translation of uORF2 occurs only after eIF2-TC has been reacquired. Translation of uORF2 requires that the ribosomes pass by the ''ATF4'' ORF, whose start codon is located within uORF2. This leads to its repression. However, during stress conditions, the 40S ribosome will bypass uORF2 because of a decrease in the concentration of eIF2-TC, which means the ribosome does not acquire one in time to translate uORF2. Instead, ''ATF4'' is translated.
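The definition of a uORF given above (an upstream AUG in the 5′ UTR followed in frame by a stop codon) lends itself to a simple scan. The Python sketch below is a minimal illustration under stated assumptions: the function name `find_uorfs`, the dictionary keys, and the toy transcripts are hypothetical, only AUG is treated as a start codon, and the three standard stop codons (UAA, UAG, UGA) terminate a uORF. It also flags uORFs that extend past the main start codon, as uORF2 does in ''ATF4''.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, main_start: int) -> list:
    """List uORFs whose uAUG begins inside the 5' UTR.

    mrna       -- transcript as an RNA string
    main_start -- 0-based index of the main ORF's start codon

    Each uORF is reported as a dict with its start index, end index
    (exclusive, including the stop codon), number of sense codons, and
    whether it extends past the main start codon.
    """
    uorfs = []
    for start in range(main_start):  # candidate uAUGs upstream of the main AUG
        if mrna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon in the uAUG's reading frame until a stop codon.
        for pos in range(start + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                uorfs.append({
                    "start": start,
                    "end": pos + 3,
                    "codons": (pos - start) // 3,
                    "overlaps_main": pos + 3 > main_start,
                })
                break
    return uorfs


# Hypothetical transcript: a short uORF (AUG GCU UAA) ahead of the main AUG at index 15.
print(find_uorfs("GGAUGGCUUAACCGGAUGAAAUAG", 15))
# Hypothetical transcript: an out-of-frame uORF that runs across the main AUG at index 4.
print(find_uorfs("AUGCAUGAAUAG", 4))
```

The distance between a reported uORF's `end` and `main_start` corresponds to the spacing that, according to the text, influences how readily reinitiation occurs.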


Other mechanisms

In addition to reinitiation, uORFs contribute to translation initiation based on:
* The nucleotides of a uORF may code for a codon that leads to a highly structured mRNA, causing the ribosome to stall.
* ''cis''- and ''trans''-regulation of translation of the main protein coding sequence.
* Interactions with IRES sites.


Internal ribosome entry sites and viruses

Viral (as well as some eukaryotic) 5′ UTRs contain internal ribosome entry sites (IRESs), which provide a cap-independent method of translational activation. Instead of building up a complex at the 5′ cap, the IRES allows for direct binding of the ribosomal complexes to the transcript to begin translation. The IRES enables the viral transcript to be translated more efficiently because no pre-initiation complex is needed, allowing the virus to replicate quickly.


Role in transcriptional regulation


''msl-2'' transcript

Transcription of ''msl-2'' is regulated by multiple binding sites for fly ''Sxl'' at the 5′ UTR. In particular, these poly-uracil sites are located close to a small intron that is spliced in males, but kept in females through splicing inhibition. This splicing inhibition is maintained by ''Sxl''. When present, ''Sxl'' will repress the translation of ''msl-2'' by increasing translation of a start codon located in a uORF in the 5′ UTR (see above for more information on uORFs). Also, ''Sxl'' outcompetes TIA-1 for a poly(U) region and prevents recruitment of snRNP to the 5′ splice site (a step in alternative splicing).


See also

* Three prime untranslated region
* uORF
* Iron-responsive element-binding protein
* Iron response element
* Trans-splicing
* UTRdb

