Post-transcriptional modification or co-transcriptional modification is a set of biological processes common to most eukaryotic cells by which an RNA primary transcript is chemically altered following transcription from a gene to produce a mature, functional RNA molecule that can then leave the nucleus and perform any of a variety of functions in the cell. There are many types of post-transcriptional modification, achieved through a diverse class of molecular mechanisms.
One example is the conversion of precursor messenger RNA transcripts into mature messenger RNA that is subsequently capable of being translated into protein. This process includes three major steps that significantly modify the chemical structure of the RNA molecule: the addition of a 5' cap, the addition of a 3' polyadenylated tail, and RNA splicing. Such processing is vital for the correct translation of eukaryotic genomes because the initial precursor mRNA produced by transcription often contains both exons (coding sequences) and introns (non-coding sequences); splicing removes the introns and links the exons directly, while the cap and tail facilitate the transport of the mRNA to a ribosome and protect it from molecular degradation.
Post-transcriptional modifications may also occur during the processing of other transcripts which ultimately become transfer RNA, ribosomal RNA, or any of the other types of RNA used by the cell.
mRNA processing
5' processing
Capping
Capping of the pre-mRNA involves the addition of 7-methylguanosine (m7G) to the 5' end. To achieve this, the terminal 5' phosphate is first removed with the aid of a phosphatase enzyme, which produces a diphosphate 5' end. The enzyme guanylyl transferase then catalyses the next reaction, in which the diphosphate 5' end attacks the alpha phosphorus atom of a GTP molecule in order to add the guanosine residue in a 5'-5' triphosphate link. The enzyme (guanine-N7-)-methyltransferase ("cap MTase") then transfers a methyl group from S-adenosyl methionine to the guanine ring. This type of cap, with just the m7G in position, is called a cap 0 structure. The ribose of the adjacent nucleotide may also be methylated to give a cap 1. Methylation of nucleotides further downstream in the RNA molecule produces cap 2, cap 3 structures and so on. In these cases the methyl groups are added to the 2' OH groups of the ribose sugar.
The cap protects the 5' end of the primary RNA transcript from attack by ribonucleases that have specificity for 3'-5' phosphodiester bonds.
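The cap naming scheme described above can be sketched as a small Python function. This is only an illustration: the function name `cap_type` and its inputs (a flag for N7 methylation of the guanine, and a list of 2'-O-methylation flags for the transcribed nucleotides, starting with the one adjacent to the cap) are hypothetical and not part of any standard library.

```python
def cap_type(n7_methylated, ribose_2o_methylated):
    """Name the cap structure from its methylation pattern.

    n7_methylated: True if the cap guanine carries the N7 methyl group (m7G).
    ribose_2o_methylated: booleans for the transcribed nucleotides, starting
        with the nucleotide adjacent to the cap; True means the 2' OH of
        that ribose is methylated.
    """
    if not n7_methylated:
        return "no m7G cap"
    # Count consecutive 2'-O-methylated nucleotides starting next to the cap.
    count = 0
    for methylated in ribose_2o_methylated:
        if not methylated:
            break
        count += 1
    return f"cap {count}"


print(cap_type(True, [False, False]))  # m7G only
print(cap_type(True, [True, False]))   # m7G plus the adjacent ribose
```

Counting only consecutive methylated riboses from the cap outward is an assumption of this sketch, chosen to match the cap 0, cap 1, cap 2 progression in the text.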
3' processing
Cleavage and polyadenylation
The pre-mRNA processing at the 3' end of the RNA molecule involves cleavage of its 3' end and then the addition of roughly 200 to 250 adenosine residues to form a poly(A) tail. The cleavage and adenylation reactions occur primarily if a polyadenylation signal sequence (5'-AAUAAA-3') is located near the 3' end of the pre-mRNA molecule, followed by another sequence, usually 5'-CA-3', which is the site of cleavage. A GU-rich sequence is also usually present further downstream on the pre-mRNA molecule. More recently, it has been demonstrated that alternative signal sequences such as UGUA upstream of the cleavage site can also direct cleavage and polyadenylation in the absence of the AAUAAA signal.
These two signals are not mutually independent and often coexist. After the sequence elements have been synthesized, several multi-subunit proteins are transferred to the RNA molecule. These sequence-specific binding proteins, cleavage and polyadenylation specificity factor (CPSF), Cleavage Factor I (CF I) and cleavage stimulation factor (CStF), are transferred from RNA Polymerase II. The three factors bind to the sequence elements. The AAUAAA signal is directly bound by CPSF. For UGUA-dependent processing sites, binding of the multi-protein complex is done by Cleavage Factor I (CF I). The resulting protein complex contains additional cleavage factors and the enzyme polyadenylate polymerase (PAP). This complex cleaves the RNA between the polyadenylation sequence and the GU-rich sequence, at the cleavage site marked by the 5'-CA-3' sequence. Poly(A) polymerase then adds roughly 200 to 250 adenosine units to the new 3' end of the RNA molecule using ATP as a precursor. As the poly(A) tail is synthesized, it binds multiple copies of poly(A)-binding protein, which protects the 3' end from ribonuclease digestion by enzymes including the CCR4-Not complex.
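The sequence logic of cleavage and polyadenylation (an AAUAAA signal, a downstream CA cleavage site, then tail addition) can be illustrated with a short Python sketch. It treats the RNA as a plain string and ignores the protein factors entirely. The function name, the search window of 10 to 30 nucleotides downstream of the signal, the choice to cut immediately after the CA dinucleotide, and the default tail length are all assumptions made for illustration.

```python
def cleave_and_polyadenylate(pre_mrna, signal="AAUAAA",
                             min_gap=10, max_gap=30, tail_length=200):
    """Return the cleaved, polyadenylated RNA, or None if no site is found.

    pre_mrna: RNA sequence as a string over A, C, G, U (5' to 3').
    min_gap, max_gap: assumed window, in nucleotides downstream of the
        signal, in which a CA cleavage site is searched for.
    tail_length: number of A residues appended (assumed value).
    """
    signal_start = pre_mrna.find(signal)
    if signal_start == -1:
        return None  # no polyadenylation signal present
    signal_end = signal_start + len(signal)

    # Look for the first CA dinucleotide that starts inside the window.
    window_start = signal_end + min_gap
    window_end = signal_end + max_gap
    ca_start = pre_mrna.find("CA", window_start, window_end + 2)
    if ca_start == -1:
        return None  # no cleavage site in the window

    # Simplification: cut directly after the CA, discarding the downstream
    # fragment (which would contain the GU-rich element).
    cleaved = pre_mrna[:ca_start + 2]
    return cleaved + "A" * tail_length


example = "GGGC" + "AAUAAA" + "U" * 12 + "CA" + "GUGUGUGU"
print(cleave_and_polyadenylate(example, tail_length=5))
```

The sketch handles only the AAUAAA-dependent case; UGUA-dependent sites would need a separate search rule.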
Intron splicing
RNA splicing is the process by which introns, regions of RNA that do not code for proteins, are removed from the pre-mRNA and the remaining exons are connected to re-form a single continuous molecule. Exons are sections of mRNA which become "expressed" or translated into a protein. They are the coding portions of an mRNA molecule.
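As a rough illustration of intron removal, the following Python sketch treats a pre-mRNA as a string and removes introns whose positions are already known. The function name `splice` and the use of half-open (start, end) index pairs are assumptions of this sketch; real splice sites are recognized by the spliceosome, not supplied as coordinates.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the remaining exons.

    pre_mrna: RNA sequence as a string (5' to 3').
    introns: list of (start, end) index pairs, half-open and
        non-overlapping, marking the regions to remove.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 18)]))
```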
Although most RNA splicing occurs after the complete synthesis and end-capping of the pre-mRNA, transcripts with many exons can be spliced co-transcriptionally.
The splicing reaction is catalyzed by a large ribonucleoprotein complex called the spliceosome, assembled from proteins and small nuclear RNA molecules that recognize splice sites in the pre-mRNA sequence. Many pre-mRNAs, including those encoding antibodies, can be spliced in multiple ways to produce different mature mRNAs that encode different protein sequences. This process is known as alternative splicing, and allows production of a large variety of proteins from a limited amount of DNA.
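To show how one pre-mRNA can yield several mature mRNAs, the sketch below enumerates the isoforms produced by including or skipping optional exons. Exon skipping is only one mode of alternative splicing, and the function name `isoforms`, the exon sequences, and the notion of a set of "optional" exon indices are hypothetical choices made for illustration.

```python
from itertools import combinations


def isoforms(exons, optional):
    """List mature mRNA sequences obtained by skipping optional exons.

    exons: exon sequences in 5' to 3' order.
    optional: indices of exons that may be skipped; all other exons
        are always included.
    """
    optional = sorted(optional)
    results = []
    # Try every subset of the optional exons as the set to skip.
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            kept = [seq for i, seq in enumerate(exons) if i not in skipped]
            results.append("".join(kept))
    return results


for mrna in isoforms(["AUGGCC", "GAUGAU", "UUUUAA"], {1}):
    print(mrna)
```

With n optional exons this yields 2**n isoforms, which gives a sense of how a limited amount of DNA can encode a large variety of proteins.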
Histone mRNA processing
Histones H2A, H2B, H3 and H4 form the core of a nucleosome and thus are called core histones. Core histone mRNAs are processed differently because a typical histone mRNA lacks several features of other eukaryotic mRNAs, such as a poly(A) tail and introns. Thus, such mRNAs do not undergo splicing and their 3' processing is done independently of most cleavage and polyadenylation factors. Core histone mRNAs have a special stem-loop structure at the 3' end that is recognized by a stem-loop binding protein, and a downstream sequence, called the histone downstream element (HDE), that recruits U7 snRNA. Cleavage and polyadenylation specificity factor 73 cuts the mRNA between the stem-loop and the HDE.
Histone variants, such as H2A.Z or H3.3, however, have introns and are processed as normal mRNAs, including splicing and polyadenylation.
See also
* Post-translational modification
* RNA editing
* RNA-Seq