Post-transcriptional modification or co-transcriptional modification is a set of biological processes common to most eukaryotic cells by which an RNA primary transcript is chemically altered following transcription from a gene to produce a mature, functional RNA molecule that can then leave the nucleus and perform any of a variety of different functions in the cell. There are many types of post-transcriptional modifications, achieved through a diverse class of molecular mechanisms.
One example is the conversion of precursor messenger RNA transcripts into mature messenger RNA that is subsequently capable of being translated into protein. This process includes three major steps that significantly modify the chemical structure of the RNA molecule: the addition of a 5' cap, the addition of a 3' polyadenylated tail, and RNA splicing. Such processing is vital for the correct translation of eukaryotic genomes because the initial precursor mRNA produced by transcription often contains both exons (coding sequences) and introns (non-coding sequences); splicing removes the introns and links the exons directly, while the cap and tail facilitate the transport of the mRNA to a ribosome and protect it from molecular degradation.
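The three steps described above can be illustrated with a toy string model. The following is only a minimal sketch: the names `Transcript` and `process_pre_mrna`, the symbolic cap marker, the intron coordinates, and the default tail length are all assumptions made for illustration, not a biochemical simulation.

```python
from dataclasses import dataclass

CAP = "m7G"  # symbolic marker for the 5' cap (hypothetical representation)


@dataclass(frozen=True)
class Transcript:
    cap: str   # marker recorded at the 5' end
    body: str  # spliced exon sequence
    tail: str  # poly(A) tail at the 3' end


def process_pre_mrna(pre_mrna: str, introns: list, tail_length: int = 200) -> Transcript:
    """Return a toy mature mRNA.

    `introns` is a sorted, non-overlapping list of (start, end) index pairs,
    half-open as in Python slicing. The coordinates are supplied by the caller;
    this sketch does not try to detect splice sites.
    """
    exons = []
    position = 0
    for start, end in introns:
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return Transcript(cap=CAP, body="".join(exons), tail="A" * tail_length)
```

For example, calling `process_pre_mrna` on a sequence with one intron at indices 6 to 17 returns a `Transcript` whose `body` is the two flanking exons joined directly.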
Post-transcriptional modifications may also occur during the processing of other transcripts which ultimately become transfer RNA, ribosomal RNA, or any of the other types of RNA used by the cell.
mRNA processing
5' processing
Capping
Capping of the pre-mRNA involves the addition of 7-methylguanosine (m7G) to the 5' end. To achieve this, the terminal 5' phosphate must first be removed, which is done with the aid of a phosphatase enzyme and produces a diphosphate 5' end. The enzyme guanylyl transferase then catalyses the reaction in which the diphosphate 5' end attacks the alpha phosphorus atom of a GTP molecule, adding the guanine residue in a 5'–5' triphosphate link. The enzyme (guanine-N7-)-methyltransferase ("cap MTase") transfers a methyl group from S-adenosyl methionine to the guanine ring. This type of cap, with just the m7G in position, is called a cap 0 structure. The ribose of the adjacent nucleotide may also be methylated to give a cap 1. Methylation of nucleotides further downstream in the RNA molecule produces cap 2, cap 3 structures and so on. In these cases the methyl groups are added to the 2' OH groups of the ribose sugar.
The cap protects the 5' end of the primary RNA transcript from attack by ribonucleases that have specificity to the 3'–5' phosphodiester bonds.
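The cap 0, cap 1, cap 2 naming described above amounts to counting ribose methylations next to the m7G. A minimal sketch of that naming rule follows; the function name `cap_type` and the boolean encoding of methylation state are hypothetical, and counting only consecutive methylations from the 5' end is a simplifying assumption.

```python
def cap_type(has_m7g: bool, ribose_2o_methylated: list) -> str:
    """Name the cap structure of a toy transcript.

    `has_m7g` says whether 7-methylguanosine is attached at the 5' end.
    `ribose_2o_methylated` lists, from 5' to 3', whether the 2' OH of each
    of the first transcribed nucleotides carries a methyl group.
    """
    if not has_m7g:
        return "uncapped"
    count = 0
    for methylated in ribose_2o_methylated:
        if not methylated:
            break  # stop at the first unmethylated ribose (assumption)
        count += 1
    return f"cap {count}"
```

With only the m7G present the function reports a cap 0; each additional methylated ribose adjacent to the cap raises the number by one.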
3' processing
Cleavage and polyadenylation
The pre-mRNA processing at the 3' end of the RNA molecule involves cleavage of its 3' end and then the addition of about 200 to 250 adenine residues to form a poly(A) tail. The cleavage and adenylation reactions occur primarily if a polyadenylation signal sequence (5'-AAUAAA-3') is located near the 3' end of the pre-mRNA molecule, followed by another sequence, usually 5'-CA-3', that is the site of cleavage. A GU-rich sequence is also usually present further downstream on the pre-mRNA molecule. More recently, it has been demonstrated that alternate signal sequences such as UGUA upstream of the cleavage site can also direct cleavage and polyadenylation in the absence of the AAUAAA signal.
These two signals are not mutually independent and often coexist. After the synthesis of the sequence elements, several multi-subunit proteins are transferred to the RNA molecule. The transfer of these sequence-specific binding proteins, cleavage and polyadenylation specificity factor (CPSF), Cleavage Factor I (CF I) and cleavage stimulation factor (CStF), occurs from RNA Polymerase II. The three factors bind to the sequence elements. The AAUAAA signal is directly bound by CPSF. For UGUA-dependent processing sites, binding of the multi-protein complex is done by Cleavage Factor I (CF I). The resultant protein complex contains additional cleavage factors and the enzyme polyadenylate polymerase (PAP). This complex cleaves the RNA between the polyadenylation sequence and the GU-rich sequence at the cleavage site marked by the 5'-CA-3' sequence. Poly(A) polymerase then adds about 200 to 250 adenine units to the new 3' end of the RNA molecule using ATP as a precursor. As the poly(A) tail is synthesized, it binds multiple copies of poly(A)-binding protein, which protects the 3' end from ribonuclease digestion by enzymes including the CCR4-Not complex.
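The signal-then-cleave-then-extend order of events can be sketched on a plain RNA string. This is an illustrative model only: the function name `cleave_and_polyadenylate`, the 30-nucleotide search window, cutting immediately after the CA dinucleotide, and using the last AAUAAA in the sequence are all assumptions, and the GU-rich element and UGUA-dependent sites are ignored.

```python
SIGNAL = "AAUAAA"  # canonical polyadenylation signal named in the text


def cleave_and_polyadenylate(pre_mrna: str, tail_length: int = 200, max_gap: int = 30):
    """Cut a toy pre-mRNA after the first CA downstream of the signal and add a tail.

    Returns the processed sequence, or None when no AAUAAA signal or no CA
    within `max_gap` nucleotides downstream of it is found.
    """
    signal_start = pre_mrna.rfind(SIGNAL)  # use the signal closest to the 3' end
    if signal_start == -1:
        return None
    search_from = signal_start + len(SIGNAL)
    window = pre_mrna[search_from:search_from + max_gap]
    ca_offset = window.find("CA")
    if ca_offset == -1:
        return None
    cut = search_from + ca_offset + 2      # cleave just after the CA (assumption)
    return pre_mrna[:cut] + "A" * tail_length
```

Everything downstream of the cut, including any GU-rich region, is discarded and replaced by the tail, which mirrors the order of events described above.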
Intron splicing
RNA splicing is the process by which introns, regions of RNA that do not code for proteins, are removed from the pre-mRNA and the remaining exons are connected to re-form a single continuous molecule. Exons are sections of mRNA which become "expressed" or translated into a protein. They are the coding portions of an mRNA molecule.
Although most RNA splicing occurs after the complete synthesis and end-capping of the pre-mRNA, transcripts with many exons can be spliced co-transcriptionally.
The splicing reaction is catalyzed by a large ribonucleoprotein complex called the spliceosome, assembled from proteins and small nuclear RNA molecules that recognize splice sites in the pre-mRNA sequence. Many pre-mRNAs, including those encoding antibodies, can be spliced in multiple ways to produce different mature mRNAs that encode different protein sequences. This process is known as alternative splicing, and allows production of a large variety of proteins from a limited amount of DNA.
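One way to see how a limited amount of DNA can yield many mRNAs is to enumerate exon combinations. The sketch below models only exon skipping, under the assumption that the first and last exons are always retained and that exon order never changes; the function name `exon_skipping_isoforms` is hypothetical, and real alternative splicing is regulated and includes other modes not modelled here.

```python
from itertools import combinations


def exon_skipping_isoforms(exons: list) -> list:
    """List every mature sequence obtainable by skipping internal exons.

    The first and last exons are always kept; each internal exon is either
    included or skipped, and the retained exons stay in their original order.
    """
    if len(exons) < 2:
        return ["".join(exons)]
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        # combinations() preserves the original order of the internal exons
        for kept in combinations(internal, kept_count):
            isoforms.append(first + "".join(kept) + last)
    return isoforms
```

With n internal exons this toy model yields 2^n isoforms, so a transcript with four exons (two internal) gives four possible mature sequences.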
Histone mRNA processing
Histones H2A, H2B, H3 and H4 form the core of a nucleosome and thus are called core histones. Processing of core histone mRNAs is done differently because a typical histone mRNA lacks several features of other eukaryotic mRNAs, such as a poly(A) tail and introns. Thus, such mRNAs do not undergo splicing, and their 3' processing is done independently of most cleavage and polyadenylation factors. Core histone mRNAs have a special stem-loop structure at the 3' end that is recognized by a stem–loop binding protein, and a downstream sequence, called the histone downstream element (HDE), that recruits U7 snRNA. Cleavage and polyadenylation specificity factor 73 cuts the mRNA between the stem-loop and the HDE.
Histone variants, such as H2A.Z or H3.3, however, have introns and are processed as normal mRNAs, including splicing and polyadenylation.
See also
* Post-translational modification
* RNA editing
* RNA-Seq