In molecular biology, small nucleolar RNAs (snoRNAs) are a class of small RNA molecules that primarily guide chemical modifications of other RNAs, mainly ribosomal RNAs, transfer RNAs and small nuclear RNAs. There are two main classes of snoRNA, the C/D box snoRNAs, which are associated with methylation, and the H/ACA box snoRNAs, which are associated with pseudouridylation.
SnoRNAs are commonly referred to as guide RNAs but should not be confused with the guide RNAs that direct RNA editing in trypanosomes.
snoRNA guided modifications
After transcription, nascent rRNA molecules (termed pre-rRNA) undergo a series of processing steps to generate the mature rRNA molecule. Prior to cleavage by exo- and endonucleases, the pre-rRNA undergoes a complex pattern of nucleoside modifications. These include methylations and pseudouridylations, guided by snoRNAs.
*Methylation is the attachment or substitution of a methyl group onto various substrates. The rRNA of humans contains approximately 115 methyl group modifications. The majority of these are 2′-O-ribose methylations (where the methyl group is attached to the ribose group).
*Pseudouridylation is the conversion (isomerisation) of the nucleoside uridine to a different isomeric form, pseudouridine (Ψ). This modification consists of a 180° rotation of the uridine base around its glycosyl bond to the ribose of the RNA backbone. After this rotation, the nitrogenous base contributes a carbon atom to the glycosyl bond instead of the usual nitrogen atom. The beneficial aspect of this modification is the additional hydrogen-bond donor available on the base. While uridine makes two hydrogen bonds with its Watson-Crick base pair, adenine, pseudouridine is capable of making three hydrogen bonds. When pseudouridine is base-paired with adenine, it can therefore make one further hydrogen bond, allowing the complexity of the mature rRNA structure to take form. The free hydrogen-bond donor often forms a bond with a base that is distant from itself, creating the tertiary structure that rRNA must have to be functional. Mature human rRNAs contain approximately 95 Ψ modifications.
Each snoRNA molecule acts as a guide for only one (or two) individual modifications in a target RNA. In order to carry out modification, each snoRNA associates with at least four core proteins in an RNA/protein complex referred to as a small nucleolar ribonucleoprotein particle (snoRNP). The proteins associated with each RNA depend on the type of snoRNA molecule (see snoRNA guide families below). The snoRNA molecule contains an antisense element (a stretch of 10–20 nucleotides), which is complementary to the sequence surrounding the base (nucleotide) targeted for modification in the pre-RNA molecule. This enables the snoRNP to recognise and bind to the target RNA. Once the snoRNP has bound to the target site, the associated proteins are in the correct physical location to catalyse the chemical modification of the target base.
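The recognition step described above can be illustrated with a minimal Python sketch. It assumes perfect Watson-Crick pairing only (no G-U wobble pairs or mismatches), and the function names and both sequences are hypothetical, invented purely for illustration rather than taken from any real snoRNA or rRNA.

```python
from typing import List

# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_target_sites(antisense: str, target: str) -> List[int]:
    """Return 0-based start positions in `target` that can pair with `antisense`."""
    site = reverse_complement(antisense)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping hits
    return positions


# Hypothetical sequences, invented for illustration only.
antisense_element = "UCGGAUCCAUGC"           # 12 nt, within the 10-20 nt range
pre_rrna_fragment = "GGAAGCAUGGAUCCGAUUCC"

print(find_target_sites(antisense_element, pre_rrna_fragment))
```

Exact string matching is the simplest possible model of duplex formation; a more realistic search would tolerate wobble pairs and a small number of mismatches.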
snoRNA guide families
The two different types of rRNA modification (methylation and pseudouridylation) are directed by two different families of snoRNAs. These families of snoRNAs are referred to as antisense C/D box and H/ACA box snoRNAs based on the presence of conserved sequence motifs in the snoRNA. There are exceptions, but as a general rule C/D box members guide methylation and H/ACA members guide pseudouridylation. The members of each family may vary in biogenesis, structure, and function, but each family is classified by the following generalised characteristics. For more detail, see review.
SnoRNAs are classified under small nuclear RNA in MeSH. The HGNC, in collaboration with snoRNABase and experts in the field, has approved unique names for human genes that encode snoRNAs.
C/D box
C/D box snoRNAs contain two short conserved sequence motifs, C (RUGAUGA) and D (CUGA), located near the 5′ and 3′ ends of the snoRNA, respectively. Short regions (~5 nucleotides) located upstream of the C box and downstream of the D box are usually complementary to each other and form a stem-box structure, which brings the C and D box motifs into close proximity. This stem-box structure has been shown to be essential for correct snoRNA synthesis and nucleolar localization. Many C/D box snoRNAs also contain an additional, less well conserved copy of the C and D motifs (referred to as C' and D') located in the central portion of the snoRNA molecule. A conserved region of 10–21 nucleotides upstream of the D box is complementary to the methylation site of the target RNA and enables the snoRNA to form an RNA duplex with the target RNA. The nucleotide to be modified in the target RNA is usually located at the 5th position upstream from the D box (or D' box). C/D box snoRNAs associate with four evolutionarily conserved and essential proteins, fibrillarin (Nop1p), NOP56, NOP58, and SNU13 (15.5-kD protein in eukaryotes; its archaeal homolog is L7Ae), which make up the core C/D box snoRNP.
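The motif layout described above lends itself to a simple scan. The following Python sketch looks for the first C box (RUGAUGA, with R meaning A or G) and the last D box (CUGA), takes a fixed-length guide region immediately upstream of the D box, and reports the target base that would pair with the 5th nucleotide upstream of the D box. The function name `scan_cd_box`, the default guide length of 12, and the example sequence are all assumptions made for illustration; the sketch ignores C'/D' boxes and does not check the terminal stem.

```python
import re
from typing import Dict, Optional, Union

C_BOX = re.compile(r"[AG]UGAUGA")   # consensus RUGAUGA, R = A or G
D_BOX = "CUGA"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def scan_cd_box(snorna: str, guide_len: int = 12) -> Optional[Dict[str, Union[int, str]]]:
    """Locate C and D boxes and derive the expected target site.

    Assumes guide_len >= 5 and that the guide lies directly upstream of the D box.
    Returns None when the motifs are missing or too close together.
    """
    seq = snorna.upper()
    c_match = C_BOX.search(seq)      # first C box, nearest the 5' end
    d_start = seq.rfind(D_BOX)       # last D box, nearest the 3' end (-1 if absent)
    if c_match is None or d_start < c_match.end() + guide_len:
        return None
    guide = seq[d_start - guide_len:d_start]
    # The snoRNA base 5 nt upstream of the D box pairs with the nucleotide to be methylated.
    paired_base = seq[d_start - 5]
    return {
        "c_box_start": c_match.start(),
        "d_box_start": d_start,
        "guide": guide,
        "target_site": reverse_complement(guide),
        "target_base": COMPLEMENT[paired_base],
    }


# Hypothetical snoRNA: stem + C box + spacer + guide + D box + stem (not a real sequence).
example_snorna = "GGCAC" "AUGAUGA" "CCUUAACC" "UCGGAUCCAUGC" "CUGA" "GUGCC"
print(scan_cd_box(example_snorna))
```

Taking the first C box and the last D box is a deliberate simplification that follows the "near the 5′ and 3′ ends" description; real snoRNA annotation tools use scoring models rather than exact motif matches.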
There exists a eukaryotic C/D box snoRNA (snoRNA U3) that has not been shown to guide 2′-''O''-methylation. Instead, it functions in rRNA processing by directing pre-rRNA cleavage.
H/ACA box
H/ACA box snoRNAs have a common secondary structure consisting of two hairpins and two single-stranded regions, termed a hairpin-hinge-hairpin-tail structure.
H/ACA snoRNAs also contain conserved sequence motifs known as the H box (consensus ANANNA) and the ACA box (ACA). Both motifs are usually located in the single-stranded regions of the secondary structure. The H motif is located in the hinge and the ACA motif is located in the tail region, 3 nucleotides from the 3′ end of the sequence. The hairpin regions contain internal bulges known as recognition loops, in which the antisense guide sequences (bases complementary to the target sequence) are located. These guide sequences essentially mark the location of the uridine on the target rRNA that is going to be modified. This recognition sequence is bipartite (constructed from the two different arms of the loop region) and forms complex pseudo-knots with the target RNA. H/ACA box snoRNAs associate with four evolutionarily conserved and essential proteins, dyskerin (Cbf5p), GAR1, NHP2, and NOP10, which make up the core of the H/ACA box snoRNP.
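The two sequence motifs described above can be checked with a short Python sketch. It tests whether ACA sits exactly 3 nucleotides from the 3′ end and lists every position upstream of it that matches the H box consensus ANANNA. The function name `check_haca_motifs` and the example sequence are hypothetical; the sketch does not fold the RNA, so it cannot confirm that a candidate H box actually lies in the hinge.

```python
import re
from typing import List, Tuple

# H box consensus ANANNA (N = any base); the lookahead lets overlapping candidates be reported.
H_BOX = re.compile(r"(?=A[ACGU]A[ACGU]{2}A)")


def check_haca_motifs(snorna: str) -> Tuple[bool, List[int]]:
    """Return (ACA box found 3 nt from the 3' end, start positions of H box candidates)."""
    seq = snorna.upper()
    # ACA followed by exactly three more nucleotides occupies seq[-6:-3].
    has_aca = len(seq) >= 6 and seq[-6:-3] == "ACA"
    # Only search upstream of the ACA box and the 3' tail.
    h_candidates = [m.start() for m in H_BOX.finditer(seq[:-6])]
    return has_aca, h_candidates


# Hypothetical hairpin-hinge-hairpin-tail sequence, invented for illustration only.
example_haca = "GGGCUUCGGCCC" "AGAUUA" "GCGCUUCGGCGC" "UU" "ACA" "UGU"
print(check_haca_motifs(example_haca))
```

Because ANANNA is a loose consensus, real sequences will usually produce several candidates; a secondary-structure prediction would be needed to pick the one in the hinge.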
Dyskerin is likely the catalytic component of the ribonucleoprotein (RNP) complex because it possesses several conserved pseudouridine synthase sequences, and is closely related to the pseudouridine synthase that modifies uridine in tRNA. In lower eukaryotic cells such as trypanosomes, similar RNAs exist in the form of a single hairpin structure with an AGA box instead of an ACA box at the 3′ end of the RNA. Like trypanosomes, ''Entamoeba histolytica'' has a mixed population of single-hairpin and double-hairpin H/ACA box snoRNAs. Processing of the double-hairpin H/ACA box snoRNA into single-hairpin snoRNAs has been reported; however, unlike in trypanosomes, these snoRNAs have a regular ACA motif at the 3′ tail.
The RNA component of human telomerase (hTERC) contains an H/ACA domain for pre-RNP formation and nucleolar localization of the telomerase RNP itself. The H/ACA snoRNP has been implicated in the rare genetic disease dyskeratosis congenita (DKC) due to its affiliation with human telomerase. Mutations in the protein component of the H/ACA snoRNP result in a reduction in physiological TERC levels. This has been strongly correlated with the pathology behind DKC, which seems to be primarily a disease of poor telomere maintenance.
Composite H/ACA and C/D box
An unusual guide snoRNA, U85, that functions in both 2′-O-ribose methylation and pseudouridylation of small nuclear RNA (snRNA) U5 has been identified. This composite snoRNA contains both C/D and H/ACA box domains and associates with the proteins specific to each class of snoRNA (fibrillarin and Gar1p, respectively). More composite snoRNAs have now been characterised.
These composite snoRNAs have been found to accumulate in a subnuclear organelle called the Cajal body and are referred to as small Cajal body-specific RNAs (scaRNAs). This is in contrast to the majority of C/D box or H/ACA box snoRNAs, which localise to the nucleolus. These Cajal body-specific RNAs are proposed to be involved in the modification of RNA polymerase II transcribed spliceosomal RNAs U1, U2, U4, U5 and U12. Not all snoRNAs that have been localised to Cajal bodies are composite C/D and H/ACA box snoRNAs.
Orphan snoRNAs
The targets for newly identified snoRNAs are predicted on the basis of sequence complementarity between putative target RNAs and the antisense elements or recognition loops in the snoRNA sequence. However, there are increasing numbers of 'orphan' guides without any known RNA targets, which suggests that there might be more proteins or transcripts involved in rRNA than previously thought and/or that some snoRNAs have different functions not concerning rRNA. There is evidence that some of these orphan snoRNAs regulate alternatively spliced transcripts. For example, it appears that the C/D box snoRNA SNORD115 regulates the alternative splicing of the serotonin 2C receptor mRNA via a conserved region of complementarity.
Another C/D box snoRNA, SNORD116, which resides in the same cluster as SNORD115, has been predicted to have 23 possible targets within protein-coding genes using a bioinformatic approach. Of these, a large fraction were found to be alternatively spliced, suggesting a role of SNORD116 in the regulation of alternative splicing.
Target modifications
The precise effect of the methylation and pseudouridylation modifications on the function of the mature RNAs is not yet known. The modifications do not appear to be essential but are known to subtly enhance the RNA folding and interaction with ribosomal proteins. In support of their importance, target site modifications are exclusively located within conserved and functionally important domains of the mature RNA and are commonly conserved among distant eukaryotes. A novel method, Nm-REP-seq, was developed for enriching 2'-O-Methylations guided by C/D snoRNAs by using RNA exoribonuclease (Mycoplasma genitalium RNase R, MgR) and periodate oxidation reactivity to eliminate 2'-hydroxylated (2'-OH) nucleosides.
#2′-O-methylated ribose causes an increase in the 3′-endo conformation
#Pseudouridine (psi/Ψ) adds another option for H-bonding.
#Heavily methylated RNA is protected from hydrolysis. rRNA acts as a ribozyme by catalyzing its own hydrolysis and splicing.
Genomic organisation
SnoRNAs are located diversely in the genome. The majority of vertebrate snoRNA genes are encoded in the introns of genes encoding proteins involved in ribosome synthesis or translation, and are synthesized by RNA polymerase II. SnoRNAs have also been shown to be located in intergenic regions, ORFs of protein-coding genes, and UTRs. SnoRNAs can also be transcribed from their own promoters by RNA polymerase II or III.
Imprinted loci
In the human genome, there are at least two examples where C/D box snoRNAs are found in tandem repeats within imprinted loci. These two loci (14q32 on chromosome 14 and 15q11q13 on chromosome 15) have been extensively characterised, and in both regions multiple snoRNAs have been found located within introns in clusters of closely related copies.
In 15q11q13, six different snoRNAs have been identified: SNORD64, SNORD107, SNORD108, SNORD109 (two copies), SNORD116 (29 copies) and SNORD115 (48 copies). Loss of the 29 copies of SNORD116 (HBII-85) from this region has been identified as a cause of Prader-Willi syndrome, whereas gain of additional copies of SNORD115 has been linked to autism.
Region 14q32 contains repeats of two snoRNAs, SNORD113 (9 copies) and SNORD114 (31 copies), within the introns of a tissue-specific ncRNA transcript (MEG8). The 14q32 domain has been shown to share common genomic features with the imprinted 15q11-q13 loci, and a possible role for tandem repeats of C/D box snoRNAs in the evolution or mechanism of imprinted loci has been suggested.
Other functions
snoRNAs can function as miRNAs. It has been shown that human ACA45 is a ''bona fide'' snoRNA that can be processed into a 21-nucleotide-long mature miRNA by the RNase III family endoribonuclease Dicer. This snoRNA product has previously been identified as mmu-miR-1839 and was shown to be processed independently from the other miRNA-generating endoribonuclease, Drosha. Bioinformatic analyses have revealed that putatively snoRNA-derived, miRNA-like fragments occur in different organisms.
Recently, it has been found that snoRNAs can have functions not related to rRNA. One such function is the regulation of alternative splicing of the ''trans'' gene transcript, which is carried out by the snoRNA HBII-52 (also known as SNORD115).
In November 2012, Schubert et al. revealed that specific RNAs control chromatin compaction and accessibility in ''Drosophila'' cells.
External links
human snoRNA atlas from small RNA sequencing data
plant snoRNA database
snoRNAbase: human H/ACA and C/D box snoRNA database
snoRNP Database
The yeast snoRNA database
human snoRNA expression pattern
Rfam page for C/D box snoRNAs
Rfam page for H/ACA box snoRNAs
Rfam page for scaRNA snoRNAs