Bisulfite sequencing (also known as bisulphite sequencing) is the use of bisulfite treatment of DNA before routine sequencing to determine the pattern of methylation. DNA methylation was the first discovered epigenetic mark, and remains the most studied. In animals it predominantly involves the addition of a methyl group to the carbon-5 position of cytosine residues of the dinucleotide CpG, and is implicated in repression of transcriptional activity. Treatment of DNA with bisulfite converts cytosine residues to uracil, but leaves 5-methylcytosine residues unaffected. Therefore, DNA that has been treated with bisulfite retains only methylated cytosines. Thus, bisulfite treatment introduces specific changes in the DNA sequence that depend on the methylation status of individual cytosine residues, yielding single-nucleotide resolution information about the methylation status of a segment of DNA. Various analyses can be performed on the altered sequence to retrieve this information. The objective of this analysis is therefore reduced to differentiating between single nucleotide polymorphisms (cytosines and thymidine) resulting from bisulfite conversion (Figure 1).
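The conversion rule described above can be illustrated with a minimal Python sketch. It assumes a single strand written 5' to 3' and a known set of methylated positions; the function name `bisulfite_convert` and the example sequence are hypothetical and chosen only for illustration.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Simulate bisulfite treatment followed by PCR on one DNA strand.

    seq: DNA sequence (5'->3') of the treated strand.
    methylated_positions: 0-based indices of cytosines that carry a methyl group.
    Unmethylated C is read as T (C -> U by bisulfite, U -> T after PCR);
    5-methylcytosine is left as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical 8-base example: only the cytosine at index 1 is methylated.
reference = "ACGTCCGA"
print(bisulfite_convert(reference, methylated_positions=[1]))  # ACGTTTGA
```

Every C that survives in the output marks a methylated position, which is why the downstream analysis reduces to telling C from T at each cytosine of the reference.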


Methods

Bisulfite sequencing applies routine sequencing methods to bisulfite-treated genomic DNA to determine methylation status at CpG dinucleotides. Other non-sequencing strategies are also employed to interrogate the methylation at specific loci or at a genome-wide level. All strategies assume that bisulfite-induced conversion of unmethylated cytosines to uracil is complete, and this serves as the basis of all subsequent techniques. Ideally, the method used would determine the methylation status separately for each allele. Alternative methods to bisulfite sequencing include Combined Bisulphite Restriction Analysis and methylated DNA immunoprecipitation (MeDIP). Methodologies to analyze bisulfite-treated DNA are continuously being developed, and numerous review articles have been written to summarize them. The methodologies can be generally divided into strategies based on methylation-specific PCR (MSP) (Figure 4), and strategies employing polymerase chain reaction (PCR) performed under non-methylation-specific conditions (Figure 3). Microarray-based methods also use PCR under non-methylation-specific conditions.


Non-methylation-specific PCR based methods


Direct sequencing

The first reported method of methylation analysis using bisulfite-treated DNA utilized PCR and standard dideoxynucleotide DNA sequencing to directly determine the nucleotides resistant to bisulfite conversion. Primers are designed to be strand-specific as well as bisulfite-specific (i.e., primers containing non-CpG cytosines such that they are not complementary to non-bisulfite-treated DNA), flanking (but not involving) the methylation site of interest. Therefore, the PCR will amplify both methylated and unmethylated sequences, in contrast to methylation-specific PCR. All sites of unmethylated cytosines are displayed as thymines in the resulting amplified sequence of the sense strand, and as adenines in the amplified antisense strand. By incorporating high-throughput sequencing adaptors into the PCR primers, PCR products can be sequenced with massively parallel sequencing. Alternatively, and more labour-intensively, PCR products can be cloned and sequenced. Nested PCR methods can be used to enhance the product for sequencing. All subsequent DNA methylation analysis techniques using bisulfite-treated DNA are based on this report by Frommer et al. (Figure 2). Although most other modalities are not true sequencing-based techniques, the term "bisulfite sequencing" is often used to describe bisulfite-conversion DNA methylation analysis techniques in general.
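As a rough illustration of how cloned or massively parallel reads are turned into per-site methylation levels, the following Python sketch counts retained C versus converted T at each CpG of the sense strand. It assumes the reads are already aligned to the reference, have the same length, and contain no insertions or deletions; the function name `site_methylation` and the sequences are hypothetical.

```python
def site_methylation(reference, reads):
    """Fraction of reads that retain C at each CpG cytosine of the reference.

    reference: unconverted sense-strand sequence.
    reads: bisulfite-converted sense-strand reads, aligned to the reference
           and of the same length (assumption of this sketch).
    Returns {0-based position of the CpG cytosine: methylated fraction}.
    """
    ref = reference.upper()
    cpg_sites = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    result = {}
    for pos in cpg_sites:
        c_count = sum(1 for read in reads if read[pos].upper() == "C")
        t_count = sum(1 for read in reads if read[pos].upper() == "T")
        if c_count + t_count:  # skip sites with no informative reads
            result[pos] = c_count / (c_count + t_count)
    return result


reference = "TACGATCGT"          # CpG cytosines at positions 2 and 6
reads = [
    "TACGATTGT",                 # site 2 methylated, site 6 unmethylated
    "TACGATTGT",
    "TATGATTGT",                 # both sites unmethylated
    "TACGATCGT",                 # both sites methylated
]
print(site_methylation(reference, reads))  # {2: 0.75, 6: 0.25}
```

For reads derived from the antisense strand, the same counting would be done with G retained versus A, since unmethylated cytosines appear there as adenines.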


Pyrosequencing

Pyrosequencing has also been used to analyze bisulfite-treated DNA without using methylation-specific PCR. Following PCR amplification of the region of interest, pyrosequencing is used to determine the bisulfite-converted sequence of specific CpG sites in the region. The ratio of C-to-T at individual sites can be determined quantitatively based on the amount of C and T incorporation during the sequence extension. The main limitation of this method is the cost of the technology. However, pyrosequencing lends itself well to high-throughput screening methods. A further improvement to this technique was described by Wong et al., which uses allele-specific primers that incorporate single-nucleotide polymorphisms into the sequence of the sequencing primer, thus allowing for separate analysis of maternal and paternal alleles. This technique is particularly useful for genomic imprinting analysis.


Methylation-sensitive single-strand conformation analysis (MS-SSCA)

This method is based on the single-strand conformation polymorphism analysis (SSCA) method developed for single-nucleotide polymorphism (SNP) analysis. SSCA differentiates between single-stranded DNA fragments of identical size but distinct sequence based on differential migration in non-denaturing electrophoresis. In MS-SSCA, this is used to distinguish between bisulfite-treated, PCR-amplified regions containing the CpG sites of interest. Although SSCA lacks sensitivity when only a single nucleotide difference is present, bisulfite treatment frequently introduces a number of C-to-T conversions in most regions of interest, and the resulting sensitivity approaches 100%. MS-SSCA also provides semi-quantitative analysis of the degree of DNA methylation based on the ratio of band intensities. However, this method is designed to assess all CpG sites as a whole in the region of interest rather than individual methylation sites.


High resolution melting analysis (HRM)

A further method to differentiate converted from unconverted bisulfite-treated DNA is high-resolution melting analysis (HRM), a quantitative PCR-based technique initially designed to distinguish SNPs. The PCR amplicons are analyzed directly by temperature ramping and the resulting liberation of an intercalating fluorescent dye during melting. The degree of methylation, as represented by the C-to-T content in the amplicon, determines the rapidity of melting and consequent release of the dye. This method allows direct quantitation in a single-tube assay, but assesses methylation in the amplified region as a whole rather than at specific CpG sites.
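The link between methylation and melting behaviour can be made concrete by comparing the GC content of the amplicon obtained from a fully methylated template with that from a fully unmethylated one. The Python sketch below is only an illustration under simplifying assumptions: methylation occurs only at CpG sites, conversion is complete, and GC fraction is used as a crude stand-in for melting stability (no melting temperature is calculated). The function names and the amplicon sequence are hypothetical.

```python
def convert_amplicon(seq, cpg_methylated):
    """Bisulfite-convert one strand, keeping CpG cytosines only if methylated.

    Non-CpG cytosines are always converted to T in this sketch.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            is_cpg = seq[i + 1:i + 2] == "G"
            out.append("C" if (is_cpg and cpg_methylated) else "T")
        else:
            out.append(base)
    return "".join(out)


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


amplicon = "ATCGGCCGTACGTTAGCGCA"  # hypothetical 20-base region with 4 CpG sites
print(gc_fraction(convert_amplicon(amplicon, cpg_methylated=True)))   # 0.5
print(gc_fraction(convert_amplicon(amplicon, cpg_methylated=False)))  # 0.3
```

The methylated template yields the more GC-rich amplicon, which is expected to melt at a higher temperature; HRM exploits this difference across the whole amplicon rather than at any single CpG.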


Methylation-sensitive single-nucleotide primer extension (MS-SnuPE)

MS-SnuPE employs the primer extension method initially designed for analyzing single-nucleotide polymorphisms. DNA is bisulfite-converted, and bisulfite-specific primers are annealed to the sequence up to the base pair immediately before the CpG of interest. The primer is allowed to extend one base pair into the C (or T) using DNA polymerase and chain-terminating dideoxynucleotides, and the ratio of C to T is determined quantitatively. A number of methods can be used to determine this C:T ratio. Initially, MS-SnuPE relied on radioactive ddNTPs as the reporter of the primer extension. Fluorescence-based methods or pyrosequencing can also be used. Matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF) mass spectrometry can likewise be used to differentiate between the two polymorphic primer extension products, in essence based on the GOOD assay designed for SNP genotyping. Ion pair reverse-phase high-performance liquid chromatography (IP-RP-HPLC) has also been used to distinguish primer extension products.


Base-specific cleavage/MALDI-TOF

A method described by Ehrich et al. further takes advantage of bisulfite conversions by adding a base-specific cleavage step to enhance the information gained from the nucleotide changes. The region of interest is first transcribed in vitro into RNA (by adding an RNA polymerase promoter site to the PCR primer in the initial amplification), after which RNase A can be used to cleave the RNA transcript at base-specific sites. As RNase A cleaves RNA specifically at cytosine and uracil ribonucleotides, base-specificity is achieved by incorporating cleavage-resistant dTTP when cytosine-specific (C-specific) cleavage is desired, and incorporating dCTP when uracil-specific (U-specific) cleavage is desired. The cleaved fragments can then be analyzed by MALDI-TOF. Bisulfite treatment results in either introduction/removal of cleavage sites by C-to-U conversions or a shift in fragment mass by G-to-A conversions in the amplified reverse strand. C-specific cleavage will cut specifically at all methylated CpG sites. By analyzing the sizes of the resulting fragments, it is possible to determine the specific pattern of DNA methylation of CpG sites within the region, rather than determining the extent of methylation of the region as a whole. This method demonstrated efficacy for high-throughput screening, allowing for interrogation of numerous CpG sites in multiple tissues in a cost-efficient manner.
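The idea that methylated CpG sites create C-specific cleavage sites can be sketched in Python as follows. This is a simplified in-silico model, not the published protocol: it treats the bisulfite-converted forward strand as the transcript, cuts immediately 3' of every remaining C, and reports fragment sequences rather than masses. The function name `c_specific_fragments` and the sequences are hypothetical.

```python
def c_specific_fragments(converted_seq):
    """Cut a converted sequence 3' of every C and return the fragments.

    converted_seq: bisulfite-converted sequence in which C remains only at
    methylated cytosines. Positions written as T stand for the
    cleavage-resistant nucleotide and are never cut in this sketch.
    """
    seq = converted_seq.upper()
    fragments = []
    start = 0
    for i, base in enumerate(seq):
        if base == "C":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # trailing piece after the last cut
    return fragments


# Hypothetical region ATCGTACGAT with CpG cytosines at positions 2 and 6.
print(c_specific_fragments("ATCGTACGAT"))  # both methylated: ['ATC', 'GTAC', 'GAT']
print(c_specific_fragments("ATCGTATGAT"))  # first only:      ['ATC', 'GTATGAT']
print(c_specific_fragments("ATTGTATGAT"))  # none methylated: ['ATTGTATGAT']
```

Each methylation pattern produces a distinct set of fragment sizes, which is the information that mass spectrometry recovers in the actual assay.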


Methylation-specific PCR (MSP)

This alternative method of methylation analysis also uses bisulfite-treated DNA but avoids the need to sequence the area of interest. Instead, the primer pairs themselves are designed to be "methylated-specific" by including sequences complementing only unconverted 5-methylcytosines, or, conversely, "unmethylated-specific", complementing thymines converted from unmethylated cytosines. Methylation is determined by the ability of the specific primer to achieve amplification. This method is particularly useful to interrogate CpG islands with possibly high methylation density, as increased numbers of CpG pairs in the primer increase the specificity of the assay. Placing the CpG pair at the 3'-end of the primer also improves the sensitivity. The initial report using MSP described sufficient sensitivity to detect methylation of 0.1% of alleles. In general, MSP and its related protocols are considered to be the most sensitive when interrogating the methylation status at a specific locus.

The MethyLight method is based on MSP, but provides a quantitative analysis using quantitative PCR. Methylated-specific primers are used, and a methylated-specific fluorescence reporter probe is also used that anneals to the amplified region. Alternatively, the primers or probe can be designed without methylation specificity if discrimination is needed between the CpG pairs within the involved sequences. Quantitation is made in reference to a methylated reference DNA. A modification to this protocol to increase the specificity of the PCR for successfully bisulfite-converted DNA (ConLight-MSP) uses an additional probe to bisulfite-unconverted DNA to quantify this non-specific amplification.

Further methodology using MSP-amplified DNA analyzes the products using melting curve analysis (Mc-MSP). This method amplifies bisulfite-converted DNA with both methylated-specific and unmethylated-specific primers, and determines the quantitative ratio of the two products by comparing the differential peaks generated in a melting curve analysis. A high-resolution melting analysis method that uses both quantitative PCR and melting analysis has been introduced, in particular for sensitive detection of low-level methylation.
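The difference between methylated-specific and unmethylated-specific primers can be sketched in Python. The example below derives only the forward primer pair for the top strand, whose sequence matches the bisulfite-converted top strand; reverse primers, melting temperatures, and other design constraints are left out. The function name `msp_forward_primers` and the target sequence are hypothetical.

```python
def msp_forward_primers(target):
    """Return (methylated-specific, unmethylated-specific) forward primers.

    target: unconverted top-strand sequence of the primer binding region.
    Methylated-specific primer: non-CpG C -> T, CpG C kept as C.
    Unmethylated-specific primer: every C -> T.
    A C at the very end of the target is treated as non-CpG because the
    following base is unknown (limitation of this sketch).
    """
    target = target.upper()
    m_primer, u_primer = [], []
    for i, base in enumerate(target):
        if base == "C":
            is_cpg = target[i + 1:i + 2] == "G"
            m_primer.append("C" if is_cpg else "T")
            u_primer.append("T")
        else:
            m_primer.append(base)
            u_primer.append(base)
    return "".join(m_primer), "".join(u_primer)


# Hypothetical target with three CpG sites, one of them at the 3' end.
m_primer, u_primer = msp_forward_primers("TTCGGACCGAACG")
print(m_primer)  # TTCGGATCGAACG
print(u_primer)  # TTTGGATTGAATG
```

The two primers differ only at the CpG cytosines, so the more CpG sites a primer spans, the more mismatches separate the methylated and unmethylated templates.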


Microarray-based methods

Microarray-based methods are a logical extension of the technologies available to analyze bisulfite-treated DNA to allow for genome-wide analysis of methylation. Oligonucleotide microarrays are designed using pairs of oligonucleotide hybridization probes targeting CpG sites of interest. One is complementary to the unaltered methylated sequence, and the other is complementary to the C-to-U-converted unmethylated sequence. The probes are also bisulfite-specific to prevent binding to DNA incompletely converted by bisulfite. The Illumina Methylation Assay is one such assay that applies the bisulfite sequencing technology on a microarray level to generate genome-wide methylation data.
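With one probe for the methylated sequence and one for the unmethylated sequence, a per-site methylation level can be estimated from the two signal intensities. The Python sketch below shows one simple way to do this; the function name, the example intensities, and the optional `offset` parameter (a small constant some analyses add to stabilise the ratio at low intensity) are assumptions of this illustration and not a description of any vendor's software.

```python
def methylation_fraction(methylated_signal, unmethylated_signal, offset=0.0):
    """Estimate the methylated fraction at one CpG from paired probe signals.

    methylated_signal, unmethylated_signal: non-negative probe intensities.
    offset: optional constant added to the denominator (assumed parameter).
    Returns None when there is no signal to form a ratio.
    """
    denominator = methylated_signal + unmethylated_signal + offset
    if denominator <= 0:
        return None
    return methylated_signal / denominator


print(methylation_fraction(300, 100))           # 0.75
print(methylation_fraction(0, 0))               # None
print(methylation_fraction(0, 0, offset=100))   # 0.0
```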


Limitations


5-Hydroxymethylcytosine

Bisulfite sequencing is used widely across mammalian genomes; however, complications have arisen with the discovery of a new mammalian DNA modification, 5-hydroxymethylcytosine. 5-Hydroxymethylcytosine converts to cytosine-5-methylsulfonate upon bisulfite treatment, which then reads as a C when sequenced (Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A. The Behaviour of 5-Hydroxymethylcytosine in Bisulfite Sequencing. PLOS ONE. 2010;5(1):e8888). Therefore, bisulfite sequencing cannot discriminate between 5-methylcytosine and 5-hydroxymethylcytosine. This means that the output from bisulfite sequencing can no longer be defined as solely DNA methylation, as it is the composite of 5-methylcytosine and 5-hydroxymethylcytosine. Tet-assisted oxidative bisulfite sequencing, developed by Chuan He at the University of Chicago, can now distinguish between the two modifications at single-base resolution.


Incomplete conversion

Bisulfite sequencing relies on the conversion of every single unmethylated cytosine residue to uracil. If conversion is incomplete, the subsequent analysis will incorrectly interpret the unconverted unmethylated cytosines as methylated cytosines, resulting in false positive results for methylation. Only cytosines in single-stranded DNA are susceptible to attack by bisulfite; therefore, denaturation of the DNA undergoing analysis is critical. It is important to ensure that reaction parameters such as temperature and salt concentration are suitable to maintain the DNA in a single-stranded conformation and allow for complete conversion. Embedding the DNA in agarose gel has been reported to improve the rate of conversion by keeping strands of DNA physically separate.
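One common way to check for incomplete conversion is to look at cytosines that are expected to be unmethylated and count how many were actually converted. The Python sketch below does this for non-CpG cytosines, assuming (as is typical for animal DNA, but not guaranteed) that they carry no methylation, and assuming that reads are aligned to the reference without insertions or deletions. The function name `conversion_rate` and the sequences are hypothetical.

```python
def conversion_rate(reference, reads):
    """Estimate bisulfite conversion efficiency from non-CpG cytosines.

    reference: unconverted sense-strand sequence.
    reads: converted sense-strand reads aligned to the reference, same length.
    Assumes non-CpG cytosines are unmethylated, so any C retained there
    reflects failed conversion. A C at the last position is counted as
    non-CpG because its following base is unknown.
    Returns converted / (converted + retained), or None without data.
    """
    ref = reference.upper()
    non_cpg = [i for i, base in enumerate(ref)
               if base == "C" and ref[i + 1:i + 2] != "G"]
    converted = retained = 0
    for read in reads:
        for pos in non_cpg:
            base = read[pos].upper()
            if base == "T":
                converted += 1
            elif base == "C":
                retained += 1
    total = converted + retained
    return converted / total if total else None


reference = "ACCGTCAGCT"   # non-CpG cytosines at positions 1, 5 and 8
reads = ["ATCGTTAGTT", "ATCGTCAGTT", "ATTGTTAGTT", "ACTGTTAGTT"]
print(conversion_rate(reference, reads))  # 10 of 12 converted, about 0.83
```

A rate well below 1 suggests that retained cytosines at CpG sites cannot be trusted as evidence of methylation.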


Degradation of DNA during bisulfite treatment

A major challenge in bisulfite sequencing is the degradation of DNA that takes place concurrently with the conversion. The conditions necessary for complete conversion, such as long incubation times, elevated temperature, and high bisulfite concentration, can lead to the degradation of about 90% of the incubated DNA. Given that the starting amount of DNA is often limited, such extensive degradation can be problematic. The degradation occurs as depurinations resulting in random strand breaks. Therefore, the longer the desired PCR amplicon, the more limited the number of intact template molecules will likely be. This could lead to the failure of the PCR amplification, or the loss of quantitatively accurate information on methylation levels resulting from the limited sampling of template molecules. Thus, it is important to assess the amount of DNA degradation resulting from the reaction conditions employed, and consider how this will affect the desired amplicon. Techniques can also be used to minimize DNA degradation, such as cycling the incubation temperature. In 2020, New England Biolabs developed NEBNext Enzymatic Methyl-seq, an alternative enzymatic approach to minimize DNA damage.
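The dependence of intact template number on amplicon length can be illustrated with a simple model. The Python sketch below assumes that strand breaks occur independently and uniformly along the DNA (a Poisson model), so the probability that a stretch of a given length contains no break decays exponentially with that length. The break rate used in the example is a hypothetical value chosen for illustration, not a measured one.

```python
import math


def intact_fraction(amplicon_length, breaks_per_base):
    """Probability that a template spanning the amplicon has no strand break.

    Assumes breaks are independent and uniformly distributed, so the number
    of breaks in the amplicon is Poisson with mean
    breaks_per_base * amplicon_length.
    """
    return math.exp(-breaks_per_base * amplicon_length)


def expected_intact_templates(n_templates, amplicon_length, breaks_per_base):
    """Expected number of amplifiable templates among n_templates copies."""
    return n_templates * intact_fraction(amplicon_length, breaks_per_base)


rate = 0.005  # hypothetical: one break per 200 bases on average
for length in (100, 200, 400):
    print(length, round(expected_intact_templates(1000, length, rate), 1))
```

Under this model, doubling the amplicon length squares the intact fraction, which is why short amplicons are generally preferred for bisulfite-treated DNA.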


Other concerns

A potentially significant problem following bisulfite treatment is incomplete desulfonation of pyrimidine residues due to inadequate alkalization of the solution. This may inhibit some DNA polymerases, rendering subsequent PCR difficult. However, this situation can be avoided by monitoring the pH of the solution to ensure that desulfonation will be complete. A final concern is that bisulfite treatment greatly reduces the level of complexity in the sample, which can be problematic if multiple PCR reactions are to be performed (2006). Primer design is more difficult, and inappropriate cross-hybridization is more frequent.
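The loss of sequence complexity can be illustrated with an in silico conversion. This is a minimal sketch, assuming complete conversion of every unmethylated cytosine and reading the resulting uracil as T (as it appears after PCR); the function name and the example sequences are hypothetical.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Return the sequence as read after bisulfite treatment and PCR.

    Every C is replaced by T unless its 0-based index is listed in
    `methylated_positions`. Complete conversion is assumed.
    """
    seq = seq.upper()
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


if __name__ == "__main__":
    # Two distinct hypothetical sites that differ before treatment...
    site_a = "ACCTGA"
    site_b = "ATCTGA"
    # ...become identical once unmethylated cytosines are converted.
    print(bisulfite_convert(site_a), bisulfite_convert(site_b))
```

Both example sites convert to ATTTGA, so a primer designed against one would also match the other. This is the kind of ambiguity that makes primer design harder and cross-hybridization more likely in a converted sample.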


Applications: genome-wide methylation analysis

Advances in bisulfite sequencing have made it possible to apply the technique at a genome-wide scale, whereas global measurement of DNA methylation was previously feasible only with other techniques, such as restriction landmark genomic scanning. The mapping of the human epigenome is seen by many scientists as the logical follow-up to the completion of the Human Genome Project. This epigenomic information will be important in understanding how the function of the genetic sequence is implemented and regulated. Since the epigenome is less stable than the genome, it is thought to be important in gene-environment interactions. Epigenomic mapping is inherently more complex than genome sequencing, however, since the epigenome is much more variable than the genome. One’s epigenome varies with age, differs between tissues, is altered by environmental factors, and shows aberrations in diseases. Rich epigenomic maps representing different ages, tissue types, and disease states would nevertheless yield valuable information on the normal function of epigenetic marks as well as the mechanisms leading to aging and disease.

Direct benefits of epigenomic mapping include probable advances in cloning technology. It is believed that failures to produce cloned animals with normal viability and lifespan result from inappropriate patterns of epigenetic marks. Also, aberrant methylation patterns are well characterized in many cancers. Global hypomethylation results in decreased genomic stability, while local hypermethylation of tumour suppressor gene promoters often accounts for their loss of function. Specific patterns of methylation are indicative of specific cancer types, have prognostic value, and can help to guide the best course of treatment.

Large-scale epigenome mapping efforts are under way around the world and have been organized under the Human Epigenome Project. This is based on a multi-tiered strategy, whereby bisulfite sequencing is used to obtain high-resolution methylation profiles for a limited number of reference epigenomes, while less thorough analysis is performed on a wider spectrum of samples. This approach is intended to maximize the insight gained from a given amount of resources, as high-resolution genome-wide mapping remains a costly undertaking. Gene-set analysis (for example using tools like DAVID and GoSeq) has been shown to be severely biased when applied to high-throughput methylation data (e.g. genome-wide bisulfite sequencing); it has been suggested that this can be corrected using sample label permutations or using a statistical model to control for differences in the numbers of CpG probes / CpG sites that target each gene.


Oxidative bisulfite sequencing

5-Methylcytosine and 5-hydroxymethylcytosine both read as a C in bisulfite sequencing. Oxidative bisulfite sequencing is a method to discriminate between 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. The method employs a specific chemical oxidation of 5-hydroxymethylcytosine to 5-formylcytosine, which subsequently converts to uracil during bisulfite treatment (Booth MJ, Branco MR, Ficz G, Oxley D, Krueger F, Reik W, et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 2012;336(6083):934-7). The only base that then reads as a C is 5-methylcytosine, giving a map of the true methylation status in the DNA sample. Levels of 5-hydroxymethylcytosine can also be quantified by measuring the difference between bisulfite and oxidative bisulfite sequencing.
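The subtraction described above can be sketched for a single cytosine position as follows. This is an illustrative calculation only, assuming read counts from matched bisulfite (BS) and oxidative bisulfite (oxBS) libraries of the same sample; the function and parameter names are hypothetical.

```python
def estimate_5mc_5hmc(bs_c: int, bs_total: int,
                      oxbs_c: int, oxbs_total: int) -> tuple[float, float]:
    """Estimate 5mC and 5hmC levels at one cytosine position.

    bs_c / bs_total     : reads calling C / all reads in the BS library
                          (C here reflects 5mC plus 5hmC).
    oxbs_c / oxbs_total : the same counts in the oxBS library
                          (C here reflects 5mC only).
    Returns (5mC level, 5hmC level) as fractions.
    """
    if bs_total <= 0 or oxbs_total <= 0:
        raise ValueError("both libraries need at least one read at this site")
    bs_level = bs_c / bs_total
    oxbs_level = oxbs_c / oxbs_total
    # The difference can come out slightly negative through sampling noise;
    # it is left unclamped here so that such cases remain visible.
    return oxbs_level, bs_level - oxbs_level


if __name__ == "__main__":
    mc, hmc = estimate_5mc_5hmc(bs_c=80, bs_total=100, oxbs_c=60, oxbs_total=100)
    print(f"5mC = {mc:.2f}, 5hmC = {hmc:.2f}")
```

Because the estimate is a difference of two sampled proportions, its uncertainty depends on the read depth of both libraries, so low-coverage sites give noisy 5-hydroxymethylcytosine values.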


See also

* Reduced representation bisulfite sequencing


External links


Bisulfite conversion protocol

Human Epigenome Project (HEP) - Data, by the Sanger Institute
The Epigenome Network of Excellence