Dideoxy Termination
   HOME

TheInfoList



OR:

Sanger sequencing is a method of
DNA sequencing DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. Th ...
that involves electrophoresis and is based on the random incorporation of chain-terminating
dideoxynucleotide Dideoxynucleotides are chain-elongating inhibitors of DNA polymerase, used in the Sanger method for DNA sequencing. They are also known as 2',3' because both the 2' and 3' positions on the ribose lack hydroxyl groups, and are abbreviated as '' ...
s by
DNA polymerase A DNA polymerase is a member of a family of enzymes that catalyze the synthesis of DNA molecules from nucleoside triphosphates, the molecular precursors of DNA. These enzymes are essential for DNA replication and usually work in groups to create ...
during
in vitro ''In vitro'' (meaning in glass, or ''in the glass'') studies are performed with microorganisms, cells, or biological molecules outside their normal biological context. Colloquially called "test-tube experiments", these studies in biology an ...
DNA replication In molecular biology, DNA replication is the biological process of producing two identical replicas of DNA from one original DNA molecule. DNA replication occurs in all living organisms acting as the most essential part for biological inheritanc ...
. After first being developed by
Frederick Sanger Frederick Sanger (; 13 August 1918 – 19 November 2013) was an English biochemist who received the Nobel Prize in Chemistry twice. He won the 1958 Chemistry Prize for determining the amino acid sequence of insulin and numerous other p ...
and colleagues in 1977, it became the most widely used sequencing method for approximately 40 years. It was first commercialized by
Applied Biosystems Applied Biosystems is one of various brands under the Life Technologies brand of Thermo Fisher Scientific corporation. The brand is focused on integrated systems for genetic analysis, which include computerized machines and the consumables used w ...
in 1986. More recently, higher volume Sanger sequencing has been replaced by next generation sequencing methods, especially for large-scale, automated
genome In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding ge ...
analyses. However, the Sanger method remains in wide use for smaller-scale projects and for validation of deep sequencing results. It still has the advantage over short-read sequencing technologies (like Illumina) in that it can produce DNA sequence reads of > 500
nucleotide Nucleotides are organic molecules consisting of a nucleoside and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both of which are essential biomolecules wi ...
s and maintains a very low error rate with accuracies around 99.99%. Sanger sequencing is still actively being used in efforts for public health initiatives such as sequencing the spike protein from SARS-CoV-2 as well as for the surveillance of norovirus outbreaks through the Center for Disease Control and Prevention's (CDC) CaliciNet surveillance network.


Method

Fluorescent ddNTP molecules The classical chain-termination method requires a single-stranded DNA template, a DNA
primer Primer may refer to: Arts, entertainment, and media Films * ''Primer'' (film), a 2004 feature film written and directed by Shane Carruth * ''Primer'' (video), a documentary about the funk band Living Colour Literature * Primer (textbook), a t ...
, a
DNA polymerase A DNA polymerase is a member of a family of enzymes that catalyze the synthesis of DNA molecules from nucleoside triphosphates, the molecular precursors of DNA. These enzymes are essential for DNA replication and usually work in groups to create ...
, normal deoxynucleotide triphosphates (
dNTP A nucleoside triphosphate is a nucleoside containing a nitrogenous base bound to a 5-carbon sugar (either ribose or deoxyribose), with three phosphate groups bound to the sugar. They are the molecular precursors of both DNA and RNA, which are ...
s), and modified di-deoxynucleotide triphosphates (
ddNTP Dideoxynucleotides are chain-elongating inhibitors of DNA polymerase, used in the Sanger method for DNA sequencing. They are also known as 2',3' because both the 2' and 3' positions on the ribose lack hydroxyl groups, and are abbreviated as '' ...
s), the latter of which terminate DNA strand elongation. These chain-terminating nucleotides lack a 3'- OH group required for the formation of a
phosphodiester bond In chemistry, a phosphodiester bond occurs when exactly two of the hydroxyl groups () in phosphoric acid react with hydroxyl groups on other molecules to form two ester bonds. The "bond" involves this linkage . Discussion of phosphodiesters is ...
between two nucleotides, causing DNA polymerase to cease extension of DNA when a modified ddNTP is incorporated. The ddNTPs may be radioactively or
fluorescent Fluorescence is the emission of light by a substance that has absorbed light or other electromagnetic radiation. It is a form of luminescence. In most cases, the emitted light has a longer wavelength, and therefore a lower photon energy, tha ...
ly labelled for detection in automated sequencing machines. The DNA sample is divided into four separate sequencing reactions, containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA polymerase. To each reaction is added only one of the four
dideoxynucleotides Dideoxynucleotides are chain-elongating inhibitors of DNA polymerase, used in the Sanger method for DNA sequencing. They are also known as 2',3' because both the 2' and 3' positions on the ribose lack hydroxyl groups, and are abbreviated as '' ...
(ddATP, ddGTP, ddCTP, or ddTTP), while the other added nucleotides are ordinary ones. The deoxynucleotide concentration should be approximately 100-fold higher than that of the corresponding dideoxynucleotide (e.g. 0.5mM dTTP : 0.005mM ddTTP) to allow enough fragments to be produced while still transcribing the complete sequence (but the concentration of ddNTP also depends on the desired length of sequence). Putting it in a more sensible order, four separate reactions are needed in this process to test all four ddNTPs. Following rounds of template DNA extension from the bound primer, the resulting DNA fragments are heat denatured and separated by size using
gel electrophoresis Gel electrophoresis is a method for separation and analysis of biomacromolecules ( DNA, RNA, proteins, etc.) and their fragments, based on their size and charge. It is used in clinical chemistry to separate proteins by charge or size (IEF ...
. In the original publication of 1977, the formation of base-paired loops of ssDNA was a cause of serious difficulty in resolving bands at some locations. This is frequently performed using a denaturing
polyacrylamide Polyacrylamide (abbreviated as PAM) is a polymer with the formula (-CH2CHCONH2-). It has a linear-chain structure. PAM is highly water-absorbent, forming a soft gel when hydrated. In 2008, an estimated 750,000,000 kg were produced, mainly fo ...
-urea gel with each of the four reactions run in one of four individual lanes (lanes A, T, G, C). The DNA bands may then be visualized by
autoradiography An autoradiograph is an image on an X-ray film or nuclear emulsion produced by the pattern of decay emissions (e.g., beta particles or gamma rays) from a distribution of a radioactive substance. Alternatively, the autoradiograph is also available ...
or UV light, and the DNA sequence can be directly read off the
X-ray film X-ray detectors are devices used to measure the flux, spatial distribution, spectrum, and/or other properties of X-rays. Detectors can be divided into two major categories: imaging detectors (such as photographic plates and X-ray film ( photog ...
or gel image. In the image on the right, X-ray film was exposed to the gel, and the dark bands correspond to DNA fragments of different lengths. A dark band in a lane indicates a DNA fragment that is the result of chain termination after incorporation of a dideoxynucleotide (ddATP, ddGTP, ddCTP, or ddTTP). The relative positions of the different bands among the four lanes, from bottom to top, are then used to read the DNA sequence. Technical variations of chain-termination sequencing include tagging with nucleotides containing radioactive phosphorus for
radiolabel A radioactive tracer, radiotracer, or radioactive label is a chemical compound in which one or more atoms have been replaced by a radionuclide so by virtue of its radioactive decay it can be used to explore the mechanism of chemical reactions by tr ...
ling, or using a primer labeled at the 5' end with a
fluorescent Fluorescence is the emission of light by a substance that has absorbed light or other electromagnetic radiation. It is a form of luminescence. In most cases, the emitted light has a longer wavelength, and therefore a lower photon energy, tha ...
dye. Dye-primer sequencing facilitates reading in an optical system for faster and more economical analysis and automation. The later development by
Leroy Hood Leroy "Lee" Edward Hood (born October 10, 1938) is an American biologist who has served on the faculties at the California Institute of Technology (Caltech) and the University of Washington. Hood has developed ground-breaking scientific instrum ...
and coworkers of fluorescently labeled ddNTPs and primers set the stage for automated, high-throughput DNA sequencing. Chain-termination methods have greatly simplified DNA sequencing. For example, chain-termination-based kits are commercially available that contain the reagents needed for sequencing, pre-aliquoted and ready to use. Limitations include non-specific binding of the primer to the DNA, affecting accurate read-out of the DNA sequence, and DNA secondary structures affecting the fidelity of the sequence.


Dye-terminator sequencing

''Dye-terminator sequencing'' utilizes labelling of the chain terminator ddNTPs, which permits sequencing in a single reaction rather than four reactions as in the labelled-primer method. In dye-terminator sequencing, each of the four dideoxynucleotide chain terminators is labelled with fluorescent dyes, each of which emits light at different
wavelength In physics, the wavelength is the spatial period of a periodic wave—the distance over which the wave's shape repeats. It is the distance between consecutive corresponding points of the same phase on the wave, such as two adjacent crests, tro ...
s. Owing to its greater expediency and speed, dye-terminator sequencing is now the mainstay in automated sequencing. Its limitations include dye effects due to differences in the incorporation of the dye-labelled chain terminators into the DNA fragment, resulting in unequal peak heights and shapes in the electronic DNA sequence trace
chromatogram In chemical analysis, chromatography is a laboratory technique for the separation of a mixture into its components. The mixture is dissolved in a fluid solvent (gas or liquid) called the ''mobile phase'', which carries it through a system (a ...
after
capillary electrophoresis Capillary electrophoresis (CE) is a family of electrokinetic separation methods performed in submillimeter diameter capillaries and in micro- and nanofluidic channels. Very often, CE refers to capillary zone electrophoresis (CZE), but other electr ...
(see figure to the left). This problem has been addressed with the use of modified DNA polymerase enzyme systems and dyes that minimize incorporation variability, as well as methods for eliminating "dye blobs". The dye-terminator sequencing method, along with automated high-throughput DNA sequence analyzers, was used for the vast majority of sequencing projects until the introduction of
next generation sequencing DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The ...
.


Automation and sample preparation

Automated DNA-sequencing instruments (
DNA sequencers A DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: G (guanine), C (cytosine), A (adenine) and T (thymine). This is th ...
) can sequence up to 384 DNA samples in a single batch. Batch runs may occur up to 24 times a day. DNA sequencers separate strands by size (or length) using
capillary electrophoresis Capillary electrophoresis (CE) is a family of electrokinetic separation methods performed in submillimeter diameter capillaries and in micro- and nanofluidic channels. Very often, CE refers to capillary zone electrophoresis (CZE), but other electr ...
, they detect and record dye fluorescence, and output data as fluorescent peak trace
chromatogram In chemical analysis, chromatography is a laboratory technique for the separation of a mixture into its components. The mixture is dissolved in a fluid solvent (gas or liquid) called the ''mobile phase'', which carries it through a system (a ...
s. Sequencing reactions ( thermocycling and labelling), cleanup and re-suspension of samples in a
buffer solution A buffer solution (more precisely, pH buffer or hydrogen ion buffer) is an aqueous solution consisting of a mixture of a weak acid and its conjugate base, or vice versa. Its pH changes very little when a small amount of strong acid or base is ...
are performed separately, before loading samples onto the sequencer. A number of commercial and non-commercial software packages can trim low-quality DNA traces automatically. These programs score the quality of each peak and remove low-quality base peaks (which are generally located at the ends of the sequence). The accuracy of such algorithms is inferior to visual examination by a human operator, but is adequate for automated processing of large sequence data sets.


Applications of dye-terminating sequencing

The field of public health plays many roles to support patient diagnostics as well as environmental surveillance of potential toxic substances and circulating biological pathogens. Public health laboratories (PHL) and other laboratories around the world have played a pivotal role in providing rapid sequencing data for the surveillance of the virus
SARS-CoV-2 Severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) is a strain of coronavirus that causes COVID-19 (coronavirus disease 2019), the respiratory illness responsible for the ongoing COVID-19 pandemic. The virus previously had a ...
, causative agent for COVID-19, during the pandemic that was declared a public health emergency on January 30, 2020. Laboratories were tasked with the rapid implementation of sequencing methods and asked to provide accurate data to assist in the decision-making models for the development of policies to mitigate spread of the virus. Many laboratories resorted to next generation sequencing methodologies while others supported efforts with Sanger sequencing. The sequencing efforts of SARS-CoV-2 are many, while most laboratories implemented whole genome sequencing of the virus, others have opted to sequence very specific genes of the virus such as the S-gene, encoding the information needed to produce the spike protein. The high mutation rate of SARS-CoV-2 leads to genetic differences within the S-gene and these differences have played a role in the infectivity of the virus. Sanger sequencing of the S-gene provides a quick, accurate, and more affordable method to retrieving the genetic code. Laboratories in lower income countries may not have the capabilities to implement expensive applications such as next generation sequencing, so Sanger methods may prevail in supporting the generation of sequencing data for surveillance of variants. Sanger sequencing is also the "gold standard" for norovirus surveillance methods for the Center for Disease Control and Prevention's (CDC) CaliciNet network. CalciNet is an outbreak surveillance network that was established in March 2009. The goal of the network is to collect sequencing data of circulating noroviruses in the United States and activate downstream action to determine the source of infection to mitigate the spread of the virus. The CalciNet network has identified many infections as foodborne illnesses. This data can then be published and used to develop recommendations for future action to prevent tainting food. The methods employed for detection of norovirus involve targeted amplification of specific areas of the genome. The amplicons are then sequenced using dye-terminating Sanger sequencing and the chromatograms and sequences generated are analyzed with a software package developed in BioNumerics. Sequences are tracked and strain relatedness is studied to infer epidemiological relevance.


Challenges

Common challenges of DNA sequencing with the Sanger method include poor quality in the first 15-40 bases of the sequence due to primer binding and deteriorating quality of sequencing traces after 700-900 bases. Base calling software such as Phred typically provides an estimate of quality to aid in trimming of low-quality regions of sequences. In cases where DNA fragments are
cloned Cloning is the process of producing individual organisms with identical or virtually identical DNA, either by natural or artificial means. In nature, some organisms produce clones through asexual reproduction. In the field of biotechnology, c ...
before sequencing, the resulting sequence may contain parts of the
cloning vector A cloning vector is a small piece of DNA that can be stably maintained in an organism, and into which a foreign DNA fragment can be inserted for cloning purposes. The cloning vector may be DNA taken from a virus, the cell of a higher organism, or ...
. In contrast, PCR-based cloning and next-generation sequencing technologies based on
pyrosequencing Pyrosequencing is a method of DNA sequencing (determining the order of nucleotides in DNA) based on the "sequencing by synthesis" principle, in which the sequencing is performed by detecting the nucleotide incorporated by a DNA polymerase. Pyrosequ ...
often avoid using cloning vectors. Recently, one-step Sanger sequencing (combined amplification and sequencing) methods such as Ampliseq and SeqSharp have been developed that allow rapid sequencing of target genes without cloning or prior amplification. Current methods can directly sequence only relatively short (300-1000
nucleotides Nucleotides are organic molecules consisting of a nucleoside and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both of which are essential biomolecules w ...
long) DNA fragments in a single reaction. The main obstacle to sequencing DNA fragments above this size limit is insufficient power of separation for resolving large DNA fragments that differ in length by only one nucleotide.


Microfluidic Sanger sequencing

Microfluidic Sanger sequencing is a
lab-on-a-chip A lab-on-a-chip (LOC) is a device that integrates one or several laboratory functions on a single integrated circuit (commonly called a "chip") of only millimeters to a few square centimeters to achieve automation and high-throughput screening. ...
application for DNA sequencing, in which the Sanger sequencing steps (thermal cycling, sample purification, and capillary electrophoresis) are integrated on a wafer-scale chip using nanoliter-scale sample volumes. This technology generates long and accurate sequence reads, while obviating many of the significant shortcomings of the conventional Sanger method (e.g. high consumption of expensive reagents, reliance on expensive equipment, personnel-intensive manipulations, etc.) by integrating and automating the Sanger sequencing steps. In its modern inception, high-throughput genome sequencing involves fragmenting the genome into small single-stranded pieces, followed by amplification of the fragments by
polymerase chain reaction The polymerase chain reaction (PCR) is a method widely used to rapidly make millions to billions of copies (complete or partial) of a specific DNA sample, allowing scientists to take a very small sample of DNA and amplify it (or a part of it) t ...
(PCR). Adopting the Sanger method, each DNA fragment is irreversibly terminated with the incorporation of a fluorescently labeled dideoxy chain-terminating nucleotide, thereby producing a DNA “ladder” of fragments that each differ in length by one base and bear a base-specific fluorescent label at the terminal base. Amplified base ladders are then separated by capillary array electrophoresis (CAE) with automated, ''in situ'' “finish-line” detection of the fluorescently labeled ssDNA fragments, which provides an ordered sequence of the fragments. These sequence reads are then computer assembled into overlapping or contiguous sequences (termed "contigs") which resemble the full genomic sequence once fully assembled. Sanger methods achieve maximum read lengths of approximately 800 bp (typically 500–600 bp with non-enriched DNA). The longer read lengths in Sanger methods display significant advantages over other sequencing methods especially in terms of sequencing repetitive regions of the genome. A challenge of short-read sequence data is particularly an issue in sequencing new genomes ''(de novo)'' and in sequencing highly rearranged genome segments, typically those seen of cancer genomes or in regions of chromosomes that exhibit structural variation.


Applications of microfluidic sequencing technologies

Other useful applications of DNA sequencing include
single nucleotide polymorphism In genetics, a single-nucleotide polymorphism (SNP ; plural SNPs ) is a germline substitution of a single nucleotide at a specific position in the genome. Although certain definitions require the substitution to be present in a sufficiently larg ...
(SNP) detection,
single-strand conformation polymorphism Single-strand conformation polymorphism (SSCP), or single-strand ''chain'' polymorphism, is defined as a conformational difference of single-stranded nucleotide sequences of identical length as induced by differences in the sequences under certain ...
(SSCP) heteroduplex analysis, and
short tandem repeat A microsatellite is a tract of repetitive DNA in which certain DNA motifs (ranging in length from one to six or more base pairs) are repeated, typically 5–50 times. Microsatellites occur at thousands of locations within an organism's genome ...
(STR) analysis. Resolving DNA fragments according to differences in size and/or conformation is the most critical step in studying these features of the genome.


Device design

The sequencing chip has a four-layer construction, consisting of three 100-mm-diameter glass wafers (on which device elements are microfabricated) and a polydimethylsiloxane (PDMS) membrane. Reaction chambers and capillary electrophoresis channels are etched between the top two glass wafers, which are thermally bonded. Three-dimensional channel interconnections and microvalves are formed by the PDMS and bottom manifold glass wafer. The device consists of three functional units, each corresponding to the Sanger sequencing steps. The thermal cycling (TC) unit is a 250-nanoliter reaction chamber with integrated resistive temperature detector, microvalves, and a surface heater. Movement of reagent between the top all-glass layer and the lower glass-PDMS layer occurs through 500-μm-diameter via-holes. After thermal-cycling, the reaction mixture undergoes purification in the capture/purification chamber, and then is injected into the capillary electrophoresis (CE) chamber. The CE unit consists of a 30-cm capillary which is folded into a compact switchback pattern via 65-μm-wide turns.


Sequencing chemistry

;Thermal cycling :In the TC reaction chamber, dye-terminator sequencing reagent, template DNA, and primers are loaded into the TC chamber and thermal-cycled for 35 cycles ( at 95 °C for 12 seconds and at 60 °C for 55 seconds). ;Purification :The charged reaction mixture (containing extension fragments, template DNA, and excess sequencing reagent) is conducted through a capture/purification chamber at 30 °C via a 33-Volts/cm electric field applied between capture outlet and inlet ports. The capture gel through which the sample is driven, consists of 40 μM of oligonucleotide (complementary to the primers) covalently bound to a polyacrylamide matrix. Extension fragments are immobilized by the gel matrix, and excess primer, template, free nucleotides, and salts are eluted through the capture waste port. The capture gel is heated to 67-75 °C to release extension fragments. ;Capillary electrophoresis :Extension fragments are injected into the CE chamber where they are electrophoresed through a 125-167-V/cm field.


Platforms

The Apollo 100 platform (Microchip Biotechnologies Inc., Dublin, CA)Microchip Biologies Inc
Apollo 100
/ref> integrates the first two Sanger sequencing steps (thermal cycling and purification) in a fully automated system. The manufacturer claims that samples are ready for capillary electrophoresis within three hours of the sample and reagents being loaded into the system. The Apollo 100 platform requires sub-microliter volumes of reagents.


Comparisons to other sequencing techniques

The ultimate goal of high-throughput sequencing is to develop systems that are low-cost, and extremely efficient at obtaining extended (longer) read lengths. Longer read lengths of each single electrophoretic separation, substantially reduces the cost associated with de novo DNA sequencing and the number of templates needed to sequence DNA contigs at a given redundancy. Microfluidics may allow for faster, cheaper and easier sequence assembly.


See also

*
Second-generation sequencing Massive parallel sequencing or massively parallel sequencing is any of several high-throughput approaches to DNA sequencing using the concept of massively parallel processing; it is also called next-generation sequencing (NGS) or second-generation s ...
*
Third-generation sequencing Third-generation sequencing (also known as long-read sequencing) is a class of DNA sequencing methods currently under active development. Third generation sequencing technologies have the capability to produce substantially longer reads than secon ...


References


Further reading

* * {{cite journal , vauthors = Sanger F, Coulson AR, Barrell BG, Smith AJ, Roe BA , title = Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing , journal = Journal of Molecular Biology , volume = 143 , issue = 2 , pages = 161–178 , date = October 1980 , pmid = 6260957 , doi = 10.1016/0022-2836(80)90196-5


External links


MBI Says New Tool That Automates Sanger Sample Prep Cuts Reagent and Labor Costs
DNA sequencing methods Molecular biology techniques 1977 in biotechnology de:Sanger-Methode