Background
A typical human cell contains about 2 x 3.3 billion base pairs of DNA and 600 million mRNA bases. Traditional methods such as Sanger sequencing or Illumina sequencing usually sequence DNA or RNA pooled from millions of cells. Deep sequencing of DNA and RNA from a single cell, by contrast, allows cellular functions to be investigated extensively. Like typical next-generation sequencing experiments, single-cell sequencing protocols generally contain the following steps: isolation of a single cell, nucleic acid extraction and amplification, sequencing library preparation, sequencing, and bioinformatic data analysis. Single-cell sequencing is more challenging than sequencing cells in bulk. Because a single cell provides so little starting material, degradation, sample loss, and contamination have pronounced effects on the quality of the sequencing data. In addition, because only picogram quantities of nucleic acid are available, heavy amplification is often needed during sample preparation, which results in uneven coverage, noise, and inaccurate quantification. Recent technical improvements make single-cell sequencing a promising tool for approaching a set of seemingly inaccessible problems. For example, heterogeneous samples, rare cell types, cell lineage relationships, mosaicism of somatic tissues, microbes that cannot be cultured, and disease evolution can all be studied through single-cell sequencing. Single-cell sequencing was selected as the Method of the Year 2013 by Nature Publishing Group.
Genome (DNA) sequencing
Single-cell DNA genome sequencing involves isolating a single cell, amplifying the whole genome or a region of interest, constructing sequencing libraries, and then applying next-generation DNA sequencing (for example Illumina, Ion Torrent, or MGI). Single-cell DNA sequencing has been widely applied in mammalian systems to study normal physiology and disease. Single-cell resolution can uncover the roles of genetic mosaicism or intra-tumor genetic heterogeneity in cancer development or treatment response. In the context of microbiomes, a genome from a single unicellular organism is referred to as a single amplified genome (SAG). Advances in single-cell DNA sequencing have enabled the collection of genomic data from uncultivated prokaryotic species present in complex microbiomes. Although SAGs are characterized by low completeness and significant bias, recent computational advances have achieved the assembly of near-complete genomes from composite SAGs. Data obtained from these microorganisms might help establish procedures for culturing them in the future. Some of the genome assembly tools used in single-cell sequencing include
Methods
A list of more than 100 different single-cell omics methods has been published. Multiple displacement amplification (MDA) is a widely used technique that can amplify femtograms of DNA from a single bacterium to micrograms for sequencing. The reagents required for MDA reactions include random primers and the DNA polymerase from bacteriophage phi29. The DNA is amplified with these reagents in an isothermal reaction at 30 °C. As the polymerases synthesize new strands, a strand displacement reaction takes place, producing multiple copies from each template DNA. At the same time, strands that were extended previously are displaced. MDA products average about 12 kb in length and range up to around 100 kb, which makes them suitable for DNA sequencing. In 2017, a major improvement to this technique, called WGA-X, was introduced. It takes advantage of a thermostable mutant of the phi29 polymerase and leads to better genome recovery from individual cells, in particular those with high G+C content. MDA has also been implemented in a microfluidic droplet-based system to achieve highly parallelized single-cell whole genome amplification. By encapsulating single cells in droplets for DNA capture and amplification, this method offers reduced bias and enhanced throughput compared to conventional MDA.
Another common method is MALBAC (multiple annealing and looping-based amplification cycles). As in MDA, this method begins with isothermal amplification, but the primers are flanked with a “common” sequence for downstream PCR amplification. As the preliminary amplicons are generated, the common sequence promotes self-ligation and the formation of “loops” that prevent further amplification. In contrast with MDA, the highly branched DNA network is not formed. Instead, the loops are denatured in another temperature cycle, allowing the fragments to be amplified with PCR. MALBAC has also been implemented in a microfluidic device, but the amplification performance was not significantly improved by encapsulation in nanoliter droplets.
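The scale of amplification implied above (femtograms of input, micrograms of output) can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the average mass of 650 g/mol per base pair is a commonly used approximation, the 4.6 Mb bacterial genome size is an assumed example value, and the 1 µg yield is a hypothetical target rather than a figure from any specific protocol.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one base pair (g/mol)


def dna_mass_grams(base_pairs: float) -> float:
    """Approximate mass in grams of double-stranded DNA of the given length."""
    return base_pairs * BP_MASS_G_PER_MOL / AVOGADRO


def fold_amplification(input_grams: float, output_grams: float) -> float:
    """How many times the input mass must be multiplied to reach the output mass."""
    return output_grams / input_grams


# Assumed example inputs (not taken from a specific experiment).
bacterial_genome_bp = 4.6e6       # hypothetical bacterial genome size
human_diploid_bp = 2 * 3.3e9      # diploid human genome, as in the Background
target_yield_grams = 1e-6         # hypothetical 1 microgram of product

bacterial_fg = dna_mass_grams(bacterial_genome_bp) * 1e15  # femtograms
human_pg = dna_mass_grams(human_diploid_bp) * 1e12         # picograms
fold = fold_amplification(dna_mass_grams(bacterial_genome_bp), target_yield_grams)

print(f"One bacterial genome: about {bacterial_fg:.1f} fg")
print(f"One diploid human genome: about {human_pg:.1f} pg")
print(f"Fold amplification to reach 1 ug: about {fold:.1e}")
```

Under these assumptions a single bacterial genome weighs roughly 5 fg and a diploid human genome roughly 7 pg, so reaching a microgram of product from one bacterial cell requires amplification on the order of 10^8-fold. This is why even small per-cycle biases can compound into uneven coverage.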
Compared with MALBAC, MDA yields better genome coverage, but MALBAC provides more even coverage across the genome. MDA could be more effective for identifying SNPs, whereas MALBAC is preferred for detecting copy number variants. While performing MDA in a microfluidic device markedly reduces bias and contamination, the chemistry involved in MALBAC does not show the same potential for improved efficiency. A method particularly suitable for the discovery of genomic structural variation is single-cell DNA template strand sequencing (also known as Strand-seq). Using the principle of single-cell tri-channel processing, which jointly models read orientation, read depth, and haplotype phase, Strand-seq enables discovery of the full spectrum of somatic structural variation classes ≥200 kb in size. Strand-seq overcomes limitations of methods based on whole genome amplification for identifying somatic genetic variation classes in single cells, because it is not susceptible to read chimeras that lead to calling artefacts (discussed in detail in the section below) and is less affected by dropouts. The choice of method depends on the goal of the sequencing, because each method has different advantages.
Limitations
MDA of individual cell genomes results in highly uneven genome coverage, i.e. relative overrepresentation and underrepresentation of various regions of the template, leading to loss of some sequences. There are two components to this bias: a) stochastic over- and under-amplification of random regions; and b) systematic bias against regions with high GC content. The stochastic component may be addressed by pooling single-cell MDA reactions from the same cell type, by employing
Applications
Microbiomes are among the main targets of single-cell genomics because most microorganisms in most environments are difficult to culture. Single-cell genomics is a powerful way to obtain microbial genome sequences without cultivation. This approach has been widely applied to marine, soil, subsurface, organismal, and other types of microbiomes in order to address a wide array of questions related to microbial ecology, evolution, public health, and biotechnology potential. Cancer sequencing is also an emerging application of scDNAseq. Fresh or frozen tumors can be analyzed and categorized quite well with respect to somatic copy number alterations (SCNAs), single-nucleotide variants (SNVs), and rearrangements using whole-genome DNA sequencing approaches. Cancer scDNAseq is particularly useful for examining the depth of complexity and the compound mutations present in amplified therapeutic targets such as receptor tyrosine kinase genes (EGFR, PDGFRA etc.), where conventional population-level analysis of the bulk tumor cannot resolve the co-occurrence patterns of these mutations within single cells of the tumor. Such overlap may provide redundancy of pathway activation and tumor cell resistance.
DNA methylome sequencing
Single-cell DNA methylome sequencing quantifies DNA methylation.
Methods
Bisulfite sequencing has become the gold standard for detecting and sequencing 5mC in single cells. Treatment of DNA with bisulfite converts cytosine residues to uracil (read as thymine after amplification) but leaves 5-methylcytosine residues unaffected. Therefore, DNA that has been treated with bisulfite retains only methylated cytosines. To obtain the methylome readout, the bisulfite-treated sequence is aligned to an unmodified genome. Whole genome bisulfite sequencing was achieved in single cells in 2014. The method overcomes the loss of DNA associated with the typical procedure, in which sequencing adapters are added before bisulfite treatment fragments the DNA. Instead, the adapters are added after the DNA is treated and fragmented with bisulfite, allowing all fragments to be amplified by PCR. Using deep sequencing, this method captures ~40% of the total CpGs in each cell. With existing technology, DNA cannot be amplified prior to bisulfite treatment, as the 5mC marks will not be copied by the polymerase.
Single-cell reduced representation bisulfite sequencing (scRRBS) is another method. It leverages the tendency of methylated cytosines to cluster at CpG islands (CGIs) to enrich for areas of the genome with a high CpG content. This reduces the cost of sequencing compared to whole-genome bisulfite sequencing, but limits the coverage of the method. When RRBS is applied to bulk samples, the majority of the CpG sites in gene promoters are detected, but sites in gene promoters account for only 10% of CpG sites in the entire genome. In single cells, 40% of the CpG sites from the bulk sample are detected. To increase coverage, this method can also be applied to a small pool of single cells. In a sample of 20 pooled single cells, 63% of the CpG sites from the bulk sample were detected. Pooling single cells is one strategy to increase methylome coverage, but it comes at the cost of obscuring the heterogeneity in the population of cells.
Limitations
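The bisulfite readout described in the Methods above reduces to a simple comparison: a reference cytosine that still reads as C was protected by methylation, and one that reads as T was converted. The following is a minimal sketch of that logic, not a real methylation caller. It assumes a single top-strand read, an ungapped alignment at a known offset, and complete conversion; the function names and the toy sequence are hypothetical.

```python
from typing import Dict, Set


def bisulfite_convert(seq: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment plus amplification of the top strand.

    Unmethylated C becomes U and is read as T; 5mC stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference: str, read: str, offset: int = 0) -> Dict[int, bool]:
    """Compare an aligned read with the unmodified reference.

    Returns {reference position: True if methylated} for every reference
    cytosine covered by the read. Other mismatches are ignored.
    """
    calls = {}
    for i, read_base in enumerate(read):
        ref_pos = offset + i
        if reference[ref_pos] != "C":
            continue
        if read_base == "C":
            calls[ref_pos] = True    # protected from conversion
        elif read_base == "T":
            calls[ref_pos] = False   # converted, so unmethylated
    return calls


# Toy example with a made-up reference; positions 1 and 9 are methylated.
reference = "ACGTTCGACCGGTACGA"
true_methylated = {1, 9}

read = bisulfite_convert(reference, true_methylated)
calls = call_methylation(reference, read)
methylated_fraction = sum(calls.values()) / len(calls)

print(read)   # ACGTTTGATCGGTATGA
print(calls)  # {1: True, 5: False, 8: False, 9: True, 14: False}
```

A real pipeline would also handle reads from the opposite strand (where conversion appears as G to A), incomplete conversion, and sequencing errors, none of which are modelled here.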
While bisulfite sequencing remains the most widely used approach for 5mC detection, the chemical treatment is harsh and fragments and degrades the DNA. This effect is exacerbated when moving from bulk samples to single cells. Other methods to detect DNA methylation include methylation-sensitive restriction enzymes. Restriction enzymes also enable the detection of other types of methylation, such as 6mA with DpnI. Nanopore-based sequencing also offers a route for direct methylation sequencing without fragmentation or modification to the original DNA. Nanopore sequencing has been used to sequence the methylomes of bacteria, which are dominated by 6mA and 4mC (as opposed to 5mC in eukaryotes), but this technique has not yet been scaled down to single cells.
Applications
Single-cell DNA methylation sequencing has been widely used to explore epigenetic differences in genetically similar cells. To validate these methods during their development, the single-cell methylome data of a mixed population were successfully classified by hierarchical clustering to identify distinct cell types. Another application is studying single cells during the first few cell divisions in early development to understand how different cell types emerge from a single embryo. Single-cell whole-genome bisulfite sequencing has also been used to study rare but highly active cell types in cancer, such as circulating tumor cells (CTCs).
Transposase-accessible chromatin sequencing (scATAC-seq)
Single-cell transposase-accessible chromatin sequencing maps chromatin accessibility across the genome. A transposase inserts sequencing adapters directly into open regions of chromatin, allowing those regions to be amplified and sequenced.
Transcriptome sequencing (scRNA-seq)
Standard methods such as
Methods
Current scRNA-seq protocols involve isolating single cells and their RNA, and then following the same steps as bulk RNA-seq:
Limitations
Most RNA-seq methods depend on poly(A) tail capture to enrich mRNA and deplete abundant and uninformative rRNA. Thus, they are often restricted to sequencing polyadenylated mRNA molecules. However, recent studies have started to appreciate the importance of non-poly(A) RNA, such as long non-coding RNA and microRNAs, in the regulation of gene expression. Small-seq is a single-cell method that captures small RNAs (<300 nucleotides) such as microRNAs, fragments of tRNAs, and small nucleolar RNAs in mammalian cells. This method uses a combination of “oligonucleotide masks” (which inhibit the capture of highly abundant 5.8S rRNA molecules) and size selection to exclude large RNA species such as other highly abundant rRNA molecules. Size selection cannot be used to deplete the highly abundant ribosomal RNA molecules (18S and 28S rRNA) when the targets are larger non-poly(A) RNAs, such as long non-coding RNA, histone mRNA, circular RNA, and enhancer RNA. Single-cell RamDA-seq is a method that addresses this by performing reverse transcription with random priming (random displacement amplification) in the presence of “not so random” (NSR) primers specifically designed to avoid priming on rRNA molecules. While this method successfully captures full-length total RNA transcripts for sequencing and detects a variety of non-poly(A) RNAs with high sensitivity, it has some limitations. The NSR primers were carefully designed according to rRNA sequences in a specific organism (mouse), and designing new primer sets for other species would take considerable effort. Recently, a CRISPR-based method named scDASH (single-cell depletion of abundant sequences by hybridization) demonstrated another approach to depleting rRNA sequences from single-cell total RNA-seq libraries. Bacteria and other prokaryotes are currently not amenable to single-cell RNA-seq due to the lack of polyadenylated mRNA.
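The idea behind the “not so random” primers described above can be illustrated with a small filter: start from every possible primer of a given length and discard those that would anneal perfectly to an rRNA sequence. This is a simplified sketch under stated assumptions, not the published RamDA-seq design procedure. The hexamer length, the function names, and the toy rRNA sequence are all hypothetical, and the sketch ignores mismatch tolerance, melting temperature, and any other design criteria a real primer set may use.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """All distinct windows of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def design_nsr_primers(rrna_seqs, k=6):
    """Return all DNA k-mers that do not perfectly match any rRNA window.

    A primer anneals to an RNA window when it is that window's reverse
    complement, so those reverse complements are the ones removed.
    """
    blocked = set()
    for rrna in rrna_seqs:
        as_dna = rrna.upper().replace("U", "T")
        blocked |= {reverse_complement(window) for window in kmers(as_dna, k)}
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    return sorted(all_kmers - blocked)


# Made-up toy sequence standing in for an rRNA; not a real rRNA sequence.
toy_rrna = "ACGUUGCAUGGCAUCCGAUA"
primers = design_nsr_primers([toy_rrna], k=6)
print(f"{len(primers)} of {4 ** 6} hexamers retained")
```

Because the filter is built from the rRNA sequences of one organism, applying it to another species means rerunning the design against that species' rRNA, which is consistent with the species-specific limitation noted above.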
The development of single-cell RNA-seq methods that do not depend on poly(A) tail capture will therefore also be instrumental in enabling microbiome studies at single-cell resolution. Bulk bacterial studies typically apply general rRNA depletion to overcome the lack of polyadenylated mRNA in bacteria, but at the single-cell level the amount of total RNA in one cell is too small. The lack of polyadenylated mRNA and the scarcity of total RNA in single bacterial cells are two important barriers limiting the deployment of scRNA-seq in bacteria.
Applications
scRNA-Seq is becoming widely used across biological disciplines including
Considerations
Isolation of single cells
There are several ways to isolate individual cells prior to whole genome amplification and sequencing. Fluorescence-activated cell sorting (FACS) is a widely used approach. Individual cells can also be collected by micromanipulation, for example by serial dilution or by using a patch pipette or nanotube to harvest a single cell. Micromanipulation is simple and inexpensive, but it is laborious and susceptible to misidentification of cell types under the microscope. Laser-capture microdissection (LCM) can also be used for collecting single cells. Although LCM preserves knowledge of the spatial location of a sampled cell within a tissue, it is hard to capture a whole single cell without also collecting material from neighboring cells. High-throughput methods for single-cell isolation also include microfluidics. Both FACS and microfluidics are accurate, automated, and capable of isolating unbiased samples. However, both methods require detaching cells from their microenvironments first, which perturbs the transcriptional profiles observed in RNA expression analysis.
Number of cells to be analyzed
scRNA-Seq
Generally speaking, for a typical bulk cell
See also
* Single-cell analysis
* Single-cell transcriptomics