Paired-end tags (PET) (sometimes "Paired-End diTags", or simply "ditags") are the short sequences at the 5' and 3' ends of a DNA fragment which are unique enough that they (theoretically) exist together only once in a genome. This makes the sequence of the DNA in between them available upon search (if full-genome sequence data is available) or upon further sequencing (since tag sites are unique enough to serve as primer annealing sites).

Paired-end tags exist in PET libraries with the intervening DNA absent; that is, a PET "represents" a larger fragment of genomic DNA or cDNA by consisting of a short 5' linker sequence, a short 5' sequence tag, a short 3' sequence tag, and a short 3' linker sequence. It was shown conceptually that 13 base pairs are sufficient to map tags uniquely (Fullwood MJ, Wei CL, Liu ET, Ruan Y. 2009. Next-Generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Research. 19:521–532. PMID 19339662). However, longer sequences are more practical for mapping reads uniquely. The endonucleases (discussed below) used to produce PETs give longer tags (18/20 base pairs and 25/27 base pairs), but sequences of 50–100 base pairs would be optimal for both mapping and cost efficiency.

After the PETs are extracted from many DNA fragments, they are linked (concatenated) together for efficient sequencing. On average, 20–30 tags could be sequenced with the Sanger method, which has a longer read length. Since the tag sequences are short, individual PETs are well suited for next-generation sequencing, which has short read lengths and higher throughput. The main advantages of PET sequencing are its reduced cost from sequencing only short fragments, its detection of structural variants in the genome, and its increased specificity when aligning back to the genome compared with single tags, which involve only one end of the DNA fragment.
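The link between tag length and mapping specificity can be illustrated with a rough calculation. The sketch below is not taken from the cited work: it assumes a hypothetical genome of three billion positions in which bases are independent and equally likely, so a specific tag of length k is expected to occur by chance about G / 4^k times. Real genomes contain repeats, so these figures are optimistic and are meant only to show the trend.

```python
# Crude back-of-envelope model (an assumption, not from the source):
# bases are independent and uniformly distributed, so one specific k-mer
# is expected to occur by chance about G / 4**k times among G positions.

GENOME_SIZE = 3_000_000_000  # hypothetical genome size, chosen for illustration


def expected_chance_hits(tag_len: int, genome_size: int = GENOME_SIZE) -> float:
    """Expected number of chance occurrences of one specific tag_len-mer."""
    return genome_size / 4 ** tag_len


if __name__ == "__main__":
    # Tag lengths mentioned in the text, plus the lower end of the 50-100 bp range.
    for tag_len in (18, 20, 25, 27, 50):
        hits = expected_chance_hits(tag_len)
        print(f"{tag_len:>3} bp tag: {hits:.3g} expected chance hits")
```

Under this simplified model each additional base divides the expected number of chance hits by four, which is consistent with the text's point that longer tags are more practical for unique mapping.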


Constructing the PET library

PET libraries are typically prepared by one of two general methods: cloning-based and cloning-free.


Cloning based

Fragmented genomic DNA or complementary DNA (cDNA) of interest is cloned into plasmid vectors. The cloning sites are flanked with adaptor sequences that contain restriction sites for endonucleases (discussed below). Inserts are ligated to the plasmid vectors, and individual vectors are then transformed into ''E. coli'', making the PET library. PET sequences are obtained by purifying the plasmids and digesting them with the specific endonuclease, leaving two short sequences on the ends of the vectors. Under intramolecular (dilute) conditions, the vectors are re-circularized and ligated, leaving only the ditags in the vector. The sequences unique to the clone are now paired together. Depending on the next-generation sequencing technique, PET sequences can be left singular, dimerized, or concatenated into long chains.


Cloning-free based

Instead of cloning, adaptors containing the endonuclease recognition sequence are ligated to the ends of fragmented genomic DNA or cDNA. The molecules are then self-circularized and digested with the endonuclease, releasing the PET. Before sequencing, these PETs are ligated to adaptors to which PCR primers anneal for amplification. The advantage of cloning-based construction of the library is that it maintains the fragments or cDNA intact for future use. However, its construction process is much longer than that of the cloning-free method. Variations on library construction have been produced by next-generation sequencing companies to suit their respective technologies.


Endonucleases

Unlike most restriction endonucleases, MmeI (type IIS) and EcoP15I (type III) cut downstream of their target binding sites. MmeI cuts 18/20 base pairs downstream and EcoP15I cuts 25/27 base pairs downstream. Because these restriction enzymes bind at their target sequences located in the adaptors, they cut and release vectors that contain short sequences of the fragment or cDNA ligated to them, producing PETs.
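The effect of this digestion on a single fragment can be mimicked in a few lines of code. The following is a minimal sketch rather than a model of the actual biochemistry: the function name `make_ditag`, the linker parameters, and the choice of 18 and 25 bases as tag lengths (the shorter value of each 18/20 and 25/27 pair, ignoring strand overhangs) are assumptions made for illustration.

```python
# Assumed tag lengths for illustration only: the text gives cut positions of
# 18/20 bp (MmeI) and 25/27 bp (EcoP15I); the shorter value is used here and
# strand overhangs are ignored.
TAG_LENGTH = {"MmeI": 18, "EcoP15I": 25}


def make_ditag(fragment: str, enzyme: str,
               linker5: str = "", linker3: str = "") -> str:
    """Return 5' linker + 5' tag + 3' tag + 3' linker for one fragment.

    The intervening sequence between the two tags is discarded, mirroring
    how a PET represents a larger fragment by its two ends.
    """
    tag_len = TAG_LENGTH[enzyme]
    if len(fragment) < 2 * tag_len:
        raise ValueError("fragment is shorter than two tags")
    five_prime_tag = fragment[:tag_len]
    three_prime_tag = fragment[-tag_len:]
    return linker5 + five_prime_tag + three_prime_tag + linker3


if __name__ == "__main__":
    # Toy fragment: distinct bases at each end make the result easy to read.
    toy_fragment = "A" * 18 + "C" * 100 + "G" * 18
    print(make_ditag(toy_fragment, "MmeI"))
```

For the toy fragment, the 100 internal bases are dropped and only the two 18-base ends remain joined together, which is the sense in which a ditag stands in for the whole fragment.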


PET applications

# DNA-PET: Because PETs represent connectivity between the tags, the use of PETs in genome re-sequencing has advantages over the use of single reads. This application is called pairwise end sequencing, known colloquially as ''double-barrel shotgun sequencing''. Anchoring one half of the pair uniquely to a single location in the genome allows mapping of the other half when it is ambiguous, that is, when it maps to more than one location. This increased efficiency reduces the cost of sequencing, as these ambiguous reads would normally be discarded. The connectivity of PET sequences also allows detection of structural variations: insertions, deletions, duplications, inversions, and translocations. During the construction of the PET library, the fragments can be selected to all be of a certain size. After mapping, the two tags of each PET are thus expected to lie a consistent distance apart. A discrepancy from this distance indicates a structural variation between the PET sequences. For example (figure on the right), a deletion in the sequenced genome will produce tags that map further apart than expected in the reference genome, because the reference genome contains a segment of DNA that is not present in the sequenced genome.
# ChIP-PET: The combined use of chromatin immunoprecipitation (ChIP) and PET is used to detect regions of DNA bound by a protein of interest. ChIP-PET has an advantage over single-read sequencing in that it reduces the ambiguity of the reads generated. Its advantage over chip hybridization (ChIP-chip) is that hybridization tiling arrays do not have the statistical sensitivity that sequence reads have. However, ChIP-PET, ChIP-Seq, and ChIP-chip have all been highly successful.
# ChIA-PET: The application of PET sequencing to chromatin interaction analysis. It is a genome-wide strategy for finding ''de novo'' long-range interactions between DNA elements bound by protein factors (Fullwood MJ, Liu MH, Pan YF et al. 2009. An oestrogen-receptor-alpha-bound human chromatin interactome. Nature. 462: 58–64). The first ChIA-PET was developed by Fullwood ''et al.'' (2009) to generate a map of the interactions between chromatin bound by oestrogen receptor α (ER-α) in oestrogen-treated human breast adenocarcinoma cells. ChIA-PET is an unbiased way to analyze interactions and higher-order chromatin structures because it can detect interactions between unknown DNA elements. In contrast, 3C and 4C methods are used to detect interactions involving a specific target region in the genome. ChIA-PET is similar to finding fusion genes through RNA-PET in that the paired tags map to different regions in the genome. However, ChIA-PET involves artificial ligations between different DNA fragments located at different genomic regions, rather than a naturally occurring fusion between two genomic regions as in RNA-PET.
# RNA-PET: This application is used for studying the transcriptome: transcripts, gene structures, and gene expression (Ng P, Wei CL, Sung WK et al. 2005. Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation. Nat. Methods. 2: 105–111). The PET library is generated using full-length cDNAs, so the ditags represent the 5' cap and the 3' polyA tail signatures of individual transcripts. RNA-PET is therefore especially useful for demarcating the boundaries of transcription units, which helps identify alternative transcription start sites and polyadenylation sites of genes. RNA-PET could also be used to detect fusion genes and trans-splicing, but further experiments are needed to distinguish between them (Ruan Y, Ooi HS, Choo SW et al. 2007. Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs). Genome Res. 17: 828–838). Other methods of finding the boundaries of transcripts include the single-tag strategies CAGE, SAGE, and the most recent SuperSAGE, with CAGE and 5' SAGE defining the transcription start sites and 3' SAGE defining the polyadenylation sites. The advantages of PET sequencing over these methods are that PETs identify both ends of the transcripts and, at the same time, provide more specificity when mapping back to the genome. Sequencing the cDNAs can reveal the structures of transcripts in great detail, but this approach is much more expensive than RNA-PET sequencing, especially for characterizing the whole transcriptome. The major limitation of RNA-PET is the lack of information regarding the organization of the internal exons of transcripts. Therefore, RNA-PET is not suitable for detecting alternative splicing. In addition, if the cloning procedure is used to construct the cDNA library before generating the PETs, cDNAs that are difficult to clone (as a result of long transcripts) would have lower coverage. Similarly, transcripts (or transcript isoforms) with low expression levels would likely be under-represented as well.
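The distance check described for DNA-PET in the list above can be sketched as a small classifier. This is an illustrative outline under stated assumptions, not a published pipeline: the `MappedPET` record, the `classify_pet` function, the tolerance parameter, and the output labels are all hypothetical. It uses the rule from the text that tags mapping further apart than expected suggest a deletion in the sequenced genome, adds the converse case (tags mapping closer than expected suggest an insertion), and treats tags on different chromosomes as a possible translocation. Tag orientation, which would be needed to flag inversions, is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedPET:
    """Reference-genome coordinates of the two tags of one PET (hypothetical record)."""
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int


def classify_pet(pet: MappedPET, expected_span: int, tolerance: int) -> str:
    """Label a PET by comparing its mapped span with the selected fragment size."""
    if pet.chrom5 != pet.chrom3:
        # Tags from one fragment landing on different chromosomes.
        return "translocation candidate"
    span = abs(pet.pos3 - pet.pos5)
    if span > expected_span + tolerance:
        # Reference holds extra DNA between the tags: deletion in the sample.
        return "deletion candidate"
    if span < expected_span - tolerance:
        # Reference lacks DNA present between the tags: insertion in the sample.
        return "insertion candidate"
    return "concordant"


if __name__ == "__main__":
    pets = [
        MappedPET("chr1", 1_000, "chr1", 11_000),
        MappedPET("chr1", 1_000, "chr1", 16_000),
        MappedPET("chr1", 1_000, "chr2", 5_000),
    ]
    for pet in pets:
        print(pet, "->", classify_pet(pet, expected_span=10_000, tolerance=500))
```

In practice a single discordant PET is weak evidence; a real analysis would presumably require several PETs supporting the same event before reporting it.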

