The Info List - DNA-encoded Chemical Library


DNA-encoded chemical libraries (DEL) are a technology for the synthesis and screening, on an unprecedented scale, of collections of small-molecule compounds. DEL is used in medicinal chemistry to bridge the fields of combinatorial chemistry and molecular biology. The aim of DEL technology is to accelerate the drug discovery process, in particular early-phase discovery activities such as target validation and hit identification.

DEL technology involves the conjugation of chemical compounds or building blocks to short DNA fragments that serve as identification bar codes and, in some cases, also direct and control the chemical synthesis. The technique enables the mass creation and interrogation of libraries via affinity selection, typically on an immobilized protein target. A homogeneous method for screening DNA-encoded libraries has recently been developed which uses water-in-oil emulsion technology to isolate, count and identify individual ligand-target complexes in a single-tube approach. In contrast to conventional screening procedures such as high-throughput screening, biochemical assays are not required for binder identification, which in principle allows the isolation of binders to a wide range of proteins that have historically been difficult to tackle with conventional screening technologies. Thus, in addition to the general discovery of target-specific compounds, the availability of binders to pharmacologically important but so far “undruggable” target proteins opens new possibilities for developing drugs for diseases that could not previously be treated. Because the requirement to assess the activity of hits at the outset is eliminated, it is hoped and expected that many of the high-affinity binders identified will prove to be active when selected hits are analyzed independently, offering an efficient method to identify high-quality hits and pharmaceutical leads.


FIG. 1 DNA-ENCODED LIBRARY DISPLAYING CHEMICAL COMPOUNDS. Schematic representation of a DNA-encoded library displaying chemical compounds directly attached to oligonucleotides. a) Library generated by “stepwise combinatorial” assembly, presenting a single oligonucleotide covalently linked to a putative binding molecule. b) Library constructed in “combinatorial self-assembling” fashion (Encoded Self-Assembling Chemical library). Multiple pairing oligonucleotides each display a covalently linked binding molecule.

The concept of DNA encoding was first described in a theoretical paper by Brenner and Lerner in 1992, in which it was proposed to link each molecule of a chemically synthesized entity to a particular oligonucleotide sequence constructed in parallel, and to use this encoding genetic tag to identify and enrich active compounds. In 1993 the first practical implementation of this approach was presented by S. Brenner and K. Janda, and similarly by the group of M.A. Gallop. Brenner and Janda suggested generating individual encoded library members by an alternating parallel combinatorial synthesis of the heteropolymeric chemical compound and the appropriate oligonucleotide sequence on the same bead in a “split-&-pool” […] through rounds of selection, amplification and translation.



In order to apply combinatorial chemistry to the synthesis of DNA-encoded chemical libraries, a split-&-pool approach was pursued. Initially, a set of unique DNA oligonucleotides (N), each containing a specific coding sequence, is chemically conjugated to a corresponding set of small organic molecules. Subsequently, the oligonucleotide-conjugate compounds are mixed (“pool”) and divided (“split”) into a number of groups (M). Under appropriate conditions, a second set of building blocks (m) is coupled to the first one, and a further oligonucleotide coding for the second modification is enzymatically introduced before mixing again. This “split-&-pool” […]

FIG. 3 DNA-ENCODED LIBRARY BY “SPLIT-&-POOL” STEPWISE COUPLING OF CODING DNA FRAGMENTS TO NASCENT ORGANIC MOLECULES. An initial set of multifunctional building blocks (FGn represents the different orthogonal functional groups) is covalently conjugated to a corresponding encoding oligonucleotide and reacted in a split-&-pool fashion on a specific functional group (FG1, in red) with a suitable collection of reagents. Following enzymatic encoding, a further round of split-&-pool […]

FIG. 4 ESAC LIBRARY TECHNOLOGY OVERVIEW. Small organic molecules are coupled to 5′-amino-modified oligonucleotides containing a hybridization domain and a unique coding sequence, which ensures the identification of the coupled molecule. The ESAC library can be used in single-pharmacophore format (a), in affinity maturation of known binders (b), or in de novo selections of binding molecules by self-assembly of sublibraries in DNA double-strand format (c) as well as in DNA triplexes (d). The ESAC library in the selected format is used in a selection and read-out procedure (e). Following incubation of the library (i) with the target protein of choice (ii) and washing away of unbound molecules (iii), the oligonucleotide codes of the binding compounds are PCR-amplified and compared with the library without selection on oligonucleotide microarrays (iv, v). Identified binders or binding pairs are validated after conjugation (if appropriate) to suitable scaffolds (vi).
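The bookkeeping behind split-&-pool encoding can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the building-block names, the four-letter codes and the function `split_and_pool` are hypothetical and do not come from any published protocol. The sketch shows that one building block per round, tagged by concatenating the round codes, gives a library whose size is the product of the set sizes and whose tags can be read back unambiguously.

```python
from itertools import product

# Hypothetical building blocks and DNA codes (illustrative only).
ROUND_1 = {"A1": "ACGT", "A2": "TGCA", "A3": "GATC"}
ROUND_2 = {"B1": "AACC", "B2": "GGTT"}


def split_and_pool(*rounds):
    """Enumerate every (tag, compound) pair produced by split-and-pool.

    Each round maps a building-block name to its coding sequence. A library
    member uses one building block per round; its tag is the concatenation
    of the corresponding codes in the order the rounds were performed.
    """
    library = {}
    for combo in product(*(r.items() for r in rounds)):
        compound = "-".join(name for name, _ in combo)
        tag = "".join(code for _, code in combo)
        library[tag] = compound
    return library


library = split_and_pool(ROUND_1, ROUND_2)
print(len(library))         # 3 x 2 = 6 members
print(library["ACGTAACC"])  # A1-B1
```

Because every code in this toy example has the same length, a tag can be decoded simply by cutting it into fixed-width pieces; real libraries may use other tag designs.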

Encoded Self-Assembling Chemical (ESAC) libraries rely on the principle that two sublibraries of X members each (e.g. 10^3), containing a constant complementary hybridization domain, can yield after hybridization a combinatorial DNA-duplex library with a complexity of X^2 uniformly represented library members (e.g. 10^6). Each sublibrary member consists of an oligonucleotide containing a variable coding region flanked by a constant DNA sequence and carrying a suitable chemical modification at the oligonucleotide extremity. The ESAC sublibraries can be used in at least four different embodiments.

* A sublibrary can be paired with a complementary oligonucleotide and used as a DNA-encoded library displaying a single covalently linked compound for affinity-based selection experiments.
* A sublibrary can be paired with an oligonucleotide displaying a known binder to the target, thus enabling affinity maturation strategies.
* Two individual sublibraries can be assembled combinatorially and used for the de novo identification of bidentate binding molecules.
* Three different sublibraries can be assembled to form a combinatorial triplex library.
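The X^2 complexity of the two-sublibrary embodiment follows from pairing every member of one sublibrary with every member of the other. The Python sketch below makes that explicit. It assumes, for illustration only, that each sublibrary can be modelled as a mapping from a coding sequence to a compound label; the codes, labels and the function name `esac_duplex_library` are hypothetical.

```python
from itertools import product


def esac_duplex_library(sublibrary_a, sublibrary_b):
    """Pair every member of one sublibrary with every member of the other.

    Each argument maps a coding sequence to the compound it carries.
    The result maps (code_a, code_b) to the displayed pair of compounds.
    """
    return {
        (code_a, code_b): (cmpd_a, cmpd_b)
        for (code_a, cmpd_a), (code_b, cmpd_b)
        in product(sublibrary_a.items(), sublibrary_b.items())
    }


# Hypothetical sublibraries of X = 3 members each.
sub_a = {"AAC": "a1", "AGT": "a2", "ATG": "a3"}
sub_b = {"CCA": "b1", "CGT": "b2", "CTG": "b3"}

duplexes = esac_duplex_library(sub_a, sub_b)
print(len(duplexes))  # X^2 = 9
```

With X = 10^3 members per sublibrary, the same pairing gives the 10^6 duplexes quoted in the text.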

Preferential binders isolated from an affinity-based selection can be PCR-amplified and decoded on complementary oligonucleotide microarrays or by concatenation of the codes, subcloning and sequencing. The individual building blocks can eventually be conjugated using suitable linkers to yield a drug-like high-affinity compound. The characteristics of the linker (e.g. length, flexibility, geometry, chemical nature and solubility) influence the binding affinity and the chemical properties of the resulting binder (FIG. 3).

Bio-panning experiments on human serum albumin (HSA) with a 600-member ESAC library allowed the isolation of the 4-(p-iodophenyl)butanoic moiety. The compound represents the core structure of a series of portable albumin-binding molecules and of Albufluor™, a recently developed fluorescein angiographic contrast agent currently under clinical evaluation.

ESAC technology has been used for the isolation of potent inhibitors of bovine trypsin and for the identification of novel inhibitors of stromelysin-1 (MMP-3), a matrix metalloproteinase involved in both physiological and pathological tissue remodeling processes, as well as in disease processes such as arthritis and metastasis.



In 2004, D.R. Halpin and P.B. Harbury presented a novel method for the construction of DNA-encoded libraries. For the first time, the DNA-conjugated templates served for both encoding and programming the infrastructure of the “split-&-pool” […]

FIG. 2 DNA-ENCODED LIBRARY BY “DNA-TEMPLATED SYNTHESIS”. A library of oligonucleotides (i.e. 64 different oligonucleotides) containing three coding regions was hybridized to a library of reagent compound-oligonucleotide conjugates (i.e. 4 reagent oligonucleotide conjugates) capable of pairing with the initial coding domain of the template oligonucleotide. After transfer of the compounds onto the corresponding oligonucleotide template, the synthesis cycle was repeated the desired number of times with further sets of carrier compound-oligonucleotide conjugates (i.e. two rounds with four carrier compound-oligonucleotide conjugates per round). Subsequently, functional selection was performed and the sequence of the binding template was amplified by PCR. DNA sequencing then allowed the identification of the binding molecule.

In 2001 David Liu and co-workers showed that complementary DNA oligonucleotides can be used to assist certain synthetic reactions which do not take place efficiently in solution at low concentration. A DNA heteroduplex was used to accelerate the reaction between chemical moieties displayed at the extremities of the two DNA strands. Furthermore, the “proximity effect”, which accelerates the bimolecular reaction, was shown to be distance-independent (at least within a distance of 30 nucleotides). In a sequence-programmed fashion, oligonucleotides carrying one chemical reactant group were hybridized to complementary oligonucleotide derivatives carrying a different reactive chemical group. The proximity conferred by DNA hybridization drastically increases the effective molarity of the reagents attached to the oligonucleotides, enabling the desired reaction to occur even in an aqueous environment at concentrations several orders of magnitude lower than those needed for the corresponding conventional, non-DNA-templated organic reaction. Using a DNA-templated set-up and sequence-programmed synthesis, Liu and co-workers generated a 64-member DNA-encoded library of macrocycles.


The YoctoReactor (yR) is a 3D proximity-driven approach which exploits the self-assembling nature of DNA oligonucleotides into 3-, 4- or 5-way junctions to direct small-molecule synthesis at the center of the junction. Figure 5 illustrates the basic concept with a 4-way DNA junction.

FIG. 5 FUNDAMENTAL PRINCIPLE OF THE YOCTOREACTOR. The center of 3-, 4- and 5-way DNA junctions (a 4-way junction is shown here) becomes a yoctoliter-scale reactor where small-molecule synthesis is facilitated, in what has been termed the YoctoReactor (yR). Colored circles depict the chemical building blocks (BB), which are attached to carefully designed DNA oligonucleotides (black lines). Upon DNA annealing, the BBs are brought into proximity at the center of the DNA junction, where they undergo chemical reaction.

The center of the DNA junction constitutes a volume on the order of a yoctoliter, hence the name YoctoReactor. This volume contains a single-molecule reaction, yielding reaction concentrations in the high mM range. The effective concentration facilitated by the DNA greatly accelerates chemical reactions that would otherwise not take place at the actual bulk concentration, which is several orders of magnitude lower.
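The link between a yoctoliter-scale volume and a millimolar-to-molar concentration is plain arithmetic: one molecule confined to a volume V corresponds to a molarity of 1 / (N_A × V). The Python sketch below evaluates this for a few junction volumes. The volumes are assumptions chosen only to illustrate the scale, since the text says no more than “on the order of a yoctoliter”.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
YOCTOLITER = 1e-24        # liters


def single_molecule_molarity(volume_liters):
    """Molar concentration of exactly one molecule in the given volume."""
    return 1.0 / (AVOGADRO * volume_liters)


# Assumed junction volumes, chosen only to illustrate the scale.
for volume_yl in (1, 5, 10):
    molar = single_molecule_molarity(volume_yl * YOCTOLITER)
    print(f"{volume_yl:>2} yL -> {molar * 1e3:7.1f} mM")
```

One molecule in 1 yL corresponds to roughly 1.7 M, and one molecule in 10 yL to roughly 170 mM, which is consistent with the “high mM range” quoted above.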


Figure 6 illustrates the generation of a yR library using a 3-way DNA junction.

FIG. 6 YOCTOREACTOR LIBRARY ASSEMBLY. Stepwise assembly of a DEL using YoctoReactor technology. A 3-way reactor is shown here. (a) The position 1 (P1) and P2 BBs are brought into proximity and undergo a chemical reaction in the presence of a helper oligonucleotide in P3. (b) The structure is purified by polyacrylamide gel electrophoresis (PAGE), the P1 and P2 DNA is ligated and the P2 linker is cleaved. (c) The P3 BB is annealed to the P1-P2 ligation product from step (b), and a chemical reaction between the P2 and P3 BBs takes place. (d) The reaction product is purified by PAGE, the DNA is ligated and the P3 linker is cleaved, yielding a compound (OOO) covalently attached to the folded yR. (e) The yR is dismantled by primer extension, yielding a double-stranded display product that exposes the reaction product for selection and molecular evolution.

In summary, chemical building blocks (BB) are attached via cleavable or non-cleavable linkers to three types of bispecific DNA oligonucleotides (oligo-BBs), one for each arm of the yR. To facilitate synthesis in a combinatorial manner, the oligo-BBs are designed such that the DNA contains (a) the code for the attached BB at the distal end of the oligo (colored lines) and (b) areas of constant DNA sequence (black lines) that bring about the self-assembly of the DNA into a 3-way junction (independently of the BB) and the subsequent chemical reaction. Chemical reactions are performed in a stepwise procedure; after each step the DNA is ligated and the product purified by polyacrylamide gel electrophoresis. Cleavable linkers (BB-DNA) are used for all but one position, yielding a library of small molecules with a single covalent link to the DNA code. Table 1 outlines how libraries of different sizes can be generated using yR technology.

TABLE 1. YOCTOREACTOR LIBRARY SIZE. yR library size is a function of the number of different functionalized oligos used in each position and the number of positions in the DNA […]
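The dependence of library size on the number of positions and on the number of functionalized oligos per position can be written as a simple product. The Python sketch below assumes that every combination of one oligo-BB per position forms a product. The figures of 100 oligos per position are hypothetical and are not taken from Table 1.

```python
from math import prod


def yr_library_size(oligos_per_position):
    """Number of distinct products: one oligo-BB chosen per junction position."""
    return prod(oligos_per_position)


# Hypothetical numbers of functionalized oligos per position.
print(yr_library_size([100, 100, 100]))       # 3-way junction: 1000000
print(yr_library_size([100, 100, 100, 100]))  # 4-way junction: 100000000
```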

The yR design approach provides an unvarying reaction site with regard to both (a) the distance between reactants and (b) the sequence environment surrounding the reaction site. Furthermore, the intimate connection between the code and the BB on the oligo-BB moieties, which are mixed combinatorially in a single pot, confers high fidelity on the encoding of the library. In addition, the code of the synthesized products is not preset but is assembled combinatorially and synthesized in synchronicity with the innate product.


A homogeneous method for screening YoctoReactor (yR) libraries has recently been developed which uses water-in-oil emulsion technology to isolate individual ligand-target complexes. In this method, called Binder Trap Enrichment (BTE), ligands to a protein target are identified by trapping binding pairs (DNA-labelled protein target and yR ligand) in emulsion droplets during dissociation-dominated kinetics. Once trapped, the target and ligand DNA are joined by ligation, thus preserving the binding information.

Thereafter, identification of hits is essentially a counting exercise: information on binding events is deciphered by sequencing and counting the joined DNA, and selective binders are counted with a much higher frequency than random binders. This is possible because random trapping of target and ligand is “diluted” by the high number of water droplets in the emulsion. The low noise and background signal characteristic of BTE are attributed to this “dilution” of the random signal, the lack of surface artifacts and the high fidelity of the yR library and screening method. Screening is performed in a single tube. Biologically active hits are identified in a single round of BTE, which is characterized by a low false-positive rate.
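The “counting exercise” can be illustrated with a short Python sketch. It assumes that sequencing has already been reduced to one ligand code per joined target-ligand DNA molecule, and it uses an evenly spread read count as a deliberately naive baseline for random trapping. The codes, the read counts, the library size and the function `rank_ligands` are all hypothetical and are not the published BTE analysis.

```python
from collections import Counter


def rank_ligands(joined_reads, library_size):
    """Count ligand codes in joined target-ligand reads.

    Returns (code, count, fold_over_uniform) tuples, highest count first.
    fold_over_uniform compares the observed count with the count expected
    if the reads were spread evenly over the whole library.
    """
    counts = Counter(joined_reads)
    expected = len(joined_reads) / library_size
    return [(code, n, n / expected) for code, n in counts.most_common()]


# Hypothetical sequencing result: each entry is the ligand code recovered
# from one joined target-ligand DNA molecule.
reads = ["L7"] * 40 + ["L2"] * 3 + ["L9"] * 2 + ["L4", "L5", "L1", "L3", "L8"]

ranked = rank_ligands(reads, library_size=1000)
for code, n, fold in ranked[:3]:
    print(code, n, round(fold))
```

In this toy data set the code `L7` accounts for 40 of 50 reads, far above the even-spread expectation, which is the kind of signal that would flag a selective binder.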

BTE mimics the non-equilibrium nature of in vivo ligand-target interactions and offers the unique possibility of screening for target-specific ligands on the basis of ligand-target residence time, because the emulsion that traps the binding complex is formed during a dynamic dissociation phase.


Following selection from DNA-encoded chemical libraries, a decoding strategy for the fast and efficient identification of the specific binding compounds is crucial for the further development of DEL technology. So far, Sanger-sequencing-based decoding, microarray-based methodology and high-throughput sequencing techniques have been the main methodologies for decoding DNA-encoded library selections.


Although many authors implicitly envisaged traditional Sanger-sequencing-based decoding, sequencing a number of codes commensurate with the complexity of the library is an unrealistic task for a traditional Sanger sequencing approach. Nevertheless, the implementation of Sanger sequencing for decoding DNA-encoded chemical libraries in high-throughput fashion was the first to be described. After selection and PCR amplification of the DNA tags of the library compounds, concatamers containing multiple coding sequences were generated and ligated into a vector. Subsequent Sanger sequencing of a representative number of the resulting colonies revealed the frequencies of the codes present in the DNA-encoded library sample before and after selection.


A DNA microarray is a device for high-throughput investigations widely used in molecular biology and in medicine. It consists of an arrayed series of microscopic spots (“features” or “locations”), each containing a few picomoles of oligonucleotides carrying a specific DNA sequence. This can be a short section of a gene or another DNA element that is used as a probe to hybridize a DNA or RNA sample under suitable conditions. Probe-target hybridization is usually detected and quantified by fluorescence-based detection of fluorophore-labeled targets to determine the relative abundance of the target nucleic acid sequences. Microarrays have been used for the successful decoding of ESAC DNA-encoded libraries. The coding oligonucleotides representing the individual chemical compounds in the library are spotted and chemically linked onto the microarray slides using a BioChip Arrayer robot. Subsequently, the oligonucleotide tags of the binding compounds isolated from the selection are PCR-amplified using a fluorescent primer and hybridized onto the DNA-microarray slide. The microarrays are then analyzed with a laser scanner, and spot intensities are detected and quantified. The enrichment of the preferential binding compounds is revealed by comparing the spot intensities of the DNA-microarray slide before and after selection.
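The before/after comparison of spot intensities amounts to a per-compound ratio. The Python sketch below assumes background-corrected intensities keyed by compound code. The intensity values, the `floor` guard against near-zero denominators and the function name `spot_enrichment` are illustrative assumptions rather than part of any published microarray protocol.

```python
def spot_enrichment(before, after, floor=1.0):
    """Ratio of spot intensity after selection to intensity before selection.

    `before` and `after` map a compound code to a background-corrected
    fluorescence intensity. `floor` guards against division by very small
    values; it is an arbitrary choice for this sketch.
    """
    return {
        code: after.get(code, 0.0) / max(before[code], floor)
        for code in before
    }


# Hypothetical intensities in arbitrary fluorescence units.
before = {"c001": 520.0, "c002": 480.0, "c003": 505.0}
after = {"c001": 490.0, "c002": 9600.0, "c003": 250.0}

ratios = spot_enrichment(before, after)
best = max(ratios, key=ratios.get)
print(best, ratios[best])  # c002 20.0
```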


Given the complexity of a DNA-encoded chemical library (typically between 10^3 and 10^6 members), conventional Sanger-sequencing-based decoding is unlikely to be usable in practice, owing both to the high cost per base of sequencing and to the tedious procedure involved. High-throughput sequencing technologies exploit strategies that parallelize the sequencing process, displacing the use of capillary electrophoresis and producing thousands or millions of sequences at once. In 2008, the first application of a high-throughput sequencing technique originally developed for genome sequencing (“454 technology”) to the fast and efficient decoding of a DNA-encoded chemical library comprising 4000 compounds was described. This study led to the identification of novel chemical compounds with submicromolar dissociation constants towards streptavidin and demonstrated the feasibility of constructing, performing selections on, and decoding DNA-encoded libraries containing millions of chemical compounds.
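Decoding by high-throughput sequencing comes down to extracting the code from each read and comparing code frequencies before and after selection. The Python sketch below assumes, purely for illustration, that each code has a fixed length and sits between two constant flanking sequences. The flank sequences, code length, reads and helper names are hypothetical and do not describe the 454-based study mentioned above.

```python
from collections import Counter

# Hypothetical constant flanking sequences and code length.
LEFT, RIGHT, CODE_LEN = "GGATCC", "GAATTC", 4


def extract_code(read):
    """Return the code between the constant flanks, or None if malformed."""
    start = read.find(LEFT)
    if start < 0:
        return None
    start += len(LEFT)
    code = read[start:start + CODE_LEN]
    if len(code) == CODE_LEN and read.startswith(RIGHT, start + CODE_LEN):
        return code
    return None


def code_frequencies(reads):
    """Relative frequency of each code among the reads that decode cleanly."""
    counts = Counter(c for c in map(extract_code, reads) if c is not None)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def make_read(code):
    """Build a toy read: padding, left flank, code, right flank, padding."""
    return "TT" + LEFT + code + RIGHT + "AA"


naive = [make_read(c) for c in ["ACGT", "TTAA", "CCGG", "GATC"] * 5]
selected = [make_read("ACGT")] * 16 + [make_read("TTAA")] * 4 + ["NNNN"]

f_before, f_after = code_frequencies(naive), code_frequencies(selected)
fold = {c: f_after.get(c, 0.0) / f_before[c] for c in f_before}
print(fold["ACGT"])  # 0.8 / 0.25 = 3.2
```

Reads that do not contain both flanks, such as the `NNNN` entry, are discarded before frequencies are computed, so malformed reads do not distort the comparison.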


* Drug discovery
* High-throughput screening
* Combinatorial chemistry
* DNA sequencing
* Phage display


* Smith GP (June 1985). "Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface". Science. 228 (4705): 1315–7. PMID 4001944. doi:10.1126/science.4001944.
* Hoogenboom HR (2002). "Overview of antibody phage-display technology and its applications". Methods Mol. Biol. 178: 1–37. PMID 11968478. doi:10.1385/1-59259-240-6:001.
* Brenner S, Lerner RA (June 1992). "Encoded combinatorial chemistry". Proc. Natl. Acad. Sci. U.S.A. 89 (12): 5381–3. PMC 49295. PMID 1608946. doi:10.1073/pnas.89.12.5381.
* Nielsen J, Brenner S, Janda KD (1993). "Synthetic methods for the implementation of encoded combinatorial chemistry". Journal of the American Chemical Society. 115 (21): 9812–9813. doi:10.1021/ja00074a063.
* Needels MC, Jones DG, Tate EH, Heinkel GL, Kochersperger LM, Dower WJ, Barrett RW, Gallop MA (November 1993). "Generation and screening of an oligonucleotide-encoded synthetic peptide library". Proc. Natl. Acad. Sci. U.S.A. 90 (22): 10700–4. PMC 47845. PMID 7504279. doi:10.1073/pnas.90.22.10700.
* Mukund S. Chorghade (2006). Drug discovery and development. New York: Wiley-Interscience. pp. 129–167. ISBN 0-471-39848-9.
* Heitner TR, Nansen NJ (2009). "Streamlining hit discovery and optimization with a yoctoliter scale DNA reactor". Expert Opinion on Drug Discovery. 4 (11): 1201–1213. doi:10.1517/17460440903206940.
* Mannocci L, Zhang Y, Scheuermann J, Leimbacher M, De Bellis G, Rizzi E, Dumelin C, Melkko S, Neri D (November 2008). "High-throughput sequencing allows the identification of binding molecules isolated from DNA-encoded chemical libraries". Proc. Natl. Acad. Sci. U.S.A. 105 (46): 17670–5. PMC 2584757. PMID 19001273. doi:10.1073/pnas.0805130105.
* Buller F, Mannocci L, Zhang Y, Dumelin CE, Scheuermann J, Neri D (November 2008). "Design and synthesis of a novel DNA-encoded chemical library using Diels-Alder cycloadditions". Bioorg. Med. Chem. Lett. 18 (22): 5926–31. PMID 18674904. doi:10.1016/j.bmcl.2008.07.038.
* Melkko S, Scheuermann J, Dumelin CE, Neri D (May 2004). "Encoded self-assembling chemical libraries". Nat. Biotechnol. 22 (5): 568–74. PMID 15097996. doi:10.1038/nbt961.
* Lovrinovic M, Niemeyer CM (May 2005). "DNA microarrays as decoding tools in combinatorial chemistry and chemical biology". Angew. Chem. Int. Ed. Engl. 44 (21): 3179–83. PMID 15861437. doi:10.1002/anie.200500645.
* Dumelin CE, Trüssel S, Buller F, Trachsel E, Bootz F, Zhang Y, Mannocci L, Beck SC, Drumea-Mirancea M, Seeliger MW, Baltes C, Müggler T, Kranz F, Rudin M, Melkko S, Scheuermann J, Neri D (2008). "A portable albumin binder from a DNA-encoded chemical library". Angew. Chem. Int. Ed. Engl. 47 (17): 3196–201. PMID 18366035. doi:10.1002/anie.200704936.
* Melkko S, Zhang Y, Dumelin CE, Scheuermann J, Neri D (2007). "Isolation of high-affinity trypsin inhibitors from a DNA-encoded chemical library". Angew. Chem. Int. Ed. Engl. 46 (25): 4671–4. PMID 17497616. doi:10.1002/anie.200700654.
* Halpin DR, Harbury PB (July 2004). "DNA display I. Sequence-encoded routing of DNA populations". PLoS Biol. 2 (7): E173. PMC 434148. PMID 15221027. doi:10.1371/journal.pbio.0020173.
* Halpin DR, Harbury PB (July 2004). "DNA display II. Genetic manipulation of combinatorial chemistry libraries for small-molecule evolution". PLoS Biol. 2 (7): E174. PMC 434149. PMID 15221028. doi:10.1371/journal.pbio.0020174.
* Gartner ZJ, Liu DR (July 2001). "The generality of DNA-templated synthesis as a basis for evolving non-natural small molecules". J. Am. Chem. Soc. 123 (28): 6961–3. PMC 2820563. PMID 11448217. doi:10.1021/ja015873n.
* Calderone CT, Puckett JW, Gartner ZJ, Liu DR (November 2002). "Directing otherwise incompatible reactions in a single solution by using DNA-templated organic synthesis". Angew. Chem. Int. Ed. Engl. 41 (21): 4104–8. PMID 12412096. doi:10.1002/1521-3773(20021104)41:213.0.CO;2-O.
* Li X, Liu DR (September 2004). "DNA-templated organic synthesis: nature's strategy for controlling chemical reactivity applied to synthetic molecules". Angew. Chem. Int. Ed. Engl. 43 (37): 4848–70. PMID 15372570. doi:10.1002/anie.200400656.
* Gartner ZJ, Tse BN, Grubina R, Doyon JB, Snyder TM, Liu DR (September 2004). "DNA-templated organic synthesis and selection of a library of macrocycles". Science. 305 (5690): 1601–5. PMC 2814051. PMID 15319493. doi:10.1126/science.1102629.
* Sanger F, Nicklen S, Coulson AR (December 1977). "DNA sequencing with chain-terminating inhibitors". Proc. Natl. Acad. Sci. U.S.A. 74 (12): 5463–7. PMC 431765. PMID 271968. doi:10.1073/pnas.74.12.54