ChEMBL

ChEMBL or ChEMBLdb is a manually curated chemical database of bioactive molecules with drug-like properties. It is maintained by the European Bioinformatics Institute (EBI) of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK. The database, originally known as StARlite, was developed by the biotechnology company Inpharmatica Ltd., which was later acquired by Galapagos NV. The data was acquired for EMBL in 2008 with an award from the Wellcome Trust, resulting in the creation of the ChEMBL chemogenomics group at EMBL-EBI, led by John Overington.


Scope and access

The ChEMBL database contains compound bioactivity data against drug targets. Bioactivity is reported as Ki, Kd, IC50, and EC50 values. Data can be filtered and analyzed to develop compound screening libraries for lead identification during drug discovery. ChEMBL version 2 (ChEMBL_02) was launched in January 2010 and included 2.4 million bioassay measurements covering 622,824 compounds, among them 24,000 natural products. This was obtained from curating over 34,000 publications across twelve medicinal chemistry journals. ChEMBL's coverage of available bioactivity data has grown to become "the most comprehensive ever seen in a public database." In October 2010 ChEMBL version 8 (ChEMBL_08) was launched, with over 2.97 million bioassay measurements covering 636,269 compounds. ChEMBL_10 saw the addition of the PubChem confirmatory assays, in order to integrate data that is comparable to the type and class of data contained within ChEMBL.

ChEMBLdb can be accessed via a web interface or downloaded by File Transfer Protocol (FTP). It is formatted in a manner amenable to computerized data mining, and attempts to standardize activities between different publications to enable comparative analysis. ChEMBL is also integrated into other large-scale chemistry resources, including PubChem and the ChemSpider system of the Royal Society of Chemistry.
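As a rough illustration of the filtering and standardization described above, the sketch below converts activity values reported in different concentration units to a common negative-log scale and keeps only the more potent compounds. The records, field names, compound identifiers, and the potency cutoff are all hypothetical and chosen for the example; they do not reflect the actual ChEMBL schema or any real measurements.

```python
import math

# Hypothetical records; identifiers, fields, and values are made up for illustration.
RECORDS = [
    {"compound": "CPD-1", "type": "IC50", "value": 12.0, "units": "nM"},
    {"compound": "CPD-2", "type": "Ki", "value": 3.5, "units": "uM"},
    {"compound": "CPD-3", "type": "IC50", "value": 850.0, "units": "nM"},
    {"compound": "CPD-4", "type": "EC50", "value": 0.2, "units": "mM"},
]

# Conversion factors from each unit to mol/L.
TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}

# The activity types named in the article.
SUPPORTED_TYPES = {"Ki", "Kd", "IC50", "EC50"}


def p_value(value, units):
    """Return -log10 of a concentration after converting it to mol/L."""
    return -math.log10(value * TO_MOLAR[units])


def select_potent(records, min_p=6.0):
    """Keep records whose negative-log potency is at least min_p.

    min_p=6.0 corresponds to 1 uM; the cutoff is an assumption for this example.
    """
    selected = []
    for rec in records:
        # Skip unsupported activity types and non-positive values.
        if rec["type"] not in SUPPORTED_TYPES or rec["value"] <= 0:
            continue
        p = p_value(rec["value"], rec["units"])
        if p >= min_p:
            selected.append((rec["compound"], round(p, 2)))
    return selected


if __name__ == "__main__":
    for compound, p in select_potent(RECORDS):
        print(compound, p)
```

Putting all values on one scale before filtering is what makes measurements from different publications comparable. A real screening-library workflow would also need to account for assay type and target, which this sketch ignores.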


Associated resources

In addition to the database, the ChEMBL group have developed tools and resources for data mining. These include Kinase SARfari, an integrated chemogenomics workbench focused on kinases. The system incorporates and links sequence, structure, compounds and screening data. GPCR SARfari is a similar workbench focused on GPCRs, and ChEMBL-Neglected Tropical Diseases (ChEMBL-NTD) is a repository for Open Access primary screening and medicinal chemistry data directed at endemic tropical diseases of the developing regions of Africa, Asia, and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. July 2012 saw the release of a new malaria data service, sponsored by the Medicines for Malaria Venture (MMV) and aimed at researchers around the globe. The data in this service includes compounds from the Malaria Box screening set, as well as the other donated malaria data found in ChEMBL-NTD.

The ChEMBL virtual machine was released in October 2013 to give users access to a complete, free, and easy-to-install cheminformatics infrastructure. In December 2013, the operations of the SureChem patent informatics database were transferred to EMBL-EBI. In a portmanteau, SureChem was renamed SureChEMBL. 2014 saw the introduction of the new resource ADME SARfari, a tool for predicting and comparing cross-species ADME targets.


See also


* ChEMBL: Quick Tour on EBI Train OnLine
* ChEBI
* DrugBank


External links


* ChEMBLdb
* Kinase SARfari
* ChEMBL-Neglected Tropical Disease Archive
* GPCR SARfari
* The ChEMBL-og: open data and drug discovery blog run by the ChEMBL team.