Cheminformatics

Cheminformatics (also known as chemoinformatics) refers to the use of physical chemistry theory with computer and information science techniques (so-called "in silico" techniques) in application to a range of descriptive and prescriptive problems in the field of chemistry, including in its applications to biology and related molecular fields. Such in silico techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and in such fields as environmental science and pharmacology, where chemical processes are involved or studied.


History

Cheminformatics has been an active field in various guises since the 1970s and earlier, with activity in academic departments and commercial pharmaceutical research and development departments. The term chemoinformatics was defined in its application to drug discovery by F.K. Brown in 1998:
"Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization."
Since then, both terms, cheminformatics and chemoinformatics, have been used, although, lexicographically, cheminformatics appears to be more frequently used, despite academics in Europe declaring for the variant chemoinformatics in 2006. In 2009, the Journal of Cheminformatics, a prominent Springer journal in the field, was founded by transatlantic executive editors.


Background

Cheminformatics combines the scientific working fields of chemistry, computer science, and information science, for example in the areas of topology, chemical graph theory, information retrieval and data mining in the chemical space. Cheminformatics can also be applied to data analysis in various industries, such as paper and pulp, dyes, and allied industries.


Applications


Storage and retrieval

A primary application of cheminformatics is the storage, indexing, and search of information relating to chemical compounds. The efficient search of such stored information includes topics that are dealt with in computer science, such as data mining, information retrieval, information extraction, and machine learning. Related research topics include:
* Digital libraries
* Unstructured data
* Structured data mining and mining of structured data
** Database mining
** Graph mining
** Molecule mining
** Sequence mining
** Tree mining


File formats

The in silico representation of chemical structures uses specialized formats such as the Simplified Molecular Input Line Entry System (SMILES) or the XML-based Chemical Markup Language. These representations are often used for storage in large chemical databases. While some formats are suited for visual representations in two or three dimensions, others are more suited for studying physical interactions, modeling and docking studies.
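As a rough illustration of how a line notation such as SMILES encodes a structure as text, the sketch below splits a SMILES string into tokens (atoms, bonds, branches, ring-closure digits). It is a minimal sketch, not a full parser: the token pattern covers only a simplified subset of the notation, and the names `TOKEN`, `tokenize` and `count_atom_tokens` are illustrative assumptions rather than part of any toolkit.

```python
import re

# Simplified token pattern (an assumption, not the full SMILES grammar):
# bracket atoms, two-letter halogens, common organic-subset atoms
# (aromatic ones in lower case), bond symbols, branch parentheses,
# ring-closure digits, and the "." that separates disconnected parts.
TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|[-=#:/\\]|[()]|%\d{2}|\d|\."
)

def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unknown characters."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens

def count_atom_tokens(smiles):
    """Count atoms written explicitly in the string (implicit hydrogens are not counted)."""
    return sum(1 for t in tokenize(smiles) if t[0].isalpha() or t[0] == "[")

if __name__ == "__main__":
    for name, smi in [("ethanol", "CCO"), ("acetic acid", "CC(=O)O"), ("benzene", "c1ccccc1")]:
        print(name, tokenize(smi), count_atom_tokens(smi))
```

For real work, an established cheminformatics toolkit should be preferred, since a complete reader also has to handle valence, aromaticity, stereochemistry and charge.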


Virtual libraries

Chemical data can pertain to real or virtual molecules. Virtual libraries of compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. Virtual libraries of classes of compounds (drugs, natural products, diversity-oriented synthetic products) have been generated using the FOG (fragment optimized growth) algorithm. This was done by using cheminformatic tools to train transition probabilities of a Markov chain on authentic classes of compounds, and then using the Markov chain to generate novel compounds that were similar to the training database.
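The general idea of training transition probabilities and then sampling from them can be sketched with a first-order Markov chain over fragment labels. This is a toy illustration only and is not the FOG algorithm itself: the fragment names, the `TRAINING` data, and the functions `train` and `generate` are all hypothetical, and a real generator would work on molecular graphs rather than lists of labels.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"

# Hypothetical training "compounds", each written as an ordered list of fragment labels.
TRAINING = [
    ["phenyl", "amide", "methyl"],
    ["phenyl", "ether", "ethyl"],
    ["pyridyl", "amide", "ethyl"],
]

def train(sequences):
    """Estimate transition probabilities P(next | previous) from observed sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for prev, nxts in counts.items()
    }

def generate(model, rng, max_len=10):
    """Sample one fragment sequence by walking the chain until END or max_len."""
    state, out = START, []
    while len(out) < max_len:
        choices = model[state]
        state = rng.choices(list(choices), weights=list(choices.values()))[0]
        if state == END:
            break
        out.append(state)
    return out

if __name__ == "__main__":
    model = train(TRAINING)
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(5):
        print(generate(model, rng))
```

Because every sampled step follows a transition seen in the training set, the generated sequences stay close to the training data while still allowing combinations that never occurred together, such as a pyridyl start followed by an ether.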


Virtual screening

In contrast to high-throughput screening, virtual screening involves computationally screening in silico libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.
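Docking requires a 3D model of the target and is too involved for a short example, so the sketch below illustrates a different and simpler screening method: ranking library members by Tanimoto similarity to a known active compound. The feature sets, compound names, threshold, and the functions `tanimoto` and `screen` are hypothetical and chosen only to show the mechanics.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def screen(query, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    return sorted((s for s in scored if s[1] >= threshold), key=lambda s: -s[1])

# Hypothetical library: each compound is described by a set of structural features.
LIBRARY = {
    "cmpd_A": {"aromatic_ring", "amide", "hydroxyl"},
    "cmpd_B": {"aromatic_ring", "amide", "halogen", "amine"},
    "cmpd_C": {"ester", "alkyl_chain"},
}
# Hypothetical query: the features of a known active compound.
QUERY = {"aromatic_ring", "amide", "amine"}

if __name__ == "__main__":
    for name, score in screen(QUERY, LIBRARY):
        print(f"{name}: {score:.2f}")
```

In practice the feature sets would typically be bit-vector fingerprints computed by a toolkit, and the similarity cut-off would be tuned for the target and fingerprint type.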


Quantitative structure-activity relationship (QSAR)

This is the calculation of quantitative structure-activity relationship and quantitative structure-property relationship values, used to predict the activity of compounds from their structures. In this context there is also a strong relationship to chemometrics. Chemical expert systems are also relevant, since they represent parts of chemical knowledge as an in silico representation. A relatively new concept is matched molecular pair analysis (MMPA), or prediction-driven MMPA, which is coupled with a QSAR model in order to identify activity cliffs.
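In its simplest form, a QSAR model is a regression from a structural descriptor to a measured activity. The sketch below fits a one-descriptor linear model by ordinary least squares. The descriptor and activity values are made-up numbers for illustration, and `fit_line` and `predict` are hypothetical helper names; real QSAR models use many descriptors and require careful validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new descriptor value."""
    slope, intercept = model
    return slope * x + intercept

# Made-up training data: one descriptor value and one activity value per compound.
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
ACTIVITY = [5.1, 5.9, 7.1, 7.9]

if __name__ == "__main__":
    model = fit_line(DESCRIPTOR, ACTIVITY)
    print("slope, intercept:", model)
    print("predicted activity at descriptor 5.0:", predict(model, 5.0))
```

A prediction far outside the descriptor range of the training compounds, like the value 5.0 here, is an extrapolation and should be treated with caution.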


See also

* Bioinformatics
* Chemical file format
* Chemicalize.org
* Cheminformatics toolkits
* Chemogenomics
* Computational chemistry
* Information engineering
* Journal of Chemical Information and Modeling
* Journal of Cheminformatics
* Materials informatics
* Molecular Conceptor
* Molecular design software
* Molecular graphics
* Molecular Informatics
* Molecular modelling
* Nanoinformatics
* Software for molecular modeling
* WorldWide Molecular Matrix
* Molecular descriptor

