
Cheminformatics (also known as chemoinformatics) refers to the use of physical chemistry theory with computer and information science techniques (so-called ''in silico'' techniques) in application to a range of descriptive and prescriptive problems in the field of chemistry, including in its applications to biology and related molecular fields. Such ''in silico'' techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and in such fields as environmental science and pharmacology, where chemical processes are involved or studied.


History

Cheminformatics has been an active field in various guises since the 1970s and earlier, with activity in academic departments and commercial pharmaceutical research and development departments. The term chemoinformatics was defined in its application to drug discovery by F.K. Brown in 1998:
Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization.
Since then, both terms, cheminformatics and chemoinformatics, have been used. Lexicographically, cheminformatics appears to be the more frequently used, despite academics in Europe declaring for the variant chemoinformatics in 2006. In 2009, the Journal of Cheminformatics, a prominent Springer journal in the field, was founded by transatlantic executive editors.


Background

Cheminformatics combines the scientific working fields of chemistry, computer science, and information science, for example in the areas of topology, chemical graph theory, information retrieval and data mining in the chemical space. Cheminformatics can also be applied to data analysis for various industries such as paper and pulp, dyes and allied industries.


Applications


Storage and retrieval

A primary application of cheminformatics is the storage, indexing, and search of information relating to chemical compounds. The efficient search of such stored information includes topics that are dealt with in computer science, such as data mining, information retrieval, information extraction, and machine learning. Related research topics include:
* Digital libraries
* Unstructured data
* Structured data mining and mining of structured data
** Database mining
** Graph mining
** Molecule mining
** Sequence mining
** Tree mining


File formats

The ''in silico'' representation of chemical structures uses specialized formats such as the Simplified Molecular Input Line Entry System (SMILES) or the XML-based Chemical Markup Language. These representations are often used for storage in large chemical databases. While some formats are suited for visual representations in two or three dimensions, others are more suited for studying physical interactions, modeling and docking studies.
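To make the idea of a line notation concrete, the following is a minimal sketch of how a SMILES string might be split into tokens and its heavy atoms counted. It assumes only a small subset of SMILES (common organic-subset atoms, simple bracket atoms, bonds, branches and ring-closure digits) and is not a full parser; the names `tokenize` and `heavy_atom_counts` are illustrative and do not come from any particular toolkit.

```python
import re
from collections import Counter

# One alternative per token kind; covers only a small SMILES subset (assumption).
TOKEN = re.compile(
    r"(?P<bracket>\[[^\]]+\])"                 # bracket atom, e.g. [NH4+]
    r"|(?P<atom>Cl|Br|[BCNOPSFI]|[bcnops])"    # organic-subset and aromatic atoms
    r"|(?P<bond>[-=#:/\\])"                    # bond symbols
    r"|(?P<ring>%\d{2}|\d)"                    # ring-closure labels
    r"|(?P<branch>[()])"                       # branch open/close
    r"|(?P<dot>\.)"                            # disconnected components
)
# Element symbol inside a bracket atom, after an optional isotope number.
BRACKET_ELEMENT = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def tokenize(smiles):
    """Split a SMILES string into tokens; raise ValueError on unknown input."""
    tokens, pos = [], 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element; implicit hydrogens are ignored."""
    counts = Counter()
    for token in tokenize(smiles):
        if token.startswith("["):
            element = BRACKET_ELEMENT.match(token).group(1).capitalize()
        elif token[0].isalpha():
            element = token.capitalize()  # aromatic 'c' is counted as 'C'
        else:
            continue  # bonds, ring closures, branches, dots
        if element != "H":
            counts[element] += 1
    return dict(counts)


if __name__ == "__main__":
    # Aspirin written as SMILES.
    print(heavy_atom_counts("CC(=O)Oc1ccccc1C(=O)O"))
```

In practice a cheminformatics toolkit would also build the molecular graph, handle stereochemistry and validate valences; this sketch only shows that the notation is regular enough to be processed as plain text.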


Virtual libraries

Chemical data can pertain to real or virtual molecules. Virtual libraries of compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. Virtual libraries of classes of compounds (drugs, natural products, diversity-oriented synthetic products) were recently generated using the FOG (fragment optimized growth) algorithm. This was done by using cheminformatic tools to train transition probabilities of a Markov chain on authentic classes of compounds, and then using the Markov chain to generate novel compounds that were similar to the training database.
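The Markov chain idea described above can be illustrated with a toy sketch. This is not the FOG algorithm itself: it assumes each molecule has already been reduced to an ordered list of fragment labels (the labels and training sequences below are hypothetical), estimates first-order transition probabilities from those lists, and then samples new sequences from them.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # sentinel states marking sequence boundaries


def train_transitions(sequences):
    """Estimate first-order transition probabilities from fragment sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, nxts in counts.items():
        total = sum(nxts.values())
        probs[prev] = {nxt: n / total for nxt, n in nxts.items()}
    return probs


def generate(probs, rng, max_len=10):
    """Sample one fragment sequence by walking the chain from START."""
    state, out = START, []
    while len(out) < max_len:
        choices = list(probs[state])
        weights = [probs[state][c] for c in choices]
        state = rng.choices(choices, weights=weights)[0]
        if state == END:
            break
        out.append(state)
    return out


# Hypothetical training set: fragment labels are placeholders, not real data.
TRAINING = [
    ["phenyl", "amide", "methyl"],
    ["phenyl", "ester", "ethyl"],
    ["pyridyl", "amide", "ethyl"],
]

if __name__ == "__main__":
    model = train_transitions(TRAINING)
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(3):
        print(generate(model, rng))
```

Because every sampled step follows a transition seen in training, generated sequences resemble the training set locally while still recombining fragments in new ways, which is the property the paragraph above describes.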


Virtual screening

In contrast to high-throughput screening, virtual screening involves computationally screening ''in silico'' libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.
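Docking requires three-dimensional structures and a scoring function, so it is hard to show briefly. As a simpler stand-in, the sketch below ranks a library by Tanimoto similarity to a known active compound, a common ligand-based form of virtual screening. The fingerprints are assumed to be sets of "on" bit positions, and the compound names and bit values are hypothetical placeholders.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as bit sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(library, query, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(fp, query)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each set lists the bit positions that are set.
LIBRARY = {
    "cmpd_a": {1, 2, 3, 4, 5},
    "cmpd_b": {1, 2, 7, 8},
    "cmpd_c": {1, 2, 3},
}
QUERY = {1, 2, 3, 4}  # fingerprint of an assumed known active

if __name__ == "__main__":
    for name, score in screen(LIBRARY, QUERY):
        print(f"{name}: {score:.2f}")
```

The threshold trades recall against the number of compounds passed on to more expensive steps; the value 0.5 here is arbitrary.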


Quantitative structure-activity relationship (QSAR)

This is the calculation of quantitative structure–activity relationship and quantitative structure–property relationship values, used to predict the activity of compounds from their structures. In this context there is also a strong relationship to chemometrics. Chemical expert systems are also relevant, since they represent parts of chemical knowledge as an ''in silico'' representation. A relatively new concept is matched molecular pair analysis (MMPA), or prediction-driven MMPA, which is coupled with a QSAR model in order to identify activity cliffs.
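In its simplest form, a QSAR model is a regression from a numerical descriptor of the structure to a measured activity. The sketch below fits a one-descriptor linear model by ordinary least squares. The descriptor and activity values are made-up toy numbers chosen to lie exactly on a line, and real QSAR work typically uses many descriptors, validation sets and more flexible models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict activity for a new descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Toy data (not measurements): one hypothetical descriptor per compound
# and a hypothetical activity value for each.
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
ACTIVITY = [3.0, 5.0, 7.0, 9.0]

if __name__ == "__main__":
    model = fit_line(DESCRIPTOR, ACTIVITY)
    print("slope, intercept:", model)
    print("predicted activity at 5.0:", predict(model, 5.0))
```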


See also

* Bioinformatics
* Chemical file format
* Chemicalize.org
* Cheminformatics toolkits
* Chemogenomics
* Computational chemistry
* Information engineering
* Journal of Chemical Information and Modeling
* Journal of Cheminformatics
* Materials informatics
* Molecular design software
* Molecular graphics
* Molecular Informatics
* Molecular modelling
* Nanoinformatics
* Software for molecular modeling
* WorldWide Molecular Matrix
* Molecular descriptor

