The Chemistry Development Kit (CDK) is computer software, a library in the programming language Java, for chemoinformatics and bioinformatics. It is available for Windows, Linux, Unix, and macOS. It is free and open-source software distributed under the GNU Lesser General Public License (LGPL) 2.0.
History
The CDK was created on 27–29 September 2000 at the University of Notre Dame by Christoph Steinbeck, Egon Willighagen and Dan Gezelter, then developers of Jmol and JChemPaint, to provide a common code base. The first source code release was made on 11 May 2001. Since then, more than 100 people have contributed to the project, leading to a rich set of functions, as given below. Between 2004 and 2007, CDK News was the project's newsletter; all of its articles are available from a public archive. Due to an unsteady rate of contributions, the newsletter was put on hold.
Later, unit testing, code quality checking, and Javadoc validation were introduced. Rajarshi Guha developed a nightly build system, named Nightly, which is still operating at Uppsala University. In 2012, the project became a supporter of the InChI Trust, to encourage continued development. The library uses JNI-InChI to generate International Chemical Identifiers (InChIs).
In April 2013, John Mayfield (né May) joined the ranks of release managers of the CDK, to handle the development branch.
Library
The CDK is a library rather than an end-user program. However, it has been integrated into various environments to make its functions available. CDK is currently used in several applications, including the programming language R, CDK-Taverna (a Taverna workbench plugin), Bioclipse, PaDEL, and Cinfony. CDK extensions also exist for the Konstanz Information Miner (KNIME) and for Excel; the Excel extension is called LICSS.
In 2008, pieces of GPL-licensed code were removed from the library. Although that code was independent of the main CDK library and no copyleft obligations were involved, the ChemoJava project was created to reduce confusion among users.
Major features
Chemoinformatics
* 2D molecule editor and generator
* 3D geometry generation
* ring finding
* substructure search using exact structures and a SMILES arbitrary target specification (SMARTS)-like query language
* QSAR descriptor calculation
* fingerprint calculation, including the ECFP and FCFP fingerprints
* force field calculations
* many input-output chemical file formats, including simplified molecular-input line-entry system (SMILES), Chemical Markup Language (CML), and chemical table file (MDL)
* structure generators
* International Chemical Identifier support, via JNI-InChI
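As an illustration of the ring finding item in the list above, the number of independent rings in a molecular graph (the size of a smallest set of smallest rings) equals the number of bonds minus the number of atoms plus the number of connected components. The following is a minimal sketch of that count using only the Java standard library. It does not use or reproduce the CDK API, and the names `RingCountSketch`, `find`, and `ringCount` are hypothetical, chosen only for this example.

```java
/** Illustrative sketch only; not CDK code. All names here are hypothetical. */
final class RingCountSketch {

    /** Returns the representative of x's component (union-find with path halving). */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // shorten the path as we walk up
            x = parent[x];
        }
        return x;
    }

    /**
     * Counts independent rings as: bonds - atoms + connected components.
     *
     * @param atomCount number of atoms, indexed 0..atomCount-1
     * @param bonds     each entry is a pair {atomIndexA, atomIndexB}
     */
    static int ringCount(int atomCount, int[][] bonds) {
        int[] parent = new int[atomCount];
        for (int i = 0; i < atomCount; i++) {
            parent[i] = i; // every atom starts as its own component
        }
        int components = atomCount;
        for (int[] bond : bonds) {
            int a = find(parent, bond[0]);
            int b = find(parent, bond[1]);
            if (a != b) {
                parent[a] = b; // the bond joins two components
                components--;
            }
        }
        return bonds.length - atomCount + components;
    }

    public static void main(String[] args) {
        // Heavy-atom graphs written out by hand for this example.
        int[][] benzene = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0}};
        int[][] naphthalene = {
            {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0},
            {5, 6}, {6, 7}, {7, 8}, {8, 9}, {9, 0}
        };
        System.out.println(ringCount(6, benzene));      // prints 1
        System.out.println(ringCount(10, naphthalene)); // prints 2
    }
}
```

This count gives only the number of rings, not their member atoms; enumerating the rings themselves needs a dedicated ring perception algorithm.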
Bioinformatics
* protein active site detection
* cognate ligand detection
* metabolite identification
* pathway databases
* 2D and 3D protein descriptors
General
* Python wrapper; see Cinfony
* Ruby wrapper
* active user community
See also
* Bioclipse – an Eclipse RCP-based chemo-bioinformatics workbench
* Blue Obelisk
* JChemPaint – Java 2D molecule editor, applet and application
* Jmol – Java 3D renderer, applet and application
* JOELib – Java version of Open Babel, OELib
* List of free and open-source software packages
* List of software for molecular mechanics modeling
External links
* CDK Wiki – the community wiki
* Planet CDK – a blog planet
* CDK Google+ page
* OpenScience.org