File formats
There are several file formats in the family. The formats were created by MDL Information Systems (MDL), which was acquired by
Molfile
An MDL Molfile is a file format for holding information about the atoms, bonds, connectivity and coordinates of a molecule. The molfile consists of some header information, the Connection Table (CT) containing atom info, then bond connections and types, followed by sections for more complex information. The molfile is sufficiently common that most, if not all,
Counts line block specification
Bond block specification
The Bond Block is made up of bond lines, one line per bond, with the following format:
111 222 ttt sss xxx rrr ccc
where the values are described in the following table:
Extended Connection Table (V3000)
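To make the bond line layout from the bond block specification (111 222 ttt sss xxx rrr ccc) concrete, here is a minimal Python sketch that splits one V2000 bond line into its seven values. It assumes the fields occupy seven consecutive three-character columns with no separator, and that blank or missing fields count as 0. The function `parse_bond_line` and the descriptive field names are hypothetical, chosen for illustration because the value table is not reproduced here.

```python
from typing import NamedTuple


class BondLine(NamedTuple):
    # Field names are illustrative assumptions mapped onto the
    # placeholders 111 222 ttt sss xxx rrr ccc from the format line.
    first_atom: int       # 111
    second_atom: int      # 222
    bond_type: int        # ttt
    stereo: int           # sss
    unused: int           # xxx
    topology: int         # rrr
    reacting_center: int  # ccc


def parse_bond_line(line: str) -> BondLine:
    """Split a bond line into seven integers.

    Assumes seven consecutive 3-character columns; short lines are
    padded with spaces and empty fields are read as 0.
    """
    padded = line.rstrip("\r\n").ljust(21)
    values = []
    for i in range(7):
        field = padded[3 * i:3 * i + 3].strip()
        values.append(int(field) if field else 0)
    return BondLine(*values)


if __name__ == "__main__":
    print(parse_bond_line("  1  2  1  0  0  0  0"))
```

Slicing by column rather than splitting on whitespace is a deliberate choice: under the fixed-width assumption, two three-digit atom numbers sit next to each other with no space between them.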
The extended (V3000) molfile consists of a regular molfile “no structure” followed by a single molfile appendix that contains the body of the connection table (Ctab). The following figure shows both an alanine structure and the extended molfile corresponding to it. Note that the “no structure” is flagged with the “V3000” instead of the “V2000” version stamp. There are two other changes to the header in addition to the version:
* The number of appendix lines is always written as 999, regardless of how many there actually are. (All current readers will disregard the count and stop at M END.)
* The “dimensional code” is maintained more explicitly. Thus “3D” really means 3D, although “2D” will be interpreted as 3D if any non-zero Z-coordinates are found.
Unlike the V2000 molfile, the V3000 extended Rgroup molfile has the same header format as a non-Rgroup molfile.
Counts line
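The header behaviour described for the extended molfile (a version stamp of “V2000” or “V3000”, and readers that ignore the 999 appendix count and stop at M END) can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the version stamp is the last whitespace-separated token on the fourth line of the file, and the helper names `molfile_version` and `body_lines` are hypothetical.

```python
from typing import Iterator, List, Optional

KNOWN_STAMPS = ("V2000", "V3000")


def molfile_version(lines: List[str]) -> Optional[str]:
    """Return the version stamp, or None if none is found.

    Assumption: the stamp is the last token on the fourth line.
    """
    if len(lines) < 4:
        return None
    tokens = lines[3].split()
    if tokens and tokens[-1] in KNOWN_STAMPS:
        return tokens[-1]
    return None


def body_lines(lines: List[str]) -> Iterator[str]:
    """Yield the lines after the fourth line, stopping at 'M END'.

    The written appendix count (always 999 in V3000) is ignored;
    the terminator is matched after collapsing whitespace.
    """
    for line in lines[4:]:
        if line.split() == ["M", "END"]:
            break
        yield line
```

Matching the terminator on tokens rather than on an exact string keeps the sketch tolerant of differences in spacing between “M” and “END”.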
A counts line is required, and must be first. It specifies the number of atoms, bonds, 3D objects, and Sgroups. It also specifies whether or not the CHIRAL flag is set. Optionally, the counts line can specify molregno. This is only used when the regno exceeds 999999 (the limit of the format in the molfile header line). The format of the counts line is:
SDF
SDF is one of a family of chemical-data file formats developed by MDL; it is intended especially for structural information. "SDF" stands for structure-data file, and SDF files actually wrap the molfile (MDL Molfile) format. Multiple records are delimited by lines consisting of four dollar signs ($$$$). A feature of the SDF format is its ability to include associated data. Associated data items are denoted as follows:
Other formats of the family
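Because SDF records are delimited by lines consisting of four dollar signs, a file can be cut into records without interpreting the molfile content at all. The Python sketch below illustrates this. It is an assumption-laden example rather than a reference reader: the function name `split_sdf_records` is hypothetical, surrounding whitespace on the delimiter line is tolerated, and a trailing record with no final delimiter is kept if it has any content.

```python
from typing import List


def split_sdf_records(text: str) -> List[str]:
    """Split SDF text into records on lines that contain only '$$$$'."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # Delimiter line: close the current record.
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a final record that was not followed by a delimiter.
    if any(part.strip() for part in current):
        records.append("\n".join(current))
    return records
```

Each returned string holds one molfile together with any associated data that followed it, so the pieces can be handed to a molfile parser one at a time.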
There are other, less commonly used formats of the family:
* RXNFile - for representing a single chemical reaction;
* RDFile - for representing a list of records with associated data. Each record can contain chemical structures, reactions, textual and tabular data;
* RGFile - for representing Markush structures (deprecated; Molfile V3000 can represent Markush structures);
* XDFile - for representing chemical information in XML format.
See also
* Chemical file format#Converting Between Formats
References
External links