In biology, a protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, providing the biological community access to the experimental data in a useful way. Protein structure databases often include three-dimensional coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries, which represent either a protein or a specific structure determination of a protein, also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology, such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset used by some methods to provide insights about the function of a protein.
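
As an illustration of the kind of data described above, the following minimal Python sketch reads unit cell parameters, atomic coordinates and a residue sequence from a few text records. It assumes the fixed-column layout of the legacy PDB flat-file format (CRYST1 and ATOM records), which the text above does not itself specify. The sample records are made up for illustration, and the function names and the partial residue table are hypothetical.

```python
from typing import List, NamedTuple, Tuple

# Made-up records laid out in the fixed-column legacy PDB format (an assumption;
# they are illustrative and do not reproduce any real database entry).
SAMPLE = """\
CRYST1   40.960   18.650   22.520  90.00  90.77  90.00 P 1 21 1      2
ATOM      1  N   ALA A   1      17.047  14.099   3.625  1.00 13.79           N
ATOM      2  CA  ALA A   1      16.967  12.784   4.338  1.00 10.80           C
ATOM      3  CA  GLY A   2      13.856  11.469   6.066  1.00  8.31           C
ATOM      4  CA  SER A   3      13.660  10.707   9.787  1.00  5.39           C
"""


class Atom(NamedTuple):
    name: str                        # atom name, e.g. "CA"
    res_name: str                    # three-letter residue name
    chain: str                       # chain identifier
    res_seq: int                     # residue sequence number
    xyz: Tuple[float, float, float]  # orthogonal coordinates in angstroms


def parse_unit_cell(line: str) -> Tuple[float, ...]:
    """Return (a, b, c, alpha, beta, gamma) from a CRYST1 record."""
    edges = [float(line[i:i + 9]) for i in (6, 15, 24)]     # columns 7-33
    angles = [float(line[i:i + 7]) for i in (33, 40, 47)]   # columns 34-54
    return tuple(edges + angles)


def parse_atom(line: str) -> Atom:
    """Extract identity and coordinates from an ATOM record."""
    return Atom(
        name=line[12:16].strip(),
        res_name=line[17:20],
        chain=line[21],
        res_seq=int(line[22:26]),
        xyz=(float(line[30:38]), float(line[38:46]), float(line[46:54])),
    )


# Partial three-letter to one-letter table; unknown residues map to "X".
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "SER": "S", "THR": "T"}


def sequence_from_atoms(atoms: List[Atom]) -> str:
    """Derive a sequence by taking one alpha carbon (CA) per residue."""
    return "".join(THREE_TO_ONE.get(a.res_name, "X")
                   for a in atoms if a.name == "CA")


lines = SAMPLE.splitlines()
unit_cell = next(parse_unit_cell(l) for l in lines if l.startswith("CRYST1"))
atoms = [parse_atom(l) for l in lines if l.startswith("ATOM")]
sequence = sequence_from_atoms(atoms)
```

The sketch also shows why a structure entry usually carries sequence information: the residue names attached to each coordinate record are enough to recover the sequence of the modeled residues, whereas a sequence record alone gives no coordinates.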


The Protein Data Bank

The Protein Data Bank (PDB) was established in 1971 as the central archive of all experimentally determined protein structure data. Today the PDB is maintained by an international consortium collectively known as the Worldwide Protein Data Bank (wwPDB). The mission of the wwPDB is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community.


List of other protein structure databases

Because the PDB releases data into the public domain, the data has been used in various other protein structure databases. Examples of protein structure databases include (in alphabetical order):
Database of Macromolecular Movements: describes the motions that occur in proteins and other macromolecules, particularly using movies.
Dynameomics: a data warehouse of molecular dynamics simulations and analyses of proteins representing all known protein fold families.
JenaLib: the Jena Library of Biological Macromolecules, aimed at a better dissemination of information on three-dimensional biopolymer structures, with an emphasis on visualization and analysis.
ModBase: a database of three-dimensional protein models calculated by comparative modeling.
OCA: a browser-database for protein structure and function. OCA integrates information from KEGG, OMIM, PDBselect, Pfam, PubMed, SCOP, SwissProt, and others.
OPM: provides spatial positions of protein three-dimensional structures with respect to the lipid bilayer.
PDB Lite: derived from OCA, PDB Lite was provided to make it as easy as possible to find and view a macromolecule within the PDB.
PDBsum: provides an overview of macromolecular structures in the PDB, giving schematic diagrams of the molecules in each structure and of the interactions between them.
PDBTM: the Protein Data Bank of Transmembrane Proteins, a selection of transmembrane proteins from the PDB.
PDBWiki: a community-annotated knowledge base of biological molecular structure.
ProtCID: the Protein Common Interface Database, a database of similar protein–protein interfaces in crystal structures of homologous proteins.
Protein: the NIH protein database, a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and Third Party Annotation, as well as records from SwissProt, PIR, PRF, and PDB.
Proteopedia: the collaborative, 3D encyclopedia of proteins and other molecules. A wiki that contains a page for every entry in the PDB (>100,000 pages), with a Jmol view that highlights functional sites and ligands. It offers a scene-authoring tool for creating customized molecular scenes without learning the Jmol script language. Custom scenes can be attached to "green links" in descriptive text that display those scenes in Jmol.
ProteinLounge: a protein database that includes visuals of protein structure, along with protein pathways, gene sequences and other tools.
SCOP: the Structural Classification of Proteins, a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.
SWISS-MODEL Repository: a database of annotated protein models calculated by homology modeling.
TOPSAN: the Open Protein Structure Annotation Network, a wiki designed to collect, share and distribute information about protein three-dimensional structures.

