
dcGO is a comprehensive ontology database for protein domains. As an ontology resource, dcGO integrates Open Biomedical Ontologies from a variety of contexts, ranging from functional information such as Gene Ontology to information on enzymes and pathways, and from phenotype information across major model organisms to information about human diseases and drugs. As a protein domain resource, dcGO includes annotations to both individual domains and supra-domains (i.e., combinations of two or more successive domains).


Concepts

There are two key concepts behind dcGO. The first is to label protein domains with ontology terms, for example terms from Gene Ontology; this is why the resource is called dcGO, for domain-centric Gene Ontology. The second is to use the ontology-labeled protein domains in applications such as protein function prediction. Put simply, the first concept concerns how the dcGO resource is created, and the second concerns how it is used.
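The second concept can be illustrated with a minimal sketch, assuming a lookup table that maps domains and supra-domains to ontology terms. The table contents, the domain names, the term identifiers, and the function names `supra_domains` and `predict_terms` are all hypothetical and chosen for illustration; this is not the dcGO Predictor itself, only the general idea of transferring terms from annotated domains to a protein.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical annotation table. Keys are single domains or supra-domains
# (tuples of successive domains); values are ontology term identifiers.
# All names and identifiers here are invented for illustration.
ANNOTATIONS: Dict[Tuple[str, ...], Set[str]] = {
    ("domA",): {"TERM:1"},
    ("domB",): {"TERM:2"},
    ("domA", "domB"): {"TERM:3"},
}


def supra_domains(architecture: List[str]) -> List[Tuple[str, ...]]:
    """Return every run of two or more successive domains in the architecture."""
    n = len(architecture)
    return [
        tuple(architecture[i:j])
        for i in range(n)
        for j in range(i + 2, n + 1)
    ]


def predict_terms(architecture: List[str]) -> Set[str]:
    """Collect the terms annotated to the protein's domains and supra-domains."""
    units = [(d,) for d in architecture] + supra_domains(architecture)
    terms: Set[str] = set()
    for unit in units:
        # Units without an entry in the table contribute no terms.
        terms |= ANNOTATIONS.get(unit, set())
    return terms
```

In this sketch the order of domains matters: a protein with architecture `["domA", "domB"]` picks up the supra-domain term, whereas `["domB", "domA"]` receives only the two single-domain terms.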


Timelines

* In 2010, the algorithm behind dcGO was first published as an improvement to the SUPERFAMILY database.
* In 2011, the 'dcGO Predictor' was ranked 10th in the 2011 CAFA competition when applied to Gene Ontology. This predictor is a purely domain-based method that uses no machine learning.
* In 2012, the database was officially released and published in the NAR database issue.
* In 2013, the webserver was improved to support many analyses using the dcGO resource.
* In early 2014, the 'dcGO Predictor' was submitted for both function and phenotype predictions, ranking 4th in CAFA phenotype prediction.
* In late 2014, an open-source R package, dcGOR, was developed to help analyse ontologies and protein domain annotations.


Webserver

Recent uses of dcGO include building a domain network from a functional perspective for cross-ontology comparisons, and combining it with the species tree of life (sTOL) to provide a phylogenetic context for function and phenotype.


Software

The open-source software dcGOR is developed in the R programming language to analyse domain-centric ontologies and annotations. Supported analyses include:
* easy access to a wide range of ontologies and their domain-centric annotations;
* building customised ontologies and annotations;
* domain-based enrichment analysis and visualisation;
* construction of a domain (semantic similarity) network according to ontology annotations;
* significance analysis for estimating a contact (statistical significance) network using a random walk algorithm;
* high-performance parallel computing.

Functionalities under active development are:
* algorithms and implementations for creating domain-centric ontology annotations;
* ontology term prediction for input protein domain architectures;
* reconstruction of ancestral discrete characters using maximum likelihood/parsimony.
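Domain-based enrichment analysis, listed among the supported analyses, can be sketched as follows. The text does not say which statistical test dcGOR applies, so this example assumes a one-sided hypergeometric test, a common choice for enrichment. It is written in Python rather than R and does not use the dcGOR API; the function names `hypergeom_pvalue` and `enrich` and all inputs are hypothetical.

```python
from math import comb
from typing import Dict, Iterable, Set


def hypergeom_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when `sample` domains are drawn without replacement from
    `population` domains, of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def enrich(
    input_domains: Iterable[str],
    background: Set[str],
    term_to_domains: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Return a p-value for each term with at least one hit among the input
    domains. Assumption: domains outside the background are ignored."""
    input_set = set(input_domains) & background
    results: Dict[str, float] = {}
    for term, domains in term_to_domains.items():
        annotated = domains & background
        hits = len(input_set & annotated)
        if hits == 0:
            continue  # terms with no overlap are not reported
        results[term] = hypergeom_pvalue(
            len(background), len(annotated), len(input_set), hits
        )
    return results
```

A real analysis would normally also correct the resulting p-values for multiple testing, which this sketch omits.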


See also

* SCOP
* Pfam
* InterPro
* Structural domain
* Gene Ontology




External links


* SUPERFAMILY
* SCOP