HOME

TheInfoList



OR:

The Gene Ontology (GO) is a major bioinformatics initiative to unify the representation of
gene In biology, the word gene (from , ; "... Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a b ...
and
gene product A gene product is the biochemical material, either RNA or protein, resulting from expression of a gene. A measurement of the amount of gene product is sometimes used to infer how active a gene is. Abnormal amounts of gene product can be correlate ...
attributes across all
species In biology, a species is the basic unit of classification and a taxonomic rank of an organism, as well as a unit of biodiversity. A species is often defined as the largest group of organisms in which any two individuals of the appropriate s ...
. More specifically, the project aims to: 1) maintain and develop its
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Controllin ...
of gene and gene product attributes; 2)
annotate An annotation is extra information associated with a particular point in a document or other piece of information. It can be a note that includes a comment or explanation. Annotations are sometimes presented in the margin of book pages. For anno ...
genes and gene products, and assimilate and disseminate annotation data; and 3) provide tools for easy access to all aspects of the data provided by the project, and to enable functional interpretation of experimental data using the GO, for example via enrichment analysis. GO is part of a larger classification effort, the
Open Biomedical Ontologies The Open Biological and Biomedical Ontologies (OBO) Foundry is a group of people dedicated to build and maintain ontologies related to the life sciences. The OBO Foundry establishes a set of principles for ontology development for creating a s ...
, being one of the Initial Candidate Members of the
OBO Foundry The Open Biological and Biomedical Ontologies (OBO) Foundry is a group of people dedicated to build and maintain ontologies related to the life sciences. The OBO Foundry establishes a set of principles for ontology development for creating a s ...
. Whereas
gene nomenclature Gene nomenclature is the scientific naming of genes, the units of heredity in living organisms. It is also closely associated with protein nomenclature, as genes and the proteins they code for usually have similar nomenclature. An international co ...
focuses on gene and gene products, the Gene Ontology focuses on the function of the genes and gene products. The GO also extends the effort by using markup language to make the data (not only of the genes and their products but also of curated attributes) machine readable, and to do so in a way that is unified across all species (whereas gene nomenclature conventions vary by biological
taxon In biology, a taxon ( back-formation from '' taxonomy''; plural taxa) is a group of one or more populations of an organism or organisms seen by taxonomists to form a unit. Although neither is required, a taxon is usually known by a particular n ...
).


Terms and ontology

From a practical view, an ontology is a representation of something we know about. "Ontologies" consist of representations of things that are detectable or directly observable, and the relationships between those things. There is no universal standard terminology in biology and related domains, and term usages may be specific to a species, research area or even a particular research group. This makes communication and sharing of data more difficult. The Gene Ontology project provides an
ontology In metaphysics, ontology is the philosophical study of being, as well as related concepts such as existence, becoming, and reality. Ontology addresses questions like how entities are grouped into categories and which of these entities exi ...
of defined terms representing
gene product A gene product is the biochemical material, either RNA or protein, resulting from expression of a gene. A measurement of the amount of gene product is sometimes used to infer how active a gene is. Abnormal amounts of gene product can be correlate ...
properties. The ontology covers three domains: *
cellular component Cellular components are the complex biomolecules and structures of which cells, and thus living organisms, are composed. Cells are the structural and functional units of life. The smallest organisms are single cells, while the largest organism ...
, the parts of a
cell Cell most often refers to: * Cell (biology), the functional basic unit of life Cell may also refer to: Locations * Monastic cell, a small room, hut, or cave in which a religious recluse lives, alternatively the small precursor of a monastery ...
or its
extracellular This glossary of biology terms is a list of definitions of fundamental terms and concepts used in biology, the study of life and of living organisms. It is intended as introductory material for novices; for more specific and technical definitions ...
environment; * molecular function, the elemental activities of a gene product at the molecular level, such as binding or
catalysis Catalysis () is the process of increasing the rate of a chemical reaction by adding a substance known as a catalyst (). Catalysts are not consumed in the reaction and remain unchanged after it. If the reaction is rapid and the catalyst recyc ...
; * biological process, operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues,
organs In biology, an organ is a collection of tissues joined in a structural unit to serve a common function. In the hierarchy of life, an organ lies between tissue and an organ system. Tissues are formed from same type cells to act together in a f ...
, and
organism In biology, an organism () is any living system that functions as an individual entity. All organisms are composed of cells (cell theory). Organisms are classified by taxonomy into groups such as multicellular animals, plants, and ...
s. Each GO term within the ontology has a term name, which may be a word or string of words; a unique alphanumeric identifier; a definition with cited sources; and an ontology indicating the domain to which it belongs. Terms may also have synonyms, which are classed as being exactly equivalent to the term name, broader, narrower, or related; references to equivalent concepts in other databases; and comments on term meaning or usage. The GO ontology is structured as a
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one v ...
, and each term has defined relationships to one or more other terms in the same domain, and sometimes to other domains. The GO vocabulary is designed to be species-neutral, and includes terms applicable to
prokaryote A prokaryote () is a single-celled organism that lacks a nucleus and other membrane-bound organelles. The word ''prokaryote'' comes from the Greek πρό (, 'before') and κάρυον (, 'nut' or 'kernel').Campbell, N. "Biology:Concepts & Conne ...
s and eukaryotes,
single Single may refer to: Arts, entertainment, and media * Single (music), a song release Songs * "Single" (Natasha Bedingfield song), 2004 * "Single" (New Kids on the Block and Ne-Yo song), 2008 * "Single" (William Wei song), 2016 * "Single", by ...
and multicellular organisms. GO is not static, and additions, corrections and alterations are suggested by, and solicited from, members of the research and annotation communities, as well as by those directly involved in the GO project. For example, an annotator may request a specific term to represent a metabolic pathway, or a section of the ontology may be revised with the help of community experts (e.g.). Suggested edits are reviewed by the ontology editors, and implemented where appropriate. The GO ontology and annotation files are freely available from the GO website in a number of formats, or can be accessed online using the GO browser
AmiGO Amigo(s) (Portuguese and Spanish for ''male friend'') may refer to: People * Carlos Amigo Vallejo (born 1934), Spanish Roman Catholic archbishop emeritus of Seville Places Facilities * Amigos School, a bilingual primary school in Cambridge, Ma ...
. The Gene Ontology project also provides downloadable mappings of its terms to other classification systems.


Example term

:id: GO:0000016 :name: lactase activity :ontology: molecular_function :def: "Catalysis of the reaction: lactose + H2O=D-glucose + D-galactose." C:3.2.1.108:synonym: "lactase-phlorizin hydrolase activity" BROAD C:3.2.1.108:synonym: "lactose galactohydrolase activity" EXACT C:3.2.1.108:xref: EC:3.2.1.108 :xref: MetaCyc:LACTASE-RXN :xref: Reactome:20536 :is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds Data source:


Annotation

Genome annotation DNA annotation or genome annotation is the process of identifying the locations of genes and all of the coding regions in a genome and determining what those genes do. An annotation (irrespective of the context) is a note added by way of explanati ...
encompasses the practice of capturing data about a gene product, and GO annotations use terms from the GO to do so. Annotations from GO curators are integrated and disseminated on the GO website, where they can be downloaded directly or viewed online using AmiGO. In addition to the gene product identifier and the relevant GO term, GO annotations have at least the following data: The ''reference'' used to make the annotation (e.g. a journal article); An ''evidence code'' denoting the type of evidence upon which the annotation is based; The date and the creator of the annotation Supporting information, depending on GO term and evidence used and supplementary information, such as the conditions the function is observed under, may also be included in a GO annotation. The evidence code comes from a
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Controllin ...
of codes, the Evidence Code Ontology, covering both manual and automated annotation methods. For example, ''Traceable Author Statement'' (TAS) means a curator has read a published scientific paper and the metadata for that annotation bears a citation to that paper; ''Inferred from Sequence Similarity'' (ISS) means a human curator has reviewed the output from a sequence similarity search and verified that it is biologically meaningful. Annotations from automated processes (for example, remapping annotations created using another annotation vocabulary) are given the code ''Inferred from Electronic Annotation'' (IEA). In 2010, over 98% of all GO annotations were inferred computationally, not by curators, but as of July 2, 2019, only about 30% of all GO annotations were inferred computationally. As these annotations are not checked by a human, the GO Consortium considers them to be marginally less reliable and they are commonly to higher level, less detailed terms. Full annotation data sets can be downloaded from the GO website. To support the development of annotation, the GO Consortium provides workshops and mentors new groups of curators and developers. Many
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
algorithms have been designed and implemented to predict Gene Ontology annotations.


Example annotation

:Gene product: Actin, alpha cardiac muscle 1
UniProtKB:P68032
:GO term:
heart contraction ; GO:0060047
(biological process) :Evidence code: Inferred from Mutant Phenotype (IMP) :Reference: :Assigned by: UniProtKB, June 6, 2008 Data source:


Tools

There are a large number of tools available both online and to download that use the data provided by the GO project. The vast majority of these come from third parties; the GO Consortium develops and supports two tools, AmiGO and OBO-Edit. AmiGOAmiGO
-the current official web-based set of tools for searching and browsing the Gene Ontology database
is a web-based application that allows users to query, browse and visualize ontologies and gene product annotation data. It also has a
BLAST Blast or The Blast may refer to: *Explosion, a rapid increase in volume and release of energy in an extreme manner *Detonation, an exothermic front accelerating through a medium that eventually drives a shock front Film * ''Blast'' (1997 film), ...
tool, tools allowing analysis of larger data sets, and an interface to query the GO database directly. AmiGO can be used online at the GO website to access the data provided by the GO Consortium, or can be downloaded and installed for local use on any database employing the GO database schema (e.g.). It is free
open source software Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose. Open ...
and is available as part of the go-dev software distribution. OBO-Edit is an open source, platform-independent ontology editor developed and maintained by the Gene Ontology Consortium. It is implemented in
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
, and uses a graph-oriented approach to display and edit ontologies. OBO-Edit includes a comprehensive search and filter interface, with the option to render subsets of terms to make them visually distinct; the user interface can also be customized according to user preferences. OBO-Edit also has a reasoner that can infer links that have not been explicitly stated, based on existing relationships and their properties. Although it was developed for biomedical ontologies, OBO-Edit can be used to view, search and edit any ontology. It is freely available to download.


Consortium

The Gene Ontology Consortium is the set of
biological database Biological databases are libraries of biological sciences, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis. They contain information from research areas including geno ...
s and research groups actively involved in the gene ontology project. This includes a number of model organism databases and multi-species protein databases, software development groups, and a dedicated editorial office.


History

The Gene Ontology was originally constructed in 1998 by a consortium of researchers studying the
genomes In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding g ...
of three model organisms: ''
Drosophila melanogaster ''Drosophila melanogaster'' is a species of fly (the taxonomic order Diptera) in the family Drosophilidae. The species is often referred to as the fruit fly or lesser fruit fly, or less commonly the " vinegar fly" or "pomace fly". Starting with ...
'' (fruit fly), ''
Mus musculus Mus or MUS may refer to: Abbreviations * MUS, the NATO country code for Mauritius * MUS, the IATA airport code for Minami Torishima Airport * MUS, abbreviation for the Centre for Modern Urban Studies on Campus The Hague, Leiden University, Net ...
'' (mouse), and ''
Saccharomyces cerevisiae ''Saccharomyces cerevisiae'' () (brewer's yeast or baker's yeast) is a species of yeast (single-celled fungus microorganisms). The species has been instrumental in winemaking, baking, and brewing since ancient times. It is believed to have b ...
'' (brewer's or baker's yeast). Many other
Model Organism Databases Model organism databases (MODs) are biological databases, or knowledgebases, dedicated to the provision of in-depth biological data for intensively studied model organisms. MODs allow researchers to easily find background information on large set ...
have joined the Gene Ontology Consortium, contributing not only annotation data, but also contributing to the development of the ontologies and tools to view and apply the data. Many major plant, animal and microorganism databases make a contribution towards this project. As of July 2019, the GO contains 44,945 terms; there are 6,408,283 annotations to 4,467 different biological organisms. There is a significant body of literature on the development and use of the GO, and it has become a standard tool in the bioinformatics arsenal. Their objectives have three aspects: building gene ontology, assigning ontology to gene/gene products and developing software and databases for the first two objects. Several analyses of the Gene Ontology using formal, domain-independent properties of classes (the metaproperties) are also starting to appear. For instance, an ontological analysis of biological ontologies see.


See also

*
Blast2GO Blast2GO, first published in 2005, is a bioinformatics software tool for the automatic, high-throughput functional annotation of novel sequence data ( genes proteins). It makes use of the BLAST algorithm to identify similar sequences to then tr ...
*
Comparative Toxicogenomics Database The Comparative Toxicogenomics Database (CTD) is a public website and research tool launched in November 2004 that curates scientific data describing relationships between chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, GO annotations ...
* DAVID bioinformatics * Interferome *
National Center for Biomedical Ontology The National Center for Biomedical Ontology (NCBO) is one of the National Centers for Biomedical Computing, and is funded by the NIH. Among the goals of the NCBO are to provide tools for discovery and access of biomedical ontologies, which are a t ...
*
Critical Assessment of Function Annotation The Critical Assessment of Functional Annotation (CAFA) is an experiment designed to provide a large-scale assessment of computational methods dedicated to predicting protein function. Different algorithms are evaluated by their ability to predict ...


References


External links


AmiGO
- the current official web-based set of tools for searching and browsing the Gene Ontology database
Gene Ontology Consortium
- official site
PlantRegMap - GO annotation for 165 plant species and GO enrichment Analysis
{{Bioinformatics Biological databases Ontology (information science)