KEGG Pathway

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the Institute for Chemical Research, Kyoto University, under the then ongoing Japanese Human Genome Program. Foreseeing the need for a computerized resource that can be used for biological interpretation of genome sequence data, he started developing the KEGG PATHWAY database. It is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism and various other functions of the cell and the organism. Each pathway map contains a network of molecular interactions and reactions and is designed to link genes in the genome to gene products (mostly proteins) in the pathway. This has enabled the analysis called KEGG pathway mapping, whereby the gene content in the genome is compared with the KEGG PATHWAY database to examine which pathways and associated functions are likely to be encoded in the genome.

According to the developers, KEGG is a "computer representation" of the biological system. It integrates building blocks and wiring diagrams of the system: more specifically, genetic building blocks of genes and proteins, chemical building blocks of small molecules and reactions, and wiring diagrams of molecular interaction and reaction networks. This concept is realized in the following databases of KEGG, which are categorized into systems, genomic, chemical, and health information.
* Systems information
** PATHWAY: pathway maps for cellular and organismal functions
** MODULE: modules or functional units of genes
** BRITE: hierarchical classifications of biological entities
* Genomic information
** GENOME: complete genomes
** GENES: genes and proteins in the complete genomes
** ORTHOLOGY: ortholog groups of genes in the complete genomes
* Chemical information
** COMPOUND, GLYCAN: chemical compounds and glycans
** REACTION, RPAIR, RCLASS: chemical reactions
** ENZYME: enzyme nomenclature
* Health information
** DISEASE: human diseases
** DRUG: approved drugs
** ENVIRON: crude drugs and health-related substances
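
The pathway mapping analysis described above, in which the gene content of a genome is compared with the pathway maps, can be illustrated with a minimal sketch. It assumes that each pathway map is reduced to a set of node identifiers and that genome genes carry the same kind of identifier. The function name `pathway_coverage` and all identifiers in the toy data are hypothetical placeholders, not real KEGG entries or part of any KEGG tool.

```python
from typing import Dict, Set


def pathway_coverage(genome_genes: Set[str],
                     pathway_nodes: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each pathway, return the fraction of its nodes present in the genome."""
    coverage = {}
    for pathway_id, nodes in pathway_nodes.items():
        if not nodes:  # skip empty maps to avoid dividing by zero
            continue
        coverage[pathway_id] = len(nodes & genome_genes) / len(nodes)
    return coverage


# Hypothetical toy data: identifiers are placeholders, not real KEGG entries.
PATHWAY_NODES = {
    "toy_pathway_A": {"G1", "G2", "G3", "G4"},
    "toy_pathway_B": {"G5", "G6"},
}
GENOME_GENES = {"G1", "G2", "G3", "G6", "G9"}

coverage = pathway_coverage(GENOME_GENES, PATHWAY_NODES)
for pathway_id, fraction in sorted(coverage.items()):
    print(f"{pathway_id}: {fraction:.0%} of nodes found")
```

A high fraction suggests that the pathway is likely to be encoded in the genome. A real analysis would also have to weigh alternative enzymes and optional steps, which this sketch ignores.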


Databases


Systems information

The KEGG PATHWAY database, the wiring diagram database, is the core of the KEGG resource. It is a collection of pathway maps integrating many entities including genes, proteins, RNAs, chemical compounds, glycans, and chemical reactions, as well as disease genes and drug targets, which are stored as individual entries in the other databases of KEGG. The pathway maps are classified into the following sections:
* Metabolism
* Genetic information processing (transcription, translation, replication and repair, etc.)
* Environmental information processing (membrane transport, signal transduction, etc.)
* Cellular processes (cell growth, cell death, cell membrane functions, etc.)
* Organismal systems (immune system, endocrine system, nervous system, etc.)
* Human diseases
* Drug development

The metabolism section contains aesthetically drawn global maps showing an overall picture of metabolism, in addition to regular metabolic pathway maps. The low-resolution global maps can be used, for example, to compare metabolic capacities of different organisms in genomics studies and different environmental samples in metagenomics studies.

In contrast, KEGG modules in the KEGG MODULE database are higher-resolution, localized wiring diagrams, representing tighter functional units within a pathway map, such as subpathways conserved among specific organism groups and molecular complexes. KEGG modules are defined as characteristic gene sets that can be linked to specific metabolic capacities and other phenotypic features, so that they can be used for automatic interpretation of genome and metagenome data.

Another database that supplements KEGG PATHWAY is the KEGG BRITE database. It is an ontology database containing hierarchical classifications of various entities including genes, proteins, organisms, diseases, drugs, and chemical compounds. While KEGG PATHWAY is limited to molecular interactions and reactions of these entities, KEGG BRITE incorporates many different types of relationships.
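
The idea of a module as a characteristic gene set that can be checked automatically against a genome can be sketched as follows. The representation is an assumption made for illustration: a module is modeled as an ordered list of steps, each step being a set of alternative gene identifiers of which one suffices. This is a simplification and not the definition format KEGG itself uses; the names `missing_steps` and `is_complete` and all identifiers are hypothetical.

```python
from typing import List, Set

# Assumed representation: one set of alternative gene identifiers per step.
Module = List[Set[str]]


def missing_steps(module: Module, genome_genes: Set[str]) -> List[int]:
    """Return the indices of steps for which no alternative is in the genome."""
    return [index for index, alternatives in enumerate(module)
            if not (alternatives & genome_genes)]


def is_complete(module: Module, genome_genes: Set[str]) -> bool:
    """A module counts as complete when every step has at least one gene."""
    return not missing_steps(module, genome_genes)


# Hypothetical toy module with three steps; step 1 has two alternatives.
TOY_MODULE: Module = [{"G1"}, {"G2", "G3"}, {"G4"}]

genome_a = {"G1", "G3", "G4"}  # covers every step
genome_b = {"G1", "G4"}        # lacks both alternatives for step 1

print(is_complete(TOY_MODULE, genome_a), missing_steps(TOY_MODULE, genome_a))
print(is_complete(TOY_MODULE, genome_b), missing_steps(TOY_MODULE, genome_b))
```

Reporting the missing steps rather than only a yes/no answer is a design choice that helps with metagenome data, where a module is often nearly but not fully covered.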


Genomic information

Several months after the KEGG project was initiated in 1995, the first report of a completely sequenced bacterial genome was published. Since then, all published complete genomes have been accumulated in KEGG for both eukaryotes and prokaryotes. The KEGG GENES database contains gene/protein-level information and the KEGG GENOME database contains organism-level information for these genomes. The KEGG GENES database consists of gene sets for the complete genomes, and the genes in each set are annotated by establishing correspondences to the wiring diagrams of KEGG pathway maps, KEGG modules, and BRITE hierarchies.

These correspondences are made using the concept of orthologs. The KEGG pathway maps are drawn based on experimental evidence in specific organisms, but they are designed to be applicable to other organisms as well, because different organisms, such as human and mouse, often share identical pathways consisting of functionally identical genes, called orthologous genes or orthologs. All the genes in the KEGG GENES database are grouped into such ortholog groups in the KEGG ORTHOLOGY (KO) database. Because the nodes (gene products) of KEGG pathway maps, as well as KEGG modules and BRITE hierarchies, are given KO identifiers, the correspondences are established once genes in the genome are annotated with KO identifiers by the genome annotation procedure in KEGG.
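
The role of KO identifiers as the link between organism-specific genes and shared pathway nodes can be shown with a small sketch. It assumes that annotation has already produced a gene-to-KO table for each organism. The function `genes_per_node`, the gene names, and the KO-like identifiers (`KX0001` and so on) are hypothetical placeholders, not real KEGG entries.

```python
from collections import defaultdict
from typing import Dict, List, Set


def genes_per_node(gene_to_ko: Dict[str, str],
                   node_kos: Set[str]) -> Dict[str, List[str]]:
    """Group an organism's genes under the pathway nodes whose KO they carry."""
    hits = defaultdict(list)
    for gene, ko in sorted(gene_to_ko.items()):
        if ko in node_kos:  # genes with a KO outside this map are ignored
            hits[ko].append(gene)
    return dict(hits)


# Hypothetical pathway map whose nodes are labeled with placeholder KOs.
NODE_KOS = {"KX0001", "KX0002", "KX0003"}

# Hypothetical annotation tables for two organisms.
HUMAN = {"human_geneA": "KX0001", "human_geneB": "KX0002", "human_geneC": "KX0009"}
MOUSE = {"mouse_geneA": "KX0001", "mouse_geneD": "KX0003"}

human_hits = genes_per_node(HUMAN, NODE_KOS)
mouse_hits = genes_per_node(MOUSE, NODE_KOS)
print(human_hits)
print(mouse_hits)
```

Because both organisms are projected onto the same KO-labeled nodes, a single reference map serves for every annotated genome; here the node `KX0001` is occupied in both organisms by orthologous genes.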


Chemical information

The KEGG metabolic pathway maps are drawn to represent the dual aspects of the metabolic network: the genomic network of how genome-encoded enzymes are connected to catalyze consecutive reactions and the chemical network of how chemical structures of substrates and products are transformed by these reactions. A set of enzyme genes in the genome will identify enzyme relation networks when superimposed on the KEGG pathway maps, which in turn characterize chemical structure transformation networks allowing interpretation of biosynthetic and biodegradation potentials of the organism. Alternatively, a set of metabolites identified in the metabolome will lead to the understanding of enzymatic pathways and enzyme genes involved.

The databases in the chemical information category, which are collectively called KEGG LIGAND, are organized by capturing knowledge of the chemical network. In the beginning of the KEGG project, KEGG LIGAND consisted of three databases: KEGG COMPOUND for chemical compounds, KEGG REACTION for chemical reactions, and KEGG ENZYME for reactions in the enzyme nomenclature. Currently, there are additional databases: KEGG GLYCAN for glycans and two auxiliary reaction databases called RPAIR (reactant pair alignments) and RCLASS (reaction class). KEGG COMPOUND has also been expanded to contain various compounds such as xenobiotics, in addition to metabolites.
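
The interplay of the genomic and chemical networks, in which the enzymes encoded in a genome determine which chemical transformations are possible, can be approximated by a simple reachability computation. The sketch below makes several assumptions: each reaction is catalyzed by a single enzyme, reactions run only from substrates to products, and a reaction fires only when all of its substrates are available. The `Reaction` type, the function `reachable_compounds`, and all compound and enzyme names are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Reaction(NamedTuple):
    enzyme: str                 # enzyme assumed to catalyze the reaction
    substrates: FrozenSet[str]  # compounds consumed
    products: FrozenSet[str]    # compounds produced


def reachable_compounds(reactions: List[Reaction],
                        enzymes: Set[str],
                        seeds: Set[str]) -> Set[str]:
    """Expand the seed compounds until no enabled reaction adds anything new."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for rxn in reactions:
            enabled = rxn.enzyme in enzymes and rxn.substrates <= available
            if enabled and not rxn.products <= available:
                available |= rxn.products
                changed = True
    return available


# Hypothetical toy network: A -> B -> C, A -> D, and C + D -> E.
REACTIONS = [
    Reaction("E1", frozenset({"A"}), frozenset({"B"})),
    Reaction("E2", frozenset({"B"}), frozenset({"C"})),
    Reaction("E3", frozenset({"C", "D"}), frozenset({"E"})),
    Reaction("E4", frozenset({"A"}), frozenset({"D"})),
]

without_e4 = reachable_compounds(REACTIONS, {"E1", "E2", "E3"}, {"A"})
with_e4 = reachable_compounds(REACTIONS, {"E1", "E2", "E3", "E4"}, {"A"})
print(sorted(without_e4))
print(sorted(with_e4))
```

In the toy network, a genome lacking enzyme `E4` cannot produce compound `D` and therefore cannot reach `E` either, which is the kind of conclusion about biosynthetic potential that the text describes.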


Health information

In KEGG, diseases are viewed as perturbed states of the biological system caused by perturbants of genetic factors and environmental factors, and drugs are viewed as different types of perturbants. The KEGG PATHWAY database includes not only the normal states but also the perturbed states of the biological systems. However, disease pathway maps cannot be drawn for most diseases because molecular mechanisms are not well understood. An alternative approach is taken in the KEGG DISEASE database, which simply catalogs known genetic factors and environmental factors of diseases. These catalogs may eventually lead to more complete wiring diagrams of diseases.

The KEGG DRUG database contains active ingredients of approved drugs in Japan, the US, and Europe. They are distinguished by chemical structures and/or chemical components and associated with target molecules, metabolizing enzymes, and other molecular interaction network information in the KEGG pathway maps and the BRITE hierarchies. This enables an integrated analysis of drug interactions with genomic information. Crude drugs and other health-related substances, which are outside the category of approved drugs, are stored in the KEGG ENVIRON database. The databases in the health information category are collectively called KEGG MEDICUS, which also includes package inserts of all marketed drugs in Japan.


Subscription model

In July 2011, KEGG introduced a subscription model for FTP download because of a significant cutback in government funding. KEGG continues to be freely available through its website, but the subscription model has prompted discussion about the sustainability of bioinformatics databases.


See also

* Comparative Toxicogenomics Database (CTD), which integrates KEGG pathways with toxicogenomic and disease data
* ConsensusPathDB, a molecular functional interaction database integrating information from KEGG
* Gene Ontology
* PubMed
* UniProt
* Gene Disease Database


External links


* KEGG website
* GenomeNet mirror site
* The entry for KEGG in MetaBase