HOME

TheInfoList



OR:

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the
National Institutes of Health The National Institutes of Health, commonly referred to as NIH (with each letter pronounced individually), is the primary agency of the United States government responsible for biomedical and public health research. It was founded in the late ...
(NIH). It is approved and funded by the government of the
United States The United States of America (U.S.A. or USA), commonly known as the United States (U.S. or US) or America, is a country primarily located in North America. It consists of 50 states, a federal district, five major unincorporated territorie ...
. The NCBI is located in
Bethesda, Maryland Bethesda () is an unincorporated, census-designated place in southern Montgomery County, Maryland. It is located just northwest of Washington, D.C. It takes its name from a local church, the Bethesda Meeting House (1820, rebuilt 1849), which in ...
, and was founded in 1988 through legislation sponsored by US Congressman
Claude Pepper Claude Denson Pepper (September 8, 1900 – May 30, 1989) was an American politician of the Democratic Party, and a spokesman for left-liberalism and the elderly. He represented Florida in the United States Senate from 1936 to 1951, and the Mia ...
. The NCBI houses a series of databases relevant to
biotechnology Biotechnology is the integration of natural sciences and engineering sciences in order to achieve the application of organisms, cells, parts thereof and molecular analogues for products and services. The term ''biotechnology'' was first used b ...
and biomedicine and is an important resource for bioinformatics tools and services. Major databases include GenBank for DNA sequences and
PubMed PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintain the ...
, a bibliographic database for biomedical literature. Other databases include the
NCBI Epigenomics The Epigenomics database at the National Center for Biotechnology Information was a database for whole-genome epigenetics data sets. It was retired on 1 June 2016. The Epigenomics database The Epigenomics database of the National Center for Biotech ...
database. All these databases are available online through the Entrez search engine. NCBI was directed by
David Lipman David J. Lipman is an American biologist who from 1989 to 2017 was the director of the National Center for Biotechnology Information (NCBI) at the National Institutes of Health. NCBI is the home of GenBank, the U.S. node of the International Sequ ...
, one of the original authors of the BLAST sequence alignment program and a widely respected figure in
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
.


GenBank

NCBI had responsibility for making available the GenBank DNA sequence database since 1992. GenBank coordinates with individual laboratories and other sequence databases, such as those of the
European Molecular Biology Laboratory The European Molecular Biology Laboratory (EMBL) is an intergovernmental organization dedicated to molecular biology research and is supported by 27 member states, two prospect states, and one associate member state. EMBL was created in 1974 and ...
(EMBL) and the
DNA Data Bank of Japan The DNA Data Bank of Japan (DDBJ) is a biological database that collects DNA sequences. It is located at the National Institute of Genetics (NIG) in the Shizuoka prefecture of Japan. It is also a member of the International Nucleotide Sequence Dat ...
(DDBJ). Since 1992, NCBI has grown to provide other databases in addition to GenBank. NCBI provides
Gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a ba ...
, Online Mendelian Inheritance in Man, the Molecular Modeling Database (3D protein structures), dbSNP (a database of
single-nucleotide polymorphism In genetics, a single-nucleotide polymorphism (SNP ; plural SNPs ) is a germline substitution of a single nucleotide at a specific position in the genome. Although certain definitions require the substitution to be present in a sufficiently lar ...
s), the Reference Sequence Collection, a map of the human genome, and a
taxonomy Taxonomy is the practice and science of categorization or classification. A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types. ...
browser, and coordinates with the National Cancer Institute to provide the Cancer Genome Anatomy Project. The NCBI assigns a unique identifier (taxonomy ID number) to each species of organism. The NCBI has software tools that are available through internet browsers or by
FTP The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network. FTP is built on a client–server model architecture using separate control and data ...
. For example, BLAST is a sequence similarity searching program. BLAST can do sequence comparisons against the GenBank DNA database in less than 15 seconds.


NCBI Bookshelf

The NCBI Bookshelf is a collection of freely accessible, downloadable, online versions of selected biomedical books. The Bookshelf covers a wide range of topics including
molecular biology Molecular biology is the branch of biology that seeks to understand the molecular basis of biological activity in and between cells, including biomolecular synthesis, modification, mechanisms, and interactions. The study of chemical and physi ...
,
biochemistry Biochemistry or biological chemistry is the study of chemical processes within and relating to living organisms. A sub-discipline of both chemistry and biology, biochemistry may be divided into three fields: structural biology, enzymology and ...
,
cell biology Cell biology (also cellular biology or cytology) is a branch of biology that studies the structure, function, and behavior of cells. All living organisms are made of cells. A cell is the basic unit of life that is responsible for the living and ...
,
genetics Genetics is the study of genes, genetic variation, and heredity in organisms.Hartl D, Jones E (2005) It is an important branch in biology because heredity is vital to organisms' evolution. Gregor Mendel, a Moravian Augustinian friar wor ...
,
microbiology Microbiology () is the scientific study of microorganisms, those being unicellular (single cell), multicellular (cell colony), or acellular (lacking cells). Microbiology encompasses numerous sub-disciplines including virology, bacteriology, prot ...
, disease states from a molecular and cellular point of view, research methods, and
virology Virology is the Scientific method, scientific study of biological viruses. It is a subfield of microbiology that focuses on their detection, structure, classification and evolution, their methods of infection and exploitation of host (biology), ...
. Some of the books are online versions of previously published books, while others, such as ''
Coffee Break A break at work (or work-break) is a period of time during a shift in which an employee is allowed to take time off from their job. It is a type of downtime. There are different types of breaks, and depending on the length and the employer' ...
'', are written and edited by NCBI staff. The Bookshelf is a complement to the Entrez PubMed repository of peer-reviewed publication abstracts in that Bookshelf contents provide established perspectives on evolving areas of study and a context in which many disparate individual pieces of reported research can be organized.


Basic Local Alignment Search Tool (BLAST)

BLAST is an algorithm used for calculating sequence similarity between biological sequences, such as nucleotide sequences of DNA and amino acid sequences of proteins. BLAST is a powerful tool for finding sequences similar to the query sequence within the same organism or in different organisms. It searches the query sequence on NCBI databases and servers and posts the results back to the person's browser in the chosen format. Input sequences to the BLAST are mostly in FASTA or GenBank format while output could be delivered in a variety of formats such as HTML, XML formatting, and plain text. HTML is the default output format for NCBI's web-page. Results for NCBI-BLAST are presented in graphical format with all the hits found, a table with sequence identifiers for the hits having scoring related data, along with the alignments for the sequence of interest and the hits received with analogous BLAST scores for these.


Entrez

The Entrez Global Query Cross-Database Search System is used at NCBI for all the major databases such as Nucleotide and Protein Sequences, Protein Structures, PubMed, Taxonomy, Complete Genomes, OMIM, and several others. Entrez is both an indexing and retrieval system having data from various sources for biomedical research. NCBI distributed the first version of Entrez in 1991, composed of nucleotide sequences from PDB and GenBank, protein sequences from SWISS-PROT, translated GenBank, PIR, PRF, PDB, and associated abstracts and citations from PubMed. Entrez is specially designed to integrate the data from several different sources, databases, and formats into a uniform information model and retrieval system which can efficiently retrieve that relevant references, sequences and structures.


Gene

Gene has been implemented at NCBI to characterize and organize the information about genes. It serves as a major node in the nexus of the genomic map, expression, sequence, protein function, structure, and homology data. A unique GeneID is assigned to each gene record that can be followed through revision cycles.
Gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a ba ...
records for known or predicted genes are established here and are demarcated by map positions or nucleotide sequences. Gene has several advantages over its predecessor, LocusLink, including, better integration with other databases in NCBI, broader taxonomic scope, and enhanced options for query and retrieval provided by the Entrez system.


Protein

Protein database maintains the text record for individual protein sequences, derived from many different resources such as NCBI Reference Sequence (RefSeq) project, GenBank, PDB, and UniProtKB/SWISS-Prot. Protein records are present in different formats including FASTA and
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. T ...
and are linked to other NCBI resources. Protein provides the relevant data to the users such as genes, DNA/RNA sequences, biological pathways, expression and variation data, and literature. It also provides the pre-determined sets of similar and identical proteins for each sequence as computed by the BLAST. The Structure database of NCBI contains 3D coordinate sets for experimentally-determined structures in PDB that are imported by NCBI. The Conserved Domain database ( CDD) of protein contains sequence profiles that characterize highly conserved domains within protein sequences. It also has records from external resources like SMART and
Pfam Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The most recent version, Pfam 35.0, was released in November 2021 and contains 19,632 families. Uses ...
. There is another database of proteins known as Protein Clusters database, which contains sets of proteins sequences that are clustered according to the maximum alignments between the individual sequences as calculated by BLAST.


Pubchem database

PubChem PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of ...
database of NCBI is a public resource for molecules and their activities against biological assays. PubChem is searchable and accessible by Entrez information retrieval system.Wang Y. & Bryant S H. (2014). The NCBI Handbook, 2nd edition, NCBI PubChem BioAssay Database


See also

*
DNA Data Bank of Japan The DNA Data Bank of Japan (DDBJ) is a biological database that collects DNA sequences. It is located at the National Institute of Genetics (NIG) in the Shizuoka prefecture of Japan. It is also a member of the International Nucleotide Sequence Dat ...
(DDBJ) *
European Bioinformatics Institute The European Bioinformatics Institute (EMBL-EBI) is an Intergovernmental Organization (IGO) which, as part of the European Molecular Biology Laboratory (EMBL) family, focuses on research and services in bioinformatics. It is located on the Well ...
(EBI)


References


External links

*
National Library of Medicine

National Institutes of Health
{{Authority control Biotechnology organizations Medical research institutes in Maryland National Institutes of Health Online databases Online taxonomy databases