Bioinformatic Harvester

The Bioinformatic Harvester was a bioinformatic meta search engine created by the European Molecular Biology Laboratory (EMBL) and subsequently hosted and further developed by the Karlsruhe Institute of Technology (KIT) for genes and protein-associated information. Harvester worked with human, mouse, rat, zebrafish, drosophila and Arabidopsis thaliana based information. It cross-linked more than 50 popular bioinformatic resources and allowed cross searches. While active, Harvester served tens of thousands of pages every day to scientists and physicians. The service has been offline since 2014.


How Harvester works

Harvester collects information from protein and gene databases along with information from so-called "prediction servers". Prediction servers provide, for example, online sequence analysis for a single protein. Harvester's search index is based on the IPI and UniProt protein information collections. The index consists of ~72,000 human, ~57,000 mouse, ~41,000 rat, ~51,000 zebrafish and ~35,000 arabidopsis protein pages, which cross-link ~50 major bioinformatic resources.
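As a rough illustration of the aggregation step described above, the following Python sketch merges records from several sources into one page record per protein accession. The source names reuse databases mentioned in this article, but the record fields, the sample values, the accession "P00000" and the `build_pages` helper are hypothetical; nothing here is taken from Harvester's actual implementation.

```python
from collections import defaultdict

# Hypothetical stand-ins for harvested records. A real system would query
# the databases and prediction servers; fields and values are illustrative.
SOURCES = {
    "UniProt": [{"accession": "Q9NPD3", "description": "example protein entry"}],
    "PSORT": [{"accession": "Q9NPD3", "localisation": "example compartment"}],
    "SOSUI": [{"accession": "P00000", "transmembrane": "example prediction"}],
}


def build_pages(sources):
    """Group records from all sources by protein accession."""
    pages = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            accession = record["accession"]
            # Keep each source's fields under its own key so that
            # information from different resources stays distinguishable.
            fields = {k: v for k, v in record.items() if k != "accession"}
            pages[accession][source_name] = fields
    return dict(pages)


pages = build_pages(SOURCES)
print(sorted(pages))            # accessions that received a page
print(sorted(pages["Q9NPD3"]))  # sources contributing to one page
```

Keying every record by a shared accession is one simple way to cross-link resources; mapping between different identifier schemes, which a real meta search engine would also need, is left out of this sketch.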


Harvester crosslinks several types of information


Text based information

From the following databases:
* UniProt, one of the largest protein databases
* SOURCE, convenient gene information overview
* Simple Modular Architecture Research Tool (SMART)
* SOSUI, predicts transmembrane domains
* PSORT, predicts protein localisation
* HomoloGene, compares proteins from different species
* gfp-cdna, protein localisation with fluorescence microscopy
* International Protein Index (IPI)


Databases rich in graphical elements

These databases are not collected but are cross-linked and displayed via iframes. An iframe is a window within an HTML page for an embedded view of and interactive access to the linked database. Several such iframes are combined on a single Harvester protein page, which allows convenient side-by-side comparison of information from several databases.
* NCBI-BLAST, an algorithm for comparing biological sequences, from the NCBI
* Ensembl, automatic gene annotation by the EMBL-EBI and Sanger Institute
* FlyBase, a database of the model organism Drosophila melanogaster
* GoPubMed, a knowledge-based search engine for biomedical texts
* iHOP, information hyperlinked over proteins via gene/protein synonyms
* Mendelian Inheritance in Man, a project that catalogues known genetic diseases
* RZPD, German Resource Center for Genome Research in Berlin/Heidelberg
* STRING, Search Tool for the Retrieval of Interacting Genes/Proteins, developed by EMBL, SIB and UZH
* Zebrafish Information Network
* LOCATE, subcellular localization database (mouse)
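The iframe approach described in this section can be sketched in Python as follows. This is only an illustration of combining several embedded views on one page: the `render_protein_page` function and the example.org URLs are placeholders, not Harvester's real code or the real query URLs of the linked databases.

```python
from html import escape


def render_protein_page(title, frames):
    """Return an HTML page that embeds each (label, url) pair in an iframe."""
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        f"<title>{escape(title)}</title>",
        "</head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for label, url in frames:
        parts.append(f"<h2>{escape(label)}</h2>")
        # Escape the URL so that quotes or ampersands cannot break the attribute.
        parts.append(
            f'<iframe src="{escape(url, quote=True)}" '
            'width="100%" height="400"></iframe>'
        )
    parts.extend(["</body>", "</html>"])
    return "\n".join(parts)


# Placeholder URLs: the real databases use their own query formats.
frames = [
    ("STRING", "https://example.org/string?id=Q9NPD3"),
    ("Ensembl", "https://example.org/ensembl?id=Q9NPD3"),
]
page = render_protein_page("Q9NPD3", frames)
print(page)
```

Because each iframe loads the external site directly, the embedded content stays current without being copied into the index. The trade-off is that a page only works while the linked sites remain reachable and permit embedding.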


Access from external application

* Genome browser, working draft assemblies for genomes, from UCSC
* Google Scholar
* Mitocheck
* PolyMeta, meta search engine for Google, Yahoo, MSN, Ask, Exalead, AllTheWeb, GigaBlast


What one can find

Harvester allows a combination of different search terms and single words. Search examples:
* Gene name: "golga3"
* Gene alias: "ADAP-S ADAS ADHAPS ADPS" (one gene name is sufficient)
* Gene Ontologies: "Enzyme linked receptor protein signaling pathway"
* Unigene cluster: "Hs.449360"
* GO annotation: "intra-Golgi transport"
* Molecular function: "protein kinase binding"
* Protein: "Q9NPD3"
* Protein domain: "SH2 sar"
* Protein localisation: "endoplasmic reticulum"
* Chromosome: "2q31"
* Disease relevant: use the word "diseaselink"
* Combinations: "golgi diseaselink" (finds all golgi proteins associated with a disease)
* mRNA: "AL136897"
* Word: "Cancer"
* Comment: "highly expressed in heart"
* Author: "Merkel, Schmidt"
* Publication or project: "cDNA sequencing project"
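The combined query "golgi diseaselink" suggests that every term must match a page. A minimal Python sketch of such a search over a toy inverted index is given below. The AND semantics is an assumption inferred from that single example, and the page identifiers, page texts and function names are hypothetical.

```python
# Hypothetical protein pages, reduced to a few keywords each.
PAGES = {
    "page-1": "golgi apparatus protein diseaselink",
    "page-2": "golgi intra-Golgi transport",
    "page-3": "endoplasmic reticulum diseaselink",
}


def build_index(pages):
    """Map each lower-cased term to the set of page ids containing it."""
    index = {}
    for page_id, text in pages.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every term of the query (assumed AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = build_index(PAGES)
print(sorted(search(index, "golgi diseaselink")))
```

In this toy data only "page-1" contains both terms, so it is the only hit for the combined query, whereas "golgi" alone also matches "page-2".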


See also

* List of academic databases and search engines
* Biological databases
* Entrez
* European Bioinformatics Institute
* Human Protein Reference Database
* Metadata
* Sequence profiling tool



External links

* Bioinformatic Harvester V at KIT (Karlsruhe Institute of Technology)
* Harvester42 at KIT - integrating 50 general search engines, http://harvester42.fzk.de/ (dead link; archived 2013-01-06 at https://archive.today/20130106060017/http://harvester42.fzk.de/)