HOME

TheInfoList



OR:

Information Hyperlinked over Proteins (or iHOP) is an online
text mining Text mining, also referred to as ''text data mining'', similar to text analytics, is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extract ...
service that provides a gene-guided network to access
PubMed PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintain t ...
abstracts. The service was established by Robert Hoffmann and
Alfonso Valencia Alfonso Valencia is a Spanish biologist, ICREA Professor, current director of the Life Sciences department at Barcelona Supercomputing Center. and of Spanish National Bioinformatics Institute (INB-ISCIII). From 2015-2018, he was President of the In ...
in 2004. The concept underlying iHOP is that by using
gene In biology, the word gene (from , ; "... Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a b ...
s and
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
s as hyperlinks between sentences and abstracts, the information in PubMed can be converted into one navigable resource. Navigating across interrelated sentences within this network rather than the use of conventional keyword searches allows for stepwise and controlled acquisition of information. Moreover, this literature network can be superimposed upon experimental interaction data to facilitate the simultaneous analysis of novel and existing knowledge. As of September 2014, the network presented in iHOP contains 28.4 million sentences and 110,000 genes from over 2,700 organisms, including the model organisms ''
Homo sapiens Humans (''Homo sapiens'') are the most abundant and widespread species of primate, characterized by bipedalism and exceptional cognitive skills due to a large and complex brain. This has enabled the development of advanced tools, culture, ...
'', ''
Mus musculus Mus or MUS may refer to: Abbreviations * MUS, the NATO country code for Mauritius * MUS, the IATA airport code for Minami Torishima Airport * MUS, abbreviation for the Centre for Modern Urban Studies on Campus The Hague, Leiden University, Net ...
'', ''
Drosophila melanogaster ''Drosophila melanogaster'' is a species of fly (the taxonomic order Diptera) in the family Drosophilidae. The species is often referred to as the fruit fly or lesser fruit fly, or less commonly the " vinegar fly" or "pomace fly". Starting with ...
'', '' Caenorhabditis elegans'', ''
Danio rerio The zebrafish (''Danio rerio'') is a freshwater fish belonging to the minnow family (Cyprinidae) of the order Cypriniformes. Native to South Asia, it is a popular aquarium fish, frequently sold under the trade name zebra danio (and thus often ca ...
'', '' Arabidopsis thaliana'', ''
Saccharomyces cerevisiae ''Saccharomyces cerevisiae'' () (brewer's yeast or baker's yeast) is a species of yeast (single-celled fungus microorganisms). The species has been instrumental in winemaking, baking, and brewing since ancient times. It is believed to have b ...
'' and ''
Escherichia coli ''Escherichia coli'' (),Wells, J. C. (2000) Longman Pronunciation Dictionary. Harlow ngland Pearson Education Ltd. also known as ''E. coli'' (), is a Gram-negative, facultative anaerobic, rod-shaped, coliform bacterium of the genus '' Esc ...
''. The iHOP system has shown that by navigating from gene to gene, distant medical and biological concepts may be connected by only a small number of genes; the shortest path between two genes has been shown to involve on average four intermediary genes. The iHOP system architecture consists of two separate parts: the 'iHOP factory' and the web application. The iHOP factory manages the PubMed source data (text and gene data) and organises it within a PostgreSQL relational database. The iHOP factory also produces the relevant
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable ...
output for display by the web application. iHOP is free to use and is licensed under a Creative Commons BY-ND license.


References

{{reflist


External links


iHOP server
Bioinformatics Medical search engines