Isearch is open-source
text retrieval
software first developed in 1994 by Nassib Nassar as part of the Isite Z39.50 information framework. The project started at the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR) of the North Carolina supercomputing center MCNC and was funded by the
National Science Foundation
to follow in the track of WAIS and develop prototype systems for distributed information networks encompassing Internet applications, library catalogs and other information resources. The main features of Isearch include full-text and field searching, relevance ranking, Boolean queries, and support for many document types such as HTML, mail folders, list digests, MEDLINE, BibTeX, SGML/XML, FGDC Metadata, NASA DIF, ANZLIC metadata, ISO 19115 metadata and many other resource types and document formats. It was the first search engine designed from the ground up to support SGML and Z39.50 search and retrieval. Its innovations included the "document type" model, an object-oriented method of associating each document with a class of functions that provides a standard interface for accessing the document. It was one of the first engines (if not the first) to support XML. The Isearch search and indexing algorithms were based on
Gaston Gonnet
's seminal work on PAT arrays and trees for text retrieval, ideas that were developed for the New Oxford English Dictionary Project at the University of Waterloo and that provided the seeds for
Tim Bray
's PAT SGML engine that formed the basis of
Open Text
. One limiting factor of the Isearch design, however, was that it was not well suited to the extremely large data sets that became common in the mid-to-late 1990s. In many cases Isearch was adapted or modified to use different algorithms, but it usually retained the document type model and the architectural relationship with Isite. Isearch was widely adopted and used in hundreds of public search sites, including many high-profile projects such as the
U.S. Patent and Trademark Office
(USPTO) patent search, the Federal Geographic Data Clearinghouse (FGDC), the NASA Global Change Master Directory, the NASA EOS Guide System, the NASA Catalog Interoperability Project, the astronomical pre-print service based at the
Space Telescope Science Institute
, the PCT Electronic Gazette at the
World Intellectual Property Organization
(WIPO), Linsearch (a search engine for open-source software designed by Miles Efron), the SAGE Project of the Special Collections Department at Emory University, Eco Companion Australasia (an environmental geospatial resources catalog), the Australian National Genomic Information Service (ANGIS), the Open Directory Project and numerous governmental portals in the context of the Government Information Locator Service (GILS) United States Government Printing Office (GPO) mandate (ended in 2005?).

From 1994 to 1998 most of the development was centered on the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR) in North Carolina (engine core) and BSn in Germany (doctypes). By 1998 many of the open-source Isearch core developers had refocused their efforts on several spin-offs. In 1998 Isearch became part of the Advanced Search Facility reference software platform funded by the U.S. Department of Commerce. A/WWW Enterprises now maintains the open-source version for public use, supported by paying government clients such as the U.S. Patent and Trademark Office, NASA and the FGDC, which have provided support to enhance the functionality and reliability of the software. The software suite is considered a reference implementation of catalog service software. As of 2010, the open-source version of Isearch was still in use on more than 250 FGDC nodes, by ANZLIC in Australia, and by selected Geospatial OneStop (GOS) contributors to facilitate harvesting by GOS, including NOAA, the Census Bureau and the Tennessee Field Office of the US Fish and Wildlife Service, among others.
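
The "document type" model described above can be illustrated with a minimal Python sketch. This is not Isearch's actual code or API: the class names, the `fields` method and the `REGISTRY` table are hypothetical, chosen only to show how each document format can be bound to a class that exposes the same interface to the indexer.

```python
from abc import ABC, abstractmethod
from typing import Dict
import re


class DocType(ABC):
    """Hypothetical base class: one standard interface per document format."""

    @abstractmethod
    def fields(self, raw: str) -> Dict[str, str]:
        """Return a mapping of field name to field text for one document."""


class MailDocType(DocType):
    """Treats 'Name: value' header lines as fields and the rest as the body."""

    def fields(self, raw: str) -> Dict[str, str]:
        header, _, body = raw.partition("\n\n")
        out = {"body": body.strip()}
        for line in header.splitlines():
            name, sep, value = line.partition(":")
            if sep:  # keep only well-formed header lines
                out[name.strip().lower()] = value.strip()
        return out


class HtmlDocType(DocType):
    """Extracts the <title> element and a tag-stripped body (naive regexes)."""

    def fields(self, raw: str) -> Dict[str, str]:
        match = re.search(r"<title>(.*?)</title>", raw, re.I | re.S)
        text = re.sub(r"<[^>]+>", " ", raw)
        return {
            "title": match.group(1).strip() if match else "",
            "body": " ".join(text.split()),
        }


# Hypothetical registry: the indexer only needs the doctype name.
REGISTRY = {"mail": MailDocType(), "html": HtmlDocType()}


def index_document(doctype_name: str, raw: str) -> Dict[str, str]:
    """Dispatch to the doctype class; the caller never parses formats itself."""
    return REGISTRY[doctype_name].fields(raw)
```

Under this arrangement, supporting a new format means adding one class and one registry entry, while the search and indexing code stays unchanged.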

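The PAT array idea mentioned above (a sorted array of text positions, searched by binary search) can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm as implemented in Isearch: the function names are hypothetical, the construction is deliberately naive, and matching is case-sensitive.

```python
import re
from typing import List


def build_pat_array(text: str) -> List[int]:
    """Return word-start offsets, sorted by the text suffix beginning at each.

    Naive construction: sorting by full suffixes is fine for small inputs
    but would be too slow and memory-hungry for large collections.
    """
    starts = [m.start() for m in re.finditer(r"\w+", text)]
    return sorted(starts, key=lambda i: text[i:])


def find(text: str, pat_array: List[int], query: str) -> List[int]:
    """Return sorted offsets where `query` occurs starting at a word boundary."""
    n = len(query)
    lo, hi = 0, len(pat_array)
    # Binary search for the first suffix whose n-character prefix is >= query.
    while lo < hi:
        mid = (lo + hi) // 2
        start = pat_array[mid]
        if text[start:start + n] < query:
            lo = mid + 1
        else:
            hi = mid
    # All matching suffixes are adjacent in the sorted array.
    hits = []
    while lo < len(pat_array):
        start = pat_array[lo]
        if text[start:start + n] != query:
            break
        hits.append(start)
        lo += 1
    return sorted(hits)
```

Because the suffixes are sorted, every occurrence of a word, prefix or multi-word phrase sits in one contiguous run of the array, which is why a single binary search can answer all three kinds of query.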
