RetrievalWare is an enterprise search engine emphasizing natural language processing and semantic networks. It was commercially available from 1992 to 2007 and is especially known for its use by government intelligence agencies.


History

RetrievalWare was initially created by Paul Nelson, Kenneth Clark, and Edwin Addison as part of ConQuest Software. Development began in 1989, but the software was not commercially available on a wide scale until 1992. Early funding was provided by Rome Laboratory via a Small Business Innovation Research grant. On July 6, 1995, ConQuest Software merged with the NASDAQ-listed company Excalibur Technologies, and the product was rebranded as RetrievalWare. On December 21, 2000, Excalibur Technologies was combined with Intel Corporation's Interactive Media Services division to form the Convera Corporation. Finally, on April 9, 2007, the RetrievalWare software and business were purchased by Fast Search & Transfer, at which point the product was officially retired. Microsoft Corporation, which acquired Fast Search & Transfer in 2008, continues to maintain the product for its existing customer base. Annual revenues for RetrievalWare peaked in 2001 at around US$40 million.


Use of natural language techniques

RetrievalWare is a relevancy-ranking text search system with processing enhancements drawn from the fields of natural language processing (NLP) and semantic networks. NLP algorithms include dictionary-based stemming (also known as lemmatisation) and dictionary-based phrase identification. RetrievalWare uses semantic networks to expand the query words entered by the user to related terms, with term weights determined by the distance from the user's original terms. In addition to automatic expansion, a feedback mode was available in which users could choose the meaning of a word before the expansion was performed. The first semantic networks were built using WordNet. In addition, RetrievalWare implemented a form of n-gram search (branded as APRP, Adaptive Pattern Recognition Processing), designed to search over documents with OCR errors. Query terms are divided into sets of 2-grams, which are used to locate similar terms in the inverted index. The resulting matches are weighted based on similarity measures and then used to search for documents. All of these features were available no later than 1993 (Site Report for the Text REtrieval Conference by ConQuest Software Inc., TREC-2), and ConQuest Software has claimed that it was the first commercial text-search system to implement these techniques.
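
The two retrieval ideas described above, distance-weighted query expansion over a semantic network and 2-gram matching of query terms against index terms, can be illustrated with a minimal Python sketch. This is not RetrievalWare's actual implementation: the toy network, the `decay ** distance` weighting, the Dice coefficient as the similarity measure, and all function names and parameters are assumptions chosen for illustration only.

```python
from collections import deque

# Hypothetical toy semantic network: each term maps to directly related terms.
NETWORK = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car"],
    "vehicle": ["car", "truck"],
    "truck": ["vehicle"],
}


def expand_query(term, network, max_distance=2, decay=0.5):
    """Expand a term breadth-first; weight = decay ** distance (assumed scheme)."""
    weights = {term: 1.0}
    queue = deque([(term, 0)])
    while queue:
        current, dist = queue.popleft()
        if dist == max_distance:
            continue  # do not expand beyond the distance limit
        for neighbor in network.get(current, []):
            if neighbor not in weights:
                weights[neighbor] = decay ** (dist + 1)
                queue.append((neighbor, dist + 1))
    return weights


def bigrams(term):
    """Return the set of character 2-grams of a term."""
    return {term[i:i + 2] for i in range(len(term) - 1)}


def similar_terms(query_term, index_terms, threshold=0.5):
    """Rank index terms by Dice coefficient over 2-gram sets (assumed measure)."""
    q = bigrams(query_term)
    scored = []
    for term in index_terms:
        t = bigrams(term)
        if not q or not t:
            continue  # terms shorter than two characters have no 2-grams
        score = 2 * len(q & t) / (len(q) + len(t))
        if score >= threshold:
            scored.append((term, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, `expand_query("car", NETWORK)` gives the original term weight 1.0, its direct neighbors 0.5, and "truck" (two steps away) 0.25, while `similar_terms("retrieval", ["retrieval", "retrieva1", "search"])` still matches the OCR-damaged form "retrieva1" because it shares seven of its eight 2-grams with the query term.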


Other notable features

Other notable features of RetrievalWare include distributed search servers, synchronizers for indexing external content management systems and relational databases, a heterogeneous security model, document categorization, real-time document-query matching (profiling), multi-lingual searches (queries containing terms from multiple languages searching for documents containing terms from multiple languages), and cross-lingual searches (queries in one language searching for documents in a different language).


Participation in TREC

RetrievalWare participated in the Text REtrieval Conference (TREC) in 1992 (TREC-1), 1993 (TREC-2), and 1995 (TREC-4). In TREC-1 (Site Report for the Text REtrieval Conference by ConQuest Software Inc., TREC-1) and TREC-4 (The Excalibur TREC-4 System, Preparations, and Results), the RetrievalWare runs for manually entered queries produced the best results, based on the 11-point averages, of all search engines that participated in the ad hoc category, in which search engines are allowed a single opportunity to process previously unknown queries against an existing database.
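
As a rough illustration of the metric mentioned above, the following Python sketch computes an 11-point interpolated average precision for a single ranked result list. It assumes the common textbook definition (interpolated precision averaged over recall levels 0.0, 0.1, ..., 1.0); the exact evaluation procedure used at TREC-1 and TREC-4 is not described in this article, and the function name and input format are hypothetical.

```python
def eleven_point_average(ranked_relevance, total_relevant):
    """11-point interpolated average precision for one query.

    ranked_relevance: booleans for the ranked results, best-ranked first.
    total_relevant: number of relevant documents in the whole collection.
    """
    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / rank))

    # Interpolated precision at a recall level is the highest precision
    # observed at any recall greater than or equal to that level.
    interpolated = []
    for level in range(11):
        recall_level = level / 10
        candidates = [p for r, p in points if r >= recall_level]
        interpolated.append(max(candidates, default=0.0))

    return sum(interpolated) / 11
```

A ranking that places every relevant document first scores 1.0, and the score drops as relevant documents appear lower in the list or are missed entirely.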



External links

* Marketing presentation on RetrievalWare semantic networks and adaptive pattern recognition algorithms