vertical search

A vertical search engine is distinct from a general web search engine in that it focuses on a specific segment of online content. Vertical search engines are also called specialty or topical search engines. The vertical content area may be based on topicality, media type, or genre of content. Common verticals include shopping, the automotive industry, legal information, medical information, scholarly literature, job search and travel. Examples of vertical search engines include the Library of Congress, Mocavo, Nuroa, Trulia, and Yelp. In contrast to general web search engines, which attempt to index large portions of the World Wide Web using a web crawler, vertical search engines typically use a focused crawler, which attempts to index only web pages relevant to a pre-defined topic or set of topics. Some vertical search sites focus on individual verticals, while other sites include multiple vertical searches within one search engine.
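The idea of a focused crawler can be illustrated with a minimal sketch. It is not the method of any particular search engine: the in-memory `TOY_WEB`, the keyword set `TOPIC_TERMS`, the `relevance` scoring rule, and the threshold are all hypothetical choices made for illustration, and a real crawler would fetch pages over HTTP and use a trained classifier. The sketch keeps a priority queue as the crawl frontier, explores links found on more relevant pages first, and does not follow links from pages that fall below the relevance threshold.

```python
import heapq

# Hypothetical toy web: URL -> (page text, outgoing links).
# A real crawler would download pages instead of reading a dict.
TOY_WEB = {
    "seed": ("real estate listings portal", ["a", "b", "c"]),
    "a": ("apartment rental listings in the city", ["d"]),
    "b": ("football scores and sports news", ["e"]),
    "c": ("house for sale with garden", ["f"]),
    "d": ("holiday rental apartment by the sea", []),
    "e": ("celebrity gossip", []),
    "f": ("mortgage calculator for house sale", []),
}

# Assumed topic vocabulary for a real-estate vertical.
TOPIC_TERMS = {"rental", "sale", "apartment", "house", "listings", "estate"}


def relevance(text):
    """Fraction of words on the page that belong to the topic vocabulary."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in TOPIC_TERMS for w in words) / len(words)


def focused_crawl(seed, threshold=0.2, max_fetches=10):
    """Return the URLs indexed by a best-first crawl starting at `seed`."""
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, seed)]
    seen = {seed}
    indexed = []
    fetches = 0
    while frontier and fetches < max_fetches:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB[url]
        fetches += 1
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: do not index it or follow its links
        indexed.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Links inherit the relevance of the page they were found on.
                heapq.heappush(frontier, (-score, link))
    return indexed


if __name__ == "__main__":
    print(focused_crawl("seed"))
```

In this toy example the sports page `b` is fetched but not indexed, and the page `e` that is reachable only through it is never fetched at all, which is the pruning behaviour that distinguishes a focused crawl from an exhaustive one.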


Benefits

Vertical search offers several potential benefits over general search engines:
* Greater precision due to limited scope
* Leverage of domain knowledge, including taxonomies and ontologies
* Support of specific unique user tasks
Vertical search can be viewed as similar to enterprise search where the domain of focus is the enterprise, such as a company, government or other organization. In 2013, consumer price comparison websites with integrated vertical search engines such as FindTheBest drew large rounds of venture capital funding, indicating a growth trend for these applications of vertical search technology.
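One of the benefits listed above, the use of domain knowledge such as taxonomies, can be sketched as query expansion. The example below is a simplified illustration under stated assumptions: the `TAXONOMY` mapping, the `DOCS` collection, and the `expand` and `search` helpers are hypothetical and do not come from any specific vertical search engine. A query term is expanded to include all of its narrower terms before matching, so a search for a broad category also finds documents that mention only a specific subtype.

```python
# Hypothetical mini-taxonomy for a real-estate vertical:
# each term maps to its narrower (more specific) terms.
TAXONOMY = {
    "property": ["house", "apartment"],
    "apartment": ["studio", "loft"],
}

# Hypothetical document collection: id -> text.
DOCS = {
    1: "sunny loft near the station",
    2: "used car in good condition",
    3: "family house with garden",
}


def expand(term):
    """Return the term followed by all of its narrower terms, depth-first."""
    result = [term]
    for child in TAXONOMY.get(term, []):
        result.extend(expand(child))
    return result


def search(query):
    """Return sorted ids of documents containing the query or a narrower term."""
    terms = set(expand(query))
    return sorted(
        doc_id for doc_id, text in DOCS.items() if terms & set(text.split())
    )


if __name__ == "__main__":
    print(search("property"))
    print(search("apartment"))
```

None of the toy documents contains the literal word "property", so a plain keyword match would return nothing for that query, while the expanded query reaches the documents about a loft and a house.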


Domain-specific search

Domain-specific verticals focus on a specific topic. John Battelle describes this in his book ''The Search'' (2005):
"Domain-specific search solutions focus on one area of knowledge, creating customized search experiences, that because of the domain's limited corpus and clear relationships between concepts, provide extremely relevant results for searchers."
A general search engine indexes all pages, searching in a breadth-first manner to collect documents. The spidering in domain-specific search engines instead focuses on a particular set of documents, so it searches a small subset more efficiently. Spidering accomplished with a reinforcement-learning framework has been found to be three times more efficient than breadth-first search.


DARPA's Memex program

In early 2014, the Defense Advanced Research Projects Agency (DARPA) released a statement on its website outlining the preliminary details of the "Memex program", which aims to develop new search technologies that overcome some limitations of text-based search. DARPA wants the Memex technology developed in this research to be usable for search engines that can search for information on the deep web, the part of the Internet that is largely unreachable by commercial search engines like Google or Yahoo. DARPA's website states that "The goal is to invent better methods for interacting with and sharing information, so users can quickly and thoroughly organize and search subsets of information relevant to their individual interests". As reported in a 2015 ''Wired'' article, the search technology being developed in the Memex program "aims to shine a light on the dark web and uncover patterns and relationships in online data to help law enforcement and others track illegal activity". DARPA intends for the program to replace the centralized procedures used by commercial search engines, stating that the "creation of a new domain-specific indexing and search paradigm will provide mechanisms for improved content discovery, information extraction, information retrieval, user collaboration, and extension of current search capabilities to the deep web, the dark web, and nontraditional (e.g. multimedia) content". In its description of the program, DARPA explains the program's name as a tribute to Vannevar Bush's original Memex invention, which served as an inspiration. In April 2015, it was announced that parts of Memex would be open sourced, and modules were made available for download.

