
Geographic information retrieval (GIR) systems, also called geographical information retrieval systems, are search tools for the Web, enterprise documents, and mobile local search that combine traditional text-based queries with location querying, such as a map or place names. Like traditional information retrieval systems, GIR systems index text and information from structured and unstructured documents, and they also augment those indices with geographic information. The development and engineering of GIR systems aims to build systems that can reliably answer queries that include a geographic dimension, such as "What wars were fought in Greece?" or "restaurants in Beirut". Semantic similarity and word-sense disambiguation are important components of GIR. To identify place names, GIR systems often rely on natural language processing or other metadata to associate text documents with locations. Such georeferencing, geotagging, and geoparsing tools often need databases of location names, known as gazetteers.
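The role of a gazetteer in identifying place names can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `GAZETTEER` dictionary, its approximate coordinates, and the `geoparse` function are hypothetical and do not come from any particular GIR system. Real geoparsers also have to handle multi-word names and ambiguous names, which this exact-match lookup ignores.

```python
import re

# Hypothetical toy gazetteer: lowercase place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "beirut": (33.89, 35.50),
    "greece": (39.07, 21.82),
    "athens": (37.98, 23.73),
}


def geoparse(text):
    """Return (name, (lat, lon)) for each single-word gazetteer entry found in text."""
    matches = []
    for token in re.findall(r"[A-Za-z]+", text):
        key = token.lower()
        if key in GAZETTEER:
            matches.append((key, GAZETTEER[key]))
    return matches


if __name__ == "__main__":
    print(geoparse("restaurants in Beirut"))
```

A dictionary lookup keeps the sketch short; it stands in for the natural language processing step, which in practice would decide whether a token really refers to a place before the gazetteer is consulted.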


GIR architecture

GIR involves extracting and resolving the meaning of locations in unstructured text. This is known as geoparsing. After identifying mentions of places and locations in text, a GIR system indexes this information for search and retrieval. GIR systems can commonly be broken down into the following stages: geoparsing, text and geographic indexing, data storage, geographic relevance ranking with respect to a geographic query, and browsing of results, commonly with a map interface. Some GIR systems separate text indexing from geographic indexing, which enables the use of generic database joins or multi-stage filtering; others combine the two for efficiency. GIR must manage several forms of uncertainty, including the semantic ambiguity of mentions of places in natural language text and the precision of positions.
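The separation of text indexing from geographic filtering can be sketched as a two-stage search followed by a distance-based ranking. This is only one possible arrangement, not the design of any specific system: the `Doc` record, the function names, the search radius, and the use of great-circle distance as the relevance score are all assumptions, and each document is assumed to have already been resolved to a single coordinate by geoparsing.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Doc:
    """Hypothetical document record with one resolved location."""
    doc_id: int
    text: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (Earth radius assumed to be 6371 km)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def build_text_index(docs):
    """Build an inverted index: lowercase term -> set of document ids."""
    index = {}
    for d in docs:
        for term in d.text.lower().split():
            index.setdefault(term, set()).add(d.doc_id)
    return index


def search(docs, text_index, term, lat, lon, radius_km):
    """Stage 1: filter by text. Stage 2: filter by distance. Then rank nearest first."""
    by_id = {d.doc_id: d for d in docs}
    candidates = text_index.get(term.lower(), set())
    hits = []
    for doc_id in candidates:
        d = by_id[doc_id]
        dist = haversine_km(lat, lon, d.lat, d.lon)
        if dist <= radius_km:
            hits.append((dist, doc_id))
    return [doc_id for _, doc_id in sorted(hits)]


if __name__ == "__main__":
    docs = [
        Doc(1, "seafood restaurant", 33.90, 35.48),
        Doc(2, "restaurant and bar", 37.98, 23.73),
        Doc(3, "museum", 33.89, 35.50),
        Doc(4, "family restaurant", 33.89, 35.51),
    ]
    index = build_text_index(docs)
    print(search(docs, index, "restaurant", 33.89, 35.50, 50))
```

Running the text filter first keeps the distance computation limited to documents that already match the query term; a combined index would trade this simplicity for speed.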


GIR systems

* MetaCarta created and patented one of the first commercial products to offer GIR capabilities.
* Frankenplace: a general-purpose geographic search engine.
* Web-a-where


Study & Evaluation

The study of GIR systems has a rich history dating back to the 1970s and possibly earlier. See Ray Larson's book "Geographic information retrieval and spatial browsing" for references to much of the pre-Web literature on GIR. In 2005 the Cross-Language Evaluation Forum added a geographic track, GeoCLEF. GeoCLEF was the first TREC-style evaluation forum for GIR systems and provided participants a chance to compare systems.


Applications

GIR has many applications in geoweb, neogeography, and mobile local search and has been a focus of many conferences, including the ESRI Users Conferences and O'Reilly's Where 2.0 conferences.


See also

* Geographic information system
* Geoparsing
* Information retrieval
* MetaCarta
* Semantic similarity
* Search engine (computing)
* Toponymy