Human–computer Information Retrieval
   HOME

TheInfoList



OR:

Human–computer information retrieval (HCIR) is the study and engineering of
information retrieval Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other co ...
techniques that bring human intelligence into the
search Searching or search may refer to: Computing technology * Search algorithm, including keyword search ** :Search algorithms * Search and optimization for problem solving in artificial intelligence * Search engine technology, software for findi ...
process. It combines the fields of human-computer interaction (HCI) and information retrieval (IR) and creates systems that improve search by taking into account the human context, or through a multi-step search process that provides the opportunity for human feedback.


History

This term ''human–computer information retrieval'' was coined by
Gary Marchionini Gary Marchionini is an American information scientist and educator at the University of North Carolina at Chapel Hill (1998-present). Work Gary Marchionini is a leader in defining theory of human information interaction and exploratory search ...
in a series of lectures delivered between 2004 and 2006.Marchionini, G. (2006). Toward Human-Computer Information Retrieval Bulletin, in June/July 2006 Bulletin of the American Society for Information Science
/ref> Marchionini's main thesis is that "HCIR aims to empower people to explore large-scale information bases but demands that people also take responsibility for this control by expending cognitive and physical energy." In 1996 and 1998, a pair of workshops at the
University of Glasgow , image = UofG Coat of Arms.png , image_size = 150px , caption = Coat of arms Flag , latin_name = Universitas Glasguensis , motto = la, Via, Veritas, Vita , ...
on
information retrieval Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other co ...
and
human–computer interaction Human–computer interaction (HCI) is research in the design and the use of computer technology, which focuses on the interfaces between people (users) and computers. HCI researchers observe the ways humans interact with computers and design tec ...
sought to address the overlap between these two fields. Marchionini notes the impact of the
World Wide Web The World Wide Web (WWW), commonly known as the Web, is an information system enabling documents and other web resources to be accessed over the Internet. Documents and downloadable media are made available to the network through web se ...
and the sudden increase in
information literacy The Association of College & Research Libraries defines information literacy as a "set of integrated abilities encompassing the reflective discovery of information, the understanding of how information is produced and valued and the use of inform ...
– changes that were only embryonic in the late 1990s. A few workshops have focused on the intersection of IR and HCI. The Workshop on Exploratory Search, initiated by the
University of Maryland Human-Computer Interaction Lab A university () is an institution of higher (or tertiary) education and research which awards academic degrees in several academic disciplines. Universities typically offer both undergraduate and postgraduate programs. In the United States, th ...
in 2005, alternates between the
Association for Computing Machinery The Association for Computing Machinery (ACM) is a US-based international learned society for computing. It was founded in 1947 and is the world's largest scientific and educational computing society. The ACM is a non-profit professional member ...
Special Interest Group on Information Retrieval SIGIR is the Association for Computing Machinery's Special Interest Group on Information Retrieval. The scope of the group's specialty is the theory and application of computers to the acquisition, organization, storage, retrieval and distributi ...
(SIGIR) and Special Interest Group on Computer-Human Interaction (CHI) conferences. Also in 2005, the
European Science Foundation The European Science Foundation (ESF) is an association of 11 member organizations devoted to scientific research in 8 European countries. ESF is an independent, non-governmental, non-profit organisation that promotes the highest quality science ...
held an Exploratory Workshop on Information Retrieval in Context. Then, the first Workshop on Human Computer Information Retrieval was held in 2007 at the
Massachusetts Institute of Technology The Massachusetts Institute of Technology (MIT) is a private land-grant research university in Cambridge, Massachusetts. Established in 1861, MIT has played a key role in the development of modern technology and science, and is one of the ...
.


Description

HCIR includes various aspects of IR and HCI. These include
exploratory search Exploratory search is a specialization of information exploration which represents the activities carried out by searchers who are: * unfamiliar with the domain of their goal (i.e. need to learn about the topic in order to understand how to achieve ...
, in which users generally combine querying and browsing strategies to foster learning and investigation; information retrieval in context (i.e., taking into account aspects of the user or environment that are typically not reflected in a query); and interactive information retrieval, which Peter Ingwersen defines as "the interactive communication processes that occur during the retrieval of information by involving all the major participants in information retrieval (IR), i.e. the user, the intermediary, and the IR system." A key concern of HCIR is that IR systems intended for human users be implemented and evaluated in a way that reflects the needs of those users. Most modern IR systems employ a
ranked A ranking is a relationship between a set of items such that, for any two items, the first is either "ranked higher than", "ranked lower than" or "ranked equal to" the second. In mathematics, this is known as a weak order or total preorder of ...
retrieval model, in which the documents are scored based on the
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
of the document's
relevance Relevance is the concept of one topic being connected to another topic in a way that makes it useful to consider the second topic when considering the first. The concept of relevance is studied in many different fields, including cognitive sci ...
to the query. In this model, the system only presents the top-ranked documents to the user. This systems are typically evaluated based on their
mean average precision Evaluation measures for an information retrieval (IR) system assess how well an index, search engine or database returns results from a collection of resources that satisfy a user's query. They are therefore fundamental to the success of informatio ...
over a set of benchmark queries from organizations like the
Text Retrieval Conference The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or ''tracks.'' It is co-sponsored by the National Institute of Standards and Technology (NIST) an ...
(TREC). Because of its emphasis in using human intelligence in the information retrieval process, HCIR requires different evaluation models – one that combines evaluation of the IR and HCI components of the system. A key area of research in HCIR involves evaluation of these systems. Early work on interactive information retrieval, such as Juergen Koenemann and Nicholas J. Belkin's 1996 study of different levels of interaction for automatic query reformulation, leverage the standard IR measures of
precision Precision, precise or precisely may refer to: Science, and technology, and mathematics Mathematics and computing (general) * Accuracy and precision, measurement deviation from true value and its scatter * Significant figures, the number of digit ...
and
recall Recall may refer to: * Recall (bugle call), a signal to stop * Recall (information retrieval), a statistical measure * ''ReCALL'' (journal), an academic journal about computer-assisted language learning * Recall (memory) * ''Recall'' (Overwatch ...
but apply them to the results of multiple iterations of user interaction, rather than to a single query response.Koenemann, J. and Belkin, N. J. (1996). A case for interaction: a study of interactive information retrieval behavior and effectiveness. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Common Ground (Vancouver, British Columbia, Canada, April 13–18, 1996). M. J. Tauber, Ed. CHI '96. ACM Press, New York, NY, 205-212
/ref> Other HCIR research, such as Pia Borlund's IIR evaluation model, applies a methodology more reminiscent of HCI, focusing on the characteristics of users, the details of experimental design, etc.Borlund, P. (2003). The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Information Research, 8(3), Paper 152
/ref>


Goals

HCIR researchers have put forth the following goals towards a system where the user has more control in determining relevant results.White, R., Capra, R., Golovchinsky, G., Kules, B., Smith, C., and Tunkelang, D. (2013). Introduction to Special Issue on Human-computer Information Retrieval. Journal of Information Processing and Management 49(5), 1053-1057
/ref> Systems should *no longer only deliver the relevant documents, but must also provide semantic information along with those documents *increase user responsibility as well as control; that is, information systems require human intellectual effort *have flexible architectures so they may evolve and adapt to increasingly more demanding and knowledgeable user bases *aim to be part of information ecology of personal and shared memories and tools rather than discrete standalone services *support the entire information life cycle (from creation to preservation) rather than only the dissemination or use phase *support tuning by end users and especially by information professionals who add value to information resources *be engaging and fun to use In short, information retrieval systems are expected to operate in the way that good libraries do. Systems should help users to bridge the gap between data or information (in the very narrow, granular sense of these terms) and knowledge (processed data or information that provides the context necessary to inform the next iteration of an information seeking process). That is, good libraries provide both the information a patron needs as well as a partner in the learning process — the
information professional An information professional or information specialist is someone who collects, records, organises, stores, preserves, retrieves, and disseminates printed or digital information. The service delivered to the client is known as an information service ...
— to navigate that information, make sense of it, preserve it, and turn it into knowledge (which in turn creates new, more informed information needs).


Techniques

The techniques associated with HCIR emphasize representations of information that use human intelligence to lead the user to relevant results. These techniques also strive to allow users to explore and digest the dataset without penalty, i.e., without expending unnecessary costs of time, mouse clicks, or context shift. Many
search engines A search engine is a software system designed to carry out web searches. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a ...
have features that incorporate HCIR techniques.
Spelling suggestion Spelling suggestion is a feature of many computer software applications used to suggest plausible replacements for words that are likely to have been misspelled. ''Spelling suggestion'' features are commonly included in Internet search engines, wor ...
s and automatic query reformulation provide mechanisms for suggesting potential search paths that can lead the user to relevant results. These suggestions are presented to the user, putting control of selection and interpretation in the user's hands.
Faceted search Faceted search is a technique that involves augmenting traditional search techniques with a faceted navigation system, allowing users to narrow down search results by applying multiple filters based on faceted classification of the items. It is som ...
enables users to navigate information hierarchically, going from a category to its sub-categories, but choosing the order in which the categories are presented. This contrasts with traditional taxonomies in which the hierarchy of categories is fixed and unchanging.
Faceted navigation Faceted search is a technique that involves augmenting traditional search techniques with a faceted navigation system, allowing users to narrow down search results by applying multiple filters based on faceted classification of the items. It is som ...
, like taxonomic navigation, guides users by showing them available categories (or facets), but does not require them to browse through a hierarchy that may not precisely suit their needs or way of thinking. Lookahead provides a general approach to penalty-free exploration. For example, various
web applications A web application (or web app) is application software that is accessed using a web browser. Web applications are delivered on the World Wide Web to users with an active network connection. History In earlier computing models like client-serve ...
employ
AJAX Ajax may refer to: Greek mythology and tragedy * Ajax the Great, a Greek mythological hero, son of King Telamon and Periboea * Ajax the Lesser, a Greek mythological hero, son of Oileus, the king of Locris * ''Ajax'' (play), by the ancient Greek ...
to automatically complete query terms and suggest popular searches. Another common example of lookahead is the way in which search engines annotate results with summary information about those results, including both static information (e.g.,
metadata Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive metadata – the descriptive ...
about the objects) and "snippets" of document text that are most pertinent to the words in the search query. Relevance feedback allows users to guide an IR system by indicating whether particular results are more or less relevant.Rocchio, J. (1971). Relevance feedback in information retrieval. In: Salton, G (ed), The SMART Retrieval System. Summarization and
analytics Analytics is the systematic computational analysis of data or statistics. It is used for the discovery, interpretation, and communication of meaningful patterns in data. It also entails applying data patterns toward effective decision-making. It ...
help users digest the results that come back from the query. Summarization here is intended to encompass any means of aggregating or compressing the query results into a more human-consumable form. Faceted search, described above, is one such form of summarization. Another is clustering, which analyzes a set of documents by grouping similar or co-occurring documents or terms. Clustering allows the results to be partitioned into groups of related documents. For example, a search for "java" might return clusters for
Java (programming language) Java is a high-level, class-based, object-oriented programming language that is designed to have as few implementation dependencies as possible. It is a general-purpose programming language intended to let programmers ''write once, run anywh ...
,
Java (island) Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
, or Java (coffee). Visual representation of data is also considered a key aspect of HCIR. The representation of summarization or analytics may be displayed as tables, charts, or summaries of aggregated data. Other kinds of
information visualization Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, a ...
that allow users access to summary views of search results include
tag clouds A tag cloud (also known as a word cloud, wordle or weighted list in visual design) is a visual representation of text data, which is often used to depict keyword metadata on websites, or to visualize free form text. Tags are usually single word ...
and
treemapping In information visualization and computing, treemapping is a method for displaying hierarchical data using nested figures, usually rectangles. Treemaps display hierarchical ( tree-structured) data as a set of nested rectangles. Each branch of ...
.


Related areas

* Exploratory video search *
Information foraging Information foraging is a theory that applies the ideas from optimal foraging theory to understand how human users search for information. The theory is based on the assumption that, when searching for information, humans use "built-in" foraging mec ...


References


External links

* * {{DEFAULTSORT:Human-computer information retrieval Information retrieval genres Human–computer interaction