Human–computer information retrieval (HCIR) is the study and engineering of information retrieval techniques that bring human intelligence into the search process. It combines the fields of human-computer interaction (HCI) and information retrieval (IR) and creates systems that improve search by taking into account the human context, or through a multi-step search process that provides the opportunity for human feedback.


History

The term "human–computer information retrieval" was coined by Gary Marchionini in a series of lectures delivered between 2004 and 2006 [Marchionini, G. (2006). Toward Human-Computer Information Retrieval. Bulletin of the American Society for Information Science, June/July 2006]. Marchionini's main thesis is that "HCIR aims to empower people to explore large-scale information bases but demands that people also take responsibility for this control by expending cognitive and physical energy."

In 1996 and 1998, a pair of workshops at the University of Glasgow on information retrieval and human–computer interaction sought to address the overlap between these two fields. Marchionini notes the impact of the World Wide Web and the sudden increase in information literacy, changes that were only embryonic in the late 1990s.

A few workshops have focused on the intersection of IR and HCI. The Workshop on Exploratory Search, initiated by the University of Maryland Human-Computer Interaction Lab in 2005, alternates between the Association for Computing Machinery Special Interest Group on Information Retrieval (SIGIR) and Special Interest Group on Computer-Human Interaction (CHI) conferences. Also in 2005, the European Science Foundation held an Exploratory Workshop on Information Retrieval in Context. The first Workshop on Human Computer Information Retrieval was then held in 2007 at the Massachusetts Institute of Technology.


Description

HCIR includes various aspects of IR and HCI. These include exploratory search, in which users generally combine querying and browsing strategies to foster learning and investigation; information retrieval in context (i.e., taking into account aspects of the user or environment that are typically not reflected in a query); and interactive information retrieval, which Peter Ingwersen defines as "the interactive communication processes that occur during the retrieval of information by involving all the major participants in information retrieval (IR), i.e. the user, the intermediary, and the IR system."

A key concern of HCIR is that IR systems intended for human users be implemented and evaluated in a way that reflects the needs of those users. Most modern IR systems employ a ranked retrieval model, in which documents are scored based on the probability of the document's relevance to the query. In this model, the system presents only the top-ranked documents to the user. These systems are typically evaluated based on their mean average precision over a set of benchmark queries from organizations like the Text Retrieval Conference (TREC).

Because of its emphasis on using human intelligence in the information retrieval process, HCIR requires a different evaluation model, one that combines evaluation of the IR and HCI components of the system. A key area of research in HCIR involves evaluation of these systems. Early work on interactive information retrieval, such as Juergen Koenemann and Nicholas J. Belkin's 1996 study of different levels of interaction for automatic query reformulation, leverages the standard IR measures of precision and recall but applies them to the results of multiple iterations of user interaction, rather than to a single query response [Koenemann, J. and Belkin, N. J. (1996). A case for interaction: a study of interactive information retrieval behavior and effectiveness. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Common Ground (Vancouver, British Columbia, Canada, April 13–18, 1996). M. J. Tauber, Ed. CHI '96. ACM Press, New York, NY, 205-212]. Other HCIR research, such as Pia Borlund's IIR evaluation model, applies a methodology more reminiscent of HCI, focusing on the characteristics of users, the details of experimental design, etc. [Borlund, P. (2003). The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Information Research, 8(3), Paper 152].
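
The ranked-retrieval measures mentioned above (precision, recall, and mean average precision) can be sketched in a few lines of Python. This is a minimal illustration, not TREC's evaluation tooling: the function names and the representation of a run as a ranked list of document IDs plus a set of relevant IDs are assumptions made for the example.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def precision_recall_at_k(ranked: Sequence[str], relevant: set[str], k: int) -> tuple[float, float]:
    """Precision and recall over the top-k documents of one ranked list."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked: Sequence[str], relevant: set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs: Iterable[tuple[Sequence[str], set[str]]]) -> float:
    """Average of average_precision over (ranked list, relevant set) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)
```

An interactive evaluation in the style of the 1996 study could, under the same assumptions, call `precision_recall_at_k` once per round of user interaction instead of once per query.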


Goals

HCIR researchers have put forth the following goals towards a system where the user has more control in determining relevant results [White, R., Capra, R., Golovchinsky, G., Kules, B., Smith, C., and Tunkelang, D. (2013). Introduction to Special Issue on Human-computer Information Retrieval. Journal of Information Processing and Management 49(5), 1053-1057]. Systems should

* no longer only deliver the relevant documents, but also provide semantic information along with those documents
* increase user responsibility as well as control; that is, information systems require human intellectual effort
* have flexible architectures so they may evolve and adapt to increasingly more demanding and knowledgeable user bases
* aim to be part of an information ecology of personal and shared memories and tools rather than discrete standalone services
* support the entire information life cycle (from creation to preservation) rather than only the dissemination or use phase
* support tuning by end users and especially by information professionals who add value to information resources
* be engaging and fun to use

In short, information retrieval systems are expected to operate in the way that good libraries do. Systems should help users to bridge the gap between data or information (in the very narrow, granular sense of these terms) and knowledge (processed data or information that provides the context necessary to inform the next iteration of an information seeking process). That is, good libraries provide both the information a patron needs and a partner in the learning process (the information professional) to navigate that information, make sense of it, preserve it, and turn it into knowledge (which in turn creates new, more informed information needs).


Techniques

The techniques associated with HCIR emphasize representations of information that use human intelligence to lead the user to relevant results. These techniques also strive to allow users to explore and digest the dataset without penalty, i.e., without expending unnecessary costs of time, mouse clicks, or context shift.

Many search engines have features that incorporate HCIR techniques. Spelling suggestions and automatic query reformulation provide mechanisms for suggesting potential search paths that can lead the user to relevant results. These suggestions are presented to the user, putting control of selection and interpretation in the user's hands.

Faceted search enables users to navigate information hierarchically, going from a category to its sub-categories, but choosing the order in which the categories are presented. This contrasts with traditional taxonomies in which the hierarchy of categories is fixed and unchanging.
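
A small sketch may make the idea of faceted search concrete: filters on different facets can be applied in any order, and the remaining facet values are recounted after each choice. The catalog, the facet names (`format`, `topic`), and both function names are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

from collections import Counter

# Hypothetical catalog; each item carries a value for every facet.
ITEMS = [
    {"title": "Java in Practice", "format": "book", "topic": "programming"},
    {"title": "Coffee of Java", "format": "book", "topic": "travel"},
    {"title": "Learning Search", "format": "video", "topic": "programming"},
    {"title": "Island Journeys", "format": "video", "topic": "travel"},
]


def apply_filters(items: list[dict], selected: dict[str, str]) -> list[dict]:
    """Keep the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items: list[dict], facets: list[str]) -> dict[str, Counter]:
    """Count how many of the given items carry each value of each facet."""
    return {
        facet: Counter(item[facet] for item in items if facet in item)
        for facet in facets
    }
```

Calling `facet_counts(apply_filters(ITEMS, {"format": "book"}), ["topic"])` shows the user which topics remain among books; choosing the topic first and the format second reaches the same final result set, which is the order-independence that distinguishes facets from a fixed taxonomy.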
Faceted navigation, like taxonomic navigation, guides users by showing them available categories (or facets), but does not require them to browse through a hierarchy that may not precisely suit their needs or way of thinking.

Lookahead provides a general approach to penalty-free exploration. For example, various web applications employ AJAX to automatically complete query terms and suggest popular searches. Another common example of lookahead is the way in which search engines annotate results with summary information about those results, including both static information (e.g., metadata about the objects) and "snippets" of document text that are most pertinent to the words in the search query.
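
The query-completion form of lookahead described above can be approximated on the server side by ranking past queries that share the typed prefix. The following is a minimal sketch under stated assumptions: the query log is a plain list of strings, popularity is a raw count, and the function names `build_query_counts` and `suggest` are invented for this example rather than taken from any particular search engine.

```python
from __future__ import annotations

from collections import Counter


def build_query_counts(query_log: list[str]) -> Counter:
    """Count normalized (trimmed, lower-cased) queries from a log of past searches."""
    return Counter(q.strip().lower() for q in query_log if q.strip())


def suggest(prefix: str, query_counts: Counter, limit: int = 5) -> list[str]:
    """Return up to `limit` past queries starting with `prefix`, most popular first."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []  # nothing typed yet, so nothing to complete
    matches = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    # Sort by descending popularity, then alphabetically for a stable order.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [q for q, _ in matches[:limit]]
```

A web front end would typically call something like `suggest` on each keystroke and render the returned list below the search box, leaving the choice of whether to follow a suggestion to the user. A linear scan is used here for brevity; a trie or sorted index would be a more likely choice for a large log.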
Relevance feedback allows users to guide an IR system by indicating whether particular results are more or less relevant [Rocchio, J. (1971). Relevance feedback in information retrieval. In: Salton, G (ed), The SMART Retrieval System].

Summarization and analytics help users digest the results that come back from the query. Summarization here is intended to encompass any means of aggregating or compressing the query results into a more human-consumable form. Faceted search, described above, is one such form of summarization. Another is clustering, which analyzes a set of documents by grouping similar or co-occurring documents or terms. Clustering allows the results to be partitioned into groups of related documents. For example, a search for "java" might return clusters for Java (programming language), Java (island), or Java (coffee).

Visual representation of data is also considered a key aspect of HCIR. The representation of summarization or analytics may be displayed as tables, charts, or summaries of aggregated data. Other kinds of information visualization that allow users access to summary views of search results include tag clouds and treemapping.
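
Returning to relevance feedback: the classic formulation associated with the Rocchio reference cited above moves the query vector toward the centroid of documents the user marked relevant and away from the centroid of those marked non-relevant. The sketch below assumes sparse term-weight dictionaries as the vector representation; the weights `alpha`, `beta`, and `gamma` are illustrative defaults rather than values given in this article, and dropping non-positive weights is a common convention adopted here as an assumption.

```python
from __future__ import annotations

from collections import defaultdict

Vector = dict  # sparse term -> weight mapping (an assumption of this sketch)


def centroid(vectors: list[Vector]) -> Vector:
    """Component-wise mean of a list of sparse vectors (empty list gives empty vector)."""
    if not vectors:
        return {}
    total: defaultdict = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rocchio(query: Vector, relevant: list[Vector], non_relevant: list[Vector],
            alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> Vector:
    """Reweight a query using user judgments on previously returned documents."""
    updated: defaultdict = defaultdict(float)
    for term, weight in query.items():
        updated[term] += alpha * weight
    for term, weight in centroid(relevant).items():
        updated[term] += beta * weight
    for term, weight in centroid(non_relevant).items():
        updated[term] -= gamma * weight
    # Terms pushed to zero or below are dropped from the new query.
    return {term: weight for term, weight in updated.items() if weight > 0}
```

For a "java" query where the user marks programming documents as relevant and an island document as non-relevant, the updated query gains weight on programming-related terms and never acquires "island", which steers the next round of results toward the intended sense.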


Related areas

* Exploratory video search
* Information foraging

