Adversarial information retrieval

Adversarial information retrieval (adversarial IR) is a topic in information retrieval concerned with strategies for working with a data source of which some portion has been manipulated maliciously. Tasks can include gathering, indexing, filtering, retrieving and ranking information from such a data source. Adversarial IR includes the study of methods to detect, isolate, and defeat such manipulation.

On the Web, the predominant form of such manipulation is search engine spamming (also known as spamdexing), which involves employing various techniques to disrupt the activity of web search engines, usually for financial gain. Examples of spamdexing are link-bombing, comment or referrer spam, spam blogs (splogs), and malicious tagging. Reverse engineering of ranking algorithms, advertisement blocking, click fraud, and web content filtering may also be considered forms of adversarial data manipulation.


Topics

Topics related to Web spam (spamdexing):
* Link spam
* Keyword spamming
* Cloaking
* Malicious tagging
* Spam related to blogs, including comment spam, splogs, and ping spam

Other topics:
* Click fraud detection
* Reverse engineering of a search engine's ranking algorithm
* Web content filtering
* Advertisement blocking
* Stealth crawling
* Troll (Internet)
* Malicious tagging or voting in social networks
* Astroturfing
* Sockpuppetry
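
As an illustration of how one of the topics above, keyword spamming, might be detected, the following minimal sketch flags a page when a single term accounts for an unusually large share of its words. The function name `keyword_density_flag`, the 0.2 threshold, and the 20-word minimum are assumptions chosen for illustration and are not taken from the literature; practical detectors generally combine many signals rather than relying on one ratio.

```python
import re
from collections import Counter


def keyword_density_flag(text, threshold=0.2, min_words=20):
    """Return (flagged, top_term, density) for a page's visible text.

    Hypothetical heuristic: flag the page when the single most frequent
    term makes up more than `threshold` of all words.
    """
    # Lowercase the text and split it into alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < min_words:
        # Too little text to judge; do not flag.
        return False, None, 0.0
    # Most frequent term and the fraction of all words it represents.
    term, count = Counter(words).most_common(1)[0]
    density = count / len(words)
    return density > threshold, term, density
```

Calling `keyword_density_flag("cheap watches " * 15 + "buy now")` flags the text, because one term makes up nearly half of its words, whereas ordinary prose of similar length typically stays well below the assumed threshold.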


History

The term "adversarial information retrieval" was coined in 2000 by Andrei Broder (then Chief Scientist at Alta Vista) during the Web plenary session at the TREC-9 conference (D. Hawking and N. Craswell, 2004, "Very Large Scale Retrieval and Web Search", preprint version).


See also

* Information retrieval
* Spamdexing


References


External links


* AIRWeb: series of workshops on Adversarial Information Retrieval on the Web
* Web Spam Challenge: competition for researchers on Web Spam Detection
* Web Spam Datasets: datasets for research on Web Spam Detection