Adversarial information retrieval (adversarial IR) is a topic in information retrieval related to strategies for working with a data source where some portion of it has been manipulated maliciously. Tasks can include gathering, indexing, filtering, retrieving and ranking information from such a data source. Adversarial IR includes the study of methods to detect, isolate, and defeat such manipulation.
On the Web, the predominant form of such manipulation is search engine spamming (also known as spamdexing), which involves employing various techniques to disrupt the activity of web search engines, usually for financial gain. Examples of spamdexing are link-bombing, comment or referrer spam, spam blogs (splogs), and malicious tagging. Reverse engineering of ranking algorithms, advertisement blocking, click fraud, and web content filtering may also be considered forms of adversarial data manipulation.
Topics
Topics related to Web spam (spamdexing):
* Link spam
* Keyword spamming
* Cloaking
* Malicious tagging
* Spam related to blogs, including comment spam, splogs, and ping spam
Other topics:
* Click fraud detection
* Reverse engineering of search engines' ranking algorithms
* Web content filtering
* Advertisement blocking
* Stealth crawling
* Internet trolling
* Malicious tagging or voting in social networks
* Astroturfing
* Sockpuppetry
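As a rough illustration of the kind of detection task these topics involve, keyword spamming (listed above under Web spam) can be approximated by checking whether a single word dominates a page's text. The following is a minimal sketch only, not a method described by any source cited here: the function names `top_term_share` and `looks_keyword_stuffed` and the 0.3 threshold are hypothetical choices made for the example, and practical spam detection generally combines many more signals.

```python
import re
from collections import Counter
from typing import Tuple


def top_term_share(text: str) -> Tuple[str, float]:
    """Return the most frequent word in `text` and its share of all words."""
    # Lowercase and split into alphanumeric tokens; punctuation is ignored.
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return "", 0.0
    term, count = Counter(words).most_common(1)[0]
    return term, count / len(words)


def looks_keyword_stuffed(text: str, threshold: float = 0.3) -> bool:
    """Flag text whose most frequent word exceeds `threshold` of all words.

    The default threshold is an arbitrary illustrative value (an assumption),
    not a tuned or published one.
    """
    _, share = top_term_share(text)
    return share > threshold


if __name__ == "__main__":
    stuffed = "cheap watches cheap watches buy cheap watches cheap cheap"
    normal = ("Information retrieval is the process of obtaining relevant "
              "resources from a collection")
    print(looks_keyword_stuffed(stuffed))
    print(looks_keyword_stuffed(normal))
```

A heuristic this simple is easy for an adversary to evade, for example by padding a page with unrelated filler text, which is one reason the field treats detection as an ongoing contest rather than a solved problem.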
History
The term "adversarial information retrieval" was coined in 2000 by Andrei Broder (then Chief Scientist at Alta Vista) during the Web plenary session at the TREC-9 conference. [D. Hawking and N. Craswell (2004). "Very Large Scale Retrieval and Web Search" (preprint version).]
See also
* Information retrieval
* Spamdexing
External links
* AIRWeb: series of workshops on Adversarial Information Retrieval on the Web
* Web Spam Challenge: competition for researchers on Web Spam Detection
* Web Spam Datasets: datasets for research on Web Spam Detection