Adversarial information retrieval (adversarial IR) is a topic in information retrieval related to strategies for working with a data source where some portion of it has been manipulated maliciously. Tasks can include gathering, indexing, filtering, retrieving and ranking information from such a data source. Adversarial IR includes the study of methods to detect, isolate, and defeat such manipulation.
On the Web, the predominant form of such manipulation is search engine spamming (also known as spamdexing), which involves employing various techniques to disrupt the activity of web search engines, usually for financial gain. Examples of spamdexing are link-bombing, comment or referrer spam, spam blogs (splogs), and malicious tagging. Reverse engineering of ranking algorithms, click fraud, and web content filtering may also be considered forms of adversarial data manipulation.
Topics
Topics related to Web spam (spamdexing):
* Link spam
* Keyword spamming
* Cloaking
* Malicious tagging
* Spam related to blogs, including comment spam, splogs, and ping spam
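As an illustration of what detecting one of these forms might look like, the following is a minimal sketch of a keyword-density heuristic for keyword spamming. It assumes that a page whose single most frequent word makes up an unusually large share of its text is suspicious. The function names, the `threshold` of 0.3 and the `min_words` cutoff are hypothetical values chosen for illustration, not figures taken from any search engine or published detector.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(text: str, top_n: int = 3) -> list[tuple[str, float]]:
    """Return the top_n most frequent words with their share of all words."""
    words = tokenize(text)
    if not words:
        return []
    total = len(words)
    counts = Counter(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


def looks_keyword_stuffed(
    text: str, threshold: float = 0.3, min_words: int = 10
) -> bool:
    """Flag text whose most frequent word exceeds `threshold` of all words.

    Both `threshold` and `min_words` are illustrative assumptions, not
    values tuned on real data.
    """
    if len(tokenize(text)) < min_words:
        return False  # too short to judge
    top = keyword_density(text, top_n=1)
    return top[0][1] > threshold


if __name__ == "__main__":
    page = "cheap watches cheap watches buy cheap watches cheap cheap watches online cheap"
    print(keyword_density(page))
    print(looks_keyword_stuffed(page))
```

A single-feature rule like this is easy to evade and prone to false positives, so it would at most serve as one signal among several.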
Other topics:
* Click fraud detection
* Reverse engineering of a search engine's ranking algorithm
* Web content filtering
* Advertisement blocking
* Stealth crawling
* Troll (Internet)
* Malicious tagging or voting in social networks
* Astroturfing
* Sockpuppetry
History
The term "adversarial information retrieval" was first coined in 2000 by Andrei Broder (then Chief Scientist at Alta Vista) during the Web plenary session at the TREC-9 conference. [D. Hawking and N. Craswell (2004), Very Large Scale Retrieval and Web Search (Preprint version)]
See also
* Artificial intelligence content detection
* Information retrieval
* Spamdexing
References
External links
* AIRWeb: series of workshops on Adversarial Information Retrieval on the Web
* Web Spam Challenge: competition for researchers on Web Spam Detection
* Web Spam Datasets: datasets for research on Web Spam Detection