An information filtering system is a system that removes
redundant or unwanted
information
Information is an Abstraction, abstract concept that refers to something which has the power Communication, to inform. At the most fundamental level, it pertains to the Interpretation (philosophy), interpretation (perhaps Interpretation (log ...
from an information stream using (semi)automated or computerized methods prior to presentation to a human user. Its main goal is the management of the
information overload and increment of the
semantic
Semantics is the study of linguistic Meaning (philosophy), meaning. It examines what meaning is, how words get their meaning, and how the meaning of a complex expression depends on its parts. Part of this process involves the distinction betwee ...
signal-to-noise ratio
Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. SNR is defined as the ratio of signal power to noise power, often expressed in deci ...
. To do this the user's profile is compared to some reference characteristics. These characteristics may originate from the information item (the content-based approach) or the user's social environment (the
collaborative filtering approach).
Whereas in
information transmission signal processing filters are used against
syntax
In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituenc ...
-disrupting noise on the bit-level, the methods employed in information filtering act on the semantic level.
The range of machine methods employed builds on the same principles as those for
information extraction. A notable application can be found in the field of email
spam filters. Thus, it is not only the
information explosion that necessitates some form of filters, but also inadvertently or maliciously introduced
pseudo-information.
On the presentation level, information filtering takes the form of user-preferences-based
newsfeeds, etc.
Recommender systems and
content discovery platforms are active information filtering systems that attempt to present to the user information items (
film
A film, also known as a movie or motion picture, is a work of visual art that simulates experiences and otherwise communicates ideas, stories, perceptions, emotions, or atmosphere through the use of moving images that are generally, sinc ...
,
television
Television (TV) is a telecommunication medium for transmitting moving images and sound. Additionally, the term can refer to a physical television set rather than the medium of transmission. Television is a mass medium for advertising, ...
,
music
Music is the arrangement of sound to create some combination of Musical form, form, harmony, melody, rhythm, or otherwise Musical expression, expressive content. Music is generally agreed to be a cultural universal that is present in all hum ...
,
book
A book is a structured presentation of recorded information, primarily verbal and graphical, through a medium. Originally physical, electronic books and audiobooks are now existent. Physical books are objects that contain printed material, ...
s,
news
News is information about current events. This may be provided through many different Media (communication), media: word of mouth, printing, Mail, postal systems, broadcasting, Telecommunications, electronic communication, or through the te ...
,
web page
A web page (or webpage) is a World Wide Web, Web document that is accessed in a web browser. A website typically consists of many web pages hyperlink, linked together under a common domain name. The term "web page" is therefore a metaphor of pap ...
s) the user is interested in. These systems add information items to the information flowing towards the user, as opposed to removing information items from the information flow towards the user. Recommender systems typically use
collaborative filtering approaches or a combination of the collaborative filtering and content-based filtering approaches, although content-based recommender systems do exist.
History
Before the advent of the
Internet
The Internet (or internet) is the Global network, global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a internetworking, network of networks ...
, there are already several methods of filtering information; for instance, governments may control and restrict the flow of information in a given country by means of formal or informal censorship.
On the other hand, we are going to talk about information filters if we refer to newspaper editors and journalists when they provide a service that selects the most valuable information for their clients, readers of books, magazines, newspapers,
radio
Radio is the technology of communicating using radio waves. Radio waves are electromagnetic waves of frequency between 3 hertz (Hz) and 300 gigahertz (GHz). They are generated by an electronic device called a transmitter connec ...
listeners and
TV viewers. This filtering operation is also present in schools and universities where there is a selection of information to provide assistance based on academic criteria to customers of this service, the students. With the advent of the Internet it is possible that anyone can publish anything he wishes at a low-cost. In this way, it increases considerably the less useful information and consequently the quality information is disseminated. With this problem, it began to devise new filtering with which we can get the information required for each specific topic to easily and efficiently.
Operation
A filtering system of this style consists of several tools that help people find the most valuable information, so the limited time you can dedicate to read / listen / view, is correctly directed to the most interesting and valuable documents. These filters are also used to organize and structure information in a correct and understandable way, in addition to group messages on the mail addressed. These filters are essential in the results obtained of the
search engines on the Internet. The functions of filtering improves every day to get downloading Web documents and more efficient messages.
Criterion
One of the criteria used in this step is whether the
knowledge
Knowledge is an Declarative knowledge, awareness of facts, a Knowledge by acquaintance, familiarity with individuals and situations, or a Procedural knowledge, practical skill. Knowledge of facts, also called propositional knowledge, is oft ...
is harmful or not, whether knowledge allows a better understanding with or without the concept. In this case the task of information filtering to reduce or eliminate the harmful information with knowledge.
Learning System
A system of learning content consists, in general rules, mainly of three basic stages:
# First, a system that provides solutions to a defined set of tasks.
# Subsequently, it undergoes assessment criteria which will measure the performance of the previous stage in relation to solutions of problems.
# Acquisition module which its output obtained knowledge that are used in the system solver of the first stage.
Future
Currently the problem is not finding the best way to filter information, but the way that these systems require to learn independently the information needs of users. Not only because they automate the process of
filtering but also the construction and adaptation of the filter. Some branches based on it, such as statistics, machine learning, pattern recognition and data mining, are the base for developing information filters that appear and adapt in base to experience. To carry out the learning process, part of the information has to be pre-filtered, which means there are positive and negative examples which we named training data, which can be generated by experts, or via
feedback
Feedback occurs when outputs of a system are routed back as inputs as part of a chain of cause and effect that forms a circuit or loop. The system can then be said to ''feed back'' into itself. The notion of cause-and-effect has to be handle ...
from ordinary users.
Error
As data is entered, the system includes new rules; if we consider that this data can generalize the training data information, then we have to evaluate the system development and measure the system's ability to correctly predict the categories of new
information
Information is an Abstraction, abstract concept that refers to something which has the power Communication, to inform. At the most fundamental level, it pertains to the Interpretation (philosophy), interpretation (perhaps Interpretation (log ...
. This step is simplified by separating the training data in a new series called "test data" that we will use to measure the error rate. As a general rule it is important to distinguish between types of errors (false positives and false negatives). For example, in the case on an aggregator of content for children, it doesn't have the same gravity to allow the passage of information not suitable for them, that shows violence or pornography, than the mistake to discard some appropriated information.
To improve the system to lower error rates and have these systems with learning capabilities similar to humans we require development of systems that simulate human cognitive abilities, such as
natural-language understanding, capturing meaning Common and other forms of advanced processing to achieve the semantics of information.
Fields of use
Nowadays, there are numerous techniques to develop information filters, some of these reach error rates lower than 10% in various experiments. Among these techniques there are decision trees, support vector machines, neural networks, Bayesian networks, linear discriminants, logistic regression, etc..
At present, these techniques are used in different applications, not only in the web context, but in thematic issues as varied as voice recognition, classification of telescopic astronomy or evaluation of financial risk.
See also
*
*
*
*
*
*
*
*
*
*
{{div col end
References
*Hanani, U., Shapira, B., Shoval, P. (2001) Information filtering: Overview of issues, research and systems. User Modeling and User-Adapted Interaction, 11, pp. 203–259.
* http://www.infoworld.com/d/developer-world/human-information-filter-813
External links
InfoworldIEEXplore
Information systems
Social information processing
Software by type
Filters