Email filtering is the processing of
email
Electronic mail (email or e-mail) is a method of exchanging messages ("mail") between people using electronic devices. Email was thus conceived as the electronic ( digital) version of, or counterpart to, mail, at a time when "mail" meant ...
to organize it according to specified criteria. The term can apply to the intervention of human intelligence, but most often refers to the automatic processing of messages at an
SMTP
The Simple Mail Transfer Protocol (SMTP) is an Internet standard communication protocol for electronic mail transmission. Mail servers and other message transfer agents use SMTP to send and receive mail messages. User-level email clients typical ...
server, possibly applying
anti-spam techniques
Various anti-spam techniques are used to prevent email spam (unsolicited bulk email).
No technique is a complete solution to the spam problem, and each has trade-offs between incorrectly rejecting legitimate email (false positives) as opposed to ...
. Filtering can be applied to incoming emails as well as to outgoing ones.
Depending on the calling environment, email filtering
software
Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work.
At the lowest programming level, executable code consists ...
can reject an item at the initial SMTP connection stage or pass it through unchanged for delivery to the user's mailbox. It is also possible to redirect the message for delivery elsewhere, quarantine it for further checking, modify it or 'tag' it in any other way.
Motivation
Common uses for mail filters include organizing incoming email and removal of
spam
Spam may refer to:
* Spam (food), a canned pork meat product
* Spamming, unsolicited or undesired electronic messages
** Email spam, unsolicited, undesired, or illegal email messages
** Messaging spam, spam targeting users of instant messaging ( ...
and
computer virus
A computer virus is a type of computer program that, when executed, replicates itself by modifying other computer programs and inserting its own code. If this replication succeeds, the affected areas are then said to be "infected" with a compu ...
es. Mailbox providers filter outgoing email to promptly react to spam surges that may result from compromised accounts. A less common use is to
inspect outgoing email at some companies to ensure that employees comply with appropriate policies and laws. Users might also employ a mail filter to prioritize messages, and to sort them into folders based on subject matter or other criteria.
Methods
Mailbox provider
A mailbox provider, mail service provider or, somewhat improperly, email service provider is a provider of email hosting. It implements email servers to send, receive, accept, and store email for other organizations or end users, on their behalf ...
s can also install mail filters in their
mail transfer agent
The mail or post is a system for physically transporting postcards, letters, and parcels. A postal service can be private or public, though many governments place restrictions on private systems. Since the mid-19th century, national postal syst ...
s as a service to all of their customers. Anti-virus, anti-spam, URL filtering, and authentication-based rejections are common filter types.
Corporations often use filters to protect their employees and their
information technology
Information technology (IT) is the use of computers to create, process, store, retrieve, and exchange all kinds of data . and information. IT forms part of information and communications technology (ICT). An information technology system (I ...
assets. A catch-all filter will "catch all" of the emails addressed to the domain that do not exist in the mail server - this can help avoid losing emails due to misspelling.
User
Ancient Egyptian roles
* User (ancient Egyptian official), an ancient Egyptian nomarch (governor) of the Eighth Dynasty
* Useramen, an ancient Egyptian vizier also called "User"
Other uses
* User (computing), a person (or software) using an ...
s, may be able to install separate programs (see links below), or configure filtering as part of their
email program
An email client, email reader or, more formally, message user agent (MUA) or mail user agent is a computer program used to access and manage a user's email.
A web application which provides message management, composition, and reception functio ...
(''email client''). In email programs, users can make personal, "manual" filters that then automatically filter mail according to the chosen criteria.
Inbound and outbound filtering
Mail filters can operate on inbound and outbound email traffic. Inbound email filtering involves scanning messages from the Internet addressed to users protected by the filtering system or for
lawful interception Lawful interception (LI) refers to the facilities in telecommunications and telephone networks that allow law enforcement agencies with court orders or other legal authorization to selectively wiretap individual subscribers. Most countries require ...
. Outbound email filtering involves the reverse - scanning email messages from local users before any potentially harmful messages can be delivered to others on the Internet. One method of outbound email filtering that is commonly used by
Internet service provider
An Internet service provider (ISP) is an organization that provides services for accessing, using, or participating in the Internet. ISPs can be organized in various forms, such as commercial, community-owned, non-profit, or otherwise private ...
s is
transparent SMTP proxy
SMTP proxies are specialized mail transfer agents (MTAs) that, similar to other types of proxy servers, pass SMTP sessions through to other MTAs without using the store-and-forward approach of a typical MTA. When an SMTP proxy receives a connectio ...
ing, in which email traffic is intercepted and filtered via a transparent proxy within the network. Outbound filtering can also take place in an
email server
Within the Internet email system, a message transfer agent (MTA), or mail transfer agent, or mail relay is software that transfers electronic mail messages from one computer to another using SMTP. The terms mail server, mail exchanger, and MX host ...
. Many corporations employ
data leak prevention
Data loss prevention (DLP) software detects potential data breaches/data ex-filtration transmissions and prevents them by monitoring, detecting and blocking sensitive data while ''in use'' (endpoint actions), ''in motion'' (network traffic), and ' ...
technology in their outbound
mail servers to prevent the leakage of sensitive information via email.
Customization
Mail filters have varying degrees of configurability. Sometimes they make decisions based on matching a
regular expression
A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or ...
. Other times, code may match keywords in the message body, or perhaps the email address of the sender of the message. More complex
control flow
In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an ''imper ...
and logic is possible with programming languages; this is typically implemented with a
data-driven programming
In computer programming, data-driven programming is a programming paradigm in which the program statements describe the data to be matched and the processing required rather than defining a sequence of steps to be taken. Standard examples of dat ...
language, such as
procmail
procmail is an email server software component — specifically, a message delivery agent (MDA). It was one of the earliest mail filter programs. It is typically used in Unix-like mail systems, using the mbox and Maildir storage formats.
procm ...
, which specifies conditions to match and actions to take on matching, which may involve further matching. Some more advanced filters, particularly anti-spam filters, use statistical
document classification
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") ...
techniques such as the
naive Bayes classifier
In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Baye ...
while others use
natural language processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to pro ...
to organize incoming emails. Image filtering can use complex image-analysis algorithms to detect skin-tones and specific body shapes normally associated with pornographic images.
Microsoft Outlook
Microsoft Outlook is a personal information manager software system from Microsoft, available as a part of the Microsoft Office and Microsoft 365 software suites. Though primarily an email client, Outlook also includes such functions as Calen ...
includes user-generated email filters called "rules".
[
]
See also
*
Bayesian spam filtering
Naive Bayes classifiers are a popular statistical technique of e-mail filtering. They typically use bag-of-words features to identify email spam, an approach commonly used in text classification.
Naive Bayes classifiers work by correlating th ...
*
CRM114
The CRM 114 Discriminator is a fictional piece of radio equipment in Stanley Kubrick's film ''Dr. Strangelove'' (1964), the destruction of which prevents the crew of a B-52 from receiving the recall code that would stop them from dropping their ...
*
Information filtering An information filtering system is a system that removes redundant or unwanted information from an information stream using (semi)automated or computerized methods prior to presentation to a human user. Its main goal is the management of the infor ...
*
Markovian discrimination
Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A sim ...
*
Outbound Spam Protection
*
Sieve (mail filtering language)
Sieve is a programming language that can be used for email filtering. It owes its creation to the CMU Cyrus Project, creators of Cyrus IMAP server.
The language is not tied to any particular operating system or mail architecture. It requires the u ...
is an RFC standard for describing mail filters
*
White list#Email whitelists
References
External links
*
{{spamming
Communication software
Email
Spam filtering