Amavis

Amavis is an open-source content filter for electronic mail. It implements mail message transfer, decoding, some processing and checking, and interfaces with external content filters to provide protection against spam, viruses and other malware. It can be considered an interface between a mailer (MTA, Mail Transfer Agent) and one or more content filters.

''Amavis'' can be used to:
* detect viruses, spam, banned content types or syntax errors in mail messages
* block, tag, redirect (using sub-addressing), or forward mail depending on its content, origin or size
* quarantine (and release), or archive mail messages to files, to mailboxes, or to a relational database
* sanitize passed messages using an external sanitizer
* generate DKIM signatures
* verify DKIM signatures and provide DKIM-based whitelisting

Notable features:
* provides SNMP statistics and status monitoring using an extensive MIB with more than 300 variables
* provides a structured event log in JSON format
* supports the IPv6 protocol in interfacing, and IPv6 address forms in the mail header section
* properly honors per-recipient settings even in multi-recipient messages, while scanning a message only once
* supports international email (RFC 6530, SMTPUTF8, EAI, IDN)

A common mail filtering installation with ''Amavis'' consists of Postfix as the MTA, SpamAssassin as a spam classifier, and ClamAV for anti-virus protection, all running under a Unix-like operating system. Many other virus scanners (about 30) and some other spam scanners (CRM114, DSPAM, Bogofilter) are supported too, as well as some other MTAs.


Interfacing topology

Three topologies for interfacing with an MTA are supported. The ''amavisd'' process can be sandwiched between two instances of an MTA, yielding a classical after-queue mail filtering setup. Alternatively, ''amavisd'' can be used as an SMTP proxy filter in a before-queue filtering setup. Finally, the ''amavisd'' process can be consulted to provide mail classification but not to forward a mail message by itself, in which case the consulting client remains in charge of mail forwarding. This last approach is used in a Milter setup (with some limitations), or with a historical client program ''amavisd-submit''.

Since version 2.7.0 a before-queue setup is preferred, as it allows a mail message transfer to be rejected during the SMTP session with the sending client. In an after-queue setup, filtering takes place after a mail message has already been received and enqueued by an MTA. The mail filter can then no longer reject the message; it can only deliver it (possibly tagged), discard it, or generate a non-delivery notification, which can cause unwanted backscatter when a message with a fake sender address is bounced.

A disadvantage of a before-queue setup is that it requires resources (CPU, memory) proportional to the current (peak) mail transfer rate, unlike an after-queue setup, where some delay is acceptable and resource usage corresponds to the average mail transfer rate. With the introduction of the option ''smtpd_proxy_options=speed_adjust'' in Postfix 2.7.0, the resource requirements for a before-queue content filter have been much reduced.

In some countries legislation does not permit mail filtering to discard a mail message once it has been accepted by an MTA. This rules out an after-queue filtering setup that discards or quarantines messages, but leaves the possibility of delivering (possibly tagged) messages, or of rejecting them in a before-queue setup (SMTP proxy or milter).
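The peak-versus-average resource argument above can be made concrete with a rough sizing estimate. The sketch below is not part of Amavis; it applies Little's law (concurrent work = arrival rate × time per item) to estimate how many filter processes each setup would need. The function name and all figures are hypothetical and chosen only for illustration.

```python
import math


def workers_needed(arrival_rate_per_s: float, seconds_per_message: float) -> int:
    """Estimate concurrent filter processes via Little's law (L = rate * time)."""
    if arrival_rate_per_s < 0 or seconds_per_message < 0:
        raise ValueError("rate and processing time must be non-negative")
    # At least one process is always needed, even for an idle system.
    return max(1, math.ceil(arrival_rate_per_s * seconds_per_message))


# Hypothetical figures, not measurements from any real installation.
AVERAGE_RATE = 2.0         # messages per second, averaged over a day
PEAK_RATE = 15.0           # messages per second during the busiest burst
SECONDS_PER_MESSAGE = 1.5  # filtering time for one message

# An after-queue filter can let the MTA queue absorb bursts, so it is
# sized for the average rate; a before-queue filter must keep up with the peak.
after_queue_workers = workers_needed(AVERAGE_RATE, SECONDS_PER_MESSAGE)
before_queue_workers = workers_needed(PEAK_RATE, SECONDS_PER_MESSAGE)

print(f"after-queue (average rate): {after_queue_workers} processes")
print(f"before-queue (peak rate):   {before_queue_workers} processes")
```

The estimate ignores queueing delay and variance in processing time, so it is a lower bound for planning rather than a tuning recipe.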


Interfacing protocols

''Amavis'' can receive mail messages from an MTA over one or more sockets of protocol families PF_INET (IPv4), PF_INET6 (IPv6) or PF_LOCAL (Unix domain socket), via the protocols SMTP or LMTP. Alternatively, a simple private protocol, AM.PDP, can be used with a helper program like ''amavisd-milter'' to interface with milters. On the output side, SMTP or LMTP can be used to pass a message to a back-end MTA instance or to an LDA, or a message can be passed to a spawned process over a Unix pipe. When SMTP or LMTP is used, a session can optionally be encrypted using the STARTTLS extension to the protocol (RFC 3207). SMTP Command Pipelining (RFC 2920) is supported in client and server code.
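As an illustration of what command pipelining means on the wire, the following sketch builds the envelope commands for one message as a single group that a client may send without waiting for individual replies. It is not taken from the Amavis code; the function names and addresses are hypothetical, and the sketch assumes the server has already advertised the PIPELINING extension in its EHLO response.

```python
from typing import Sequence


def pipelined_envelope(sender: str, recipients: Sequence[str]) -> bytes:
    """Build one pipelined command group: MAIL FROM, RCPT TO..., DATA."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    commands = [f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    # DATA must be the last command in the group: the client has to read
    # the replies (ending with 354) before it may send the message body.
    commands.append("DATA")
    return "".join(cmd + "\r\n" for cmd in commands).encode("ascii")


def expected_reply_count(recipients: Sequence[str]) -> int:
    """One reply per command: MAIL FROM, each RCPT TO, and DATA."""
    return len(recipients) + 2


# Placeholder addresses for illustration only.
rcpts = ["bob@example.net", "carol@example.org"]
group = pipelined_envelope("alice@example.com", rcpts)
print(group.decode("ascii"), end="")
print("replies to read:", expected_reply_count(rcpts))
```

Without pipelining the same exchange costs one network round trip per command; with it, the whole group costs roughly one.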


Interfacing with SpamAssassin

When spam scanning is enabled, the ''amavisd'' daemon process is conceptually very similar to the ''spamd'' process of the SpamAssassin project. In both cases forked child processes call SpamAssassin Perl modules directly, hence their performance is similar. The main difference is in the protocols used: ''Amavis'' typically speaks the standard SMTP protocol to an MTA, while in the spamc/spamd case an MTA typically spawns a ''spamc'' program and passes a message to it over a Unix pipe; the ''spamc'' process then transfers the message to the ''spamd'' daemon using a private protocol, and ''spamd'' calls the SpamAssassin Perl modules.


Design priorities

The design priorities of ''amavisd-new'' (from here on just called ''Amavis'') are: reliability, security, adherence to standards, performance, and functionality.


Reliability

To ensure that no mail message is lost due to unexpected events such as I/O failures, resource depletion or unexpected program termination, the ''amavisd'' program meticulously checks the completion status of every system call and I/O operation. Unexpected events are logged if at all possible, and handled with several layers of event handling.

''Amavis'' never takes responsibility for delivery of a mail message away from an MTA: the final success status is reported to the MTA only after the message has been passed on to the back-end MTA instance and its reception was confirmed. In case of any fatal failure during processing or transfer of a message, the message just stays in the queue of the front-end MTA instance, to be retried later. This approach also covers unexpected host failures and crashes of the ''amavisd'' process or one of its components.

The use of program resources such as memory, file descriptors, disk space and the creation of subprocesses is controlled. Large mail messages are not kept in memory, so the available memory size does not impose a limit on the size of mail messages that can be processed, and memory is not wasted unnecessarily.
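The rule that success is reported only after the back end has confirmed reception can be sketched as follows. This is a simplified illustration, not the Amavis implementation: `relay` and `forward` are hypothetical names, and the reply texts are made up. Only the pattern matters: any failure turns into a temporary (4xx) reply, so the message stays in the front-end MTA queue and is retried later.

```python
from typing import Callable


def relay(message: bytes, forward: Callable[[bytes], bool]) -> str:
    """Return the SMTP reply to give the front-end MTA for one message.

    `forward` is a hypothetical callable that hands the message to the
    back-end MTA and returns True only once reception was confirmed.
    """
    try:
        confirmed = forward(message)
    except Exception:
        # Any failure is reported as temporary, so the front-end MTA
        # keeps the message queued and retries later.
        return "451 4.3.0 Temporary failure, try again later"
    if not confirmed:
        return "451 4.3.0 Back end did not confirm reception"
    # Success is reported only after the back end has taken over.
    return "250 2.0.0 Ok, accepted by back end"


def failing_backend(message: bytes) -> bool:
    """Simulates an unreachable back-end MTA."""
    raise ConnectionError("back end unreachable")


sample = b"Subject: test\r\n\r\nhello\r\n"
print(relay(sample, lambda m: True))
print(relay(sample, failing_backend))
```

Because the filter never acknowledges a message it has not handed over, a crash at any point leaves exactly one responsible party: the front-end MTA.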


Security

A great deal of attention is given to security, as required by handling potentially malicious, nonstandard or just garbled data in mail messages coming from untrusted sources. The process that handles mail messages runs with reduced privileges under a dedicated user ID. Optionally it can run chroot-ed.

The risk of buffer overflows and memory allocation bugs is largely avoided by implementing all protocol handling and mail processing in Perl, which handles dynamic memory management transparently. Care is taken that the content of processed messages does not inadvertently propagate to the system. Perl provides an additional safety net with its marking of tainted data originating from the wild, and ''Amavis'' is careful to put this Perl feature to good use by avoiding automatic untainting of data (''use re "taint"'') and only untainting it explicitly at strategic points, late in the data flow.

''Amavis'' can use several external programs to enhance its functionality: de-archivers, de-compressors, virus scanners and spam scanners. As these programs are often implemented in languages like C or C++, there is a risk that a mail message passed to one of them can cause it to fail or even open a security hole. The risk is limited by running these programs under an unprivileged user ID, and possibly chroot-ed. Nevertheless, external programs such as unmaintained de-archivers should be avoided. The use of these external programs is configurable, and they can be disabled selectively or as a group (like all decoders or all virus scanners).
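The idea of untainting data explicitly and late can be illustrated outside Perl. Python has no taint mode, so the sketch below is only an analogy for the mechanism described above, not a picture of how Amavis works: the `Tainted` wrapper, the `quarantine_path` helper, the allowed-name pattern and the directory are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """Wraps a value from an untrusted source so it cannot be used as a str by accident."""

    raw: str

    def untaint(self, pattern: str) -> str:
        """Release the value only if all of it matches an explicit allow pattern."""
        if re.fullmatch(pattern, self.raw) is None:
            raise ValueError("value does not match the allowed pattern")
        return self.raw


def quarantine_path(directory: str, name: Tainted) -> str:
    """Build a file path from an untrusted name."""
    # Untaint late, right where the value is about to reach the file system.
    safe_name = name.untaint(r"[A-Za-z0-9_-]{1,64}")
    return f"{directory}/{safe_name}"


print(quarantine_path("/var/quarantine", Tainted("msg-0001")))

try:
    quarantine_path("/var/quarantine", Tainted("../../etc/passwd"))
except ValueError as err:
    print("rejected:", err)
```

Keeping the value wrapped until the last moment means every place where untrusted data reaches the system has a visible, reviewable check.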


Performance

Despite being implemented in Perl, an interpreted programming language, ''Amavis'' itself is not slow. The good performance of the functionality implemented by ''Amavis'' itself (not counting external components) is achieved by dealing with data in large chunks (e.g. not line by line), by avoiding unnecessary data copying, by optimizing frequently traversed code paths, by using suitable data structures and algorithms, as well as by some low-level optimizations. Bottlenecks are detected during development by profiling code and by benchmarking. A detailed timing report in the log can help recognize bottlenecks in a particular installation.

Certain external modules or programs, like SpamAssassin or some command-line virus scanners, can be very slow. When they are used they account for the vast majority of elapsed time and processing resources, making the resources used by ''Amavis'' itself proportionally quite small. Components like external mail decoders, virus scanners and spam scanners can each be selectively disabled if they are not needed. What remains is functionality implemented by ''Amavis'' itself, like transferring a mail message from and to an MTA using the SMTP or LMTP protocol, checking mail header section validity, checking for banned mail content types, and verifying and generating DKIM signatures. As a consequence, mail processing tasks like DKIM signing and verification (with other mail checking disabled) can be exceptionally fast and can rival implementations in compiled languages. Even full checks using a fast virus scanner but with spam scanning disabled can be surprisingly fast.
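Processing data in large chunks rather than line by line can be sketched as below. This is a generic illustration, not Amavis code; the function name and the 64 KiB block size are assumptions made for the example. Memory use is bounded by the block size, so the same loop handles a message of any length.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # hypothetical block size, not taken from Amavis


def copy_in_chunks(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst in fixed-size blocks and return the number of bytes copied."""
    total = 0
    while True:
        block = src.read(chunk_size)
        if not block:  # an empty read means end of stream
            break
        dst.write(block)
        total += len(block)
    return total


# In-memory streams stand in for a socket and a temporary file.
message = b"Subject: test\r\n\r\n" + b"x" * 200_000
source, sink = io.BytesIO(message), io.BytesIO()
copied = copy_in_chunks(source, sink)
print(copied == len(message), sink.getvalue() == message)
```

Compared with a per-line loop, the number of read and write calls drops from one per line to one per block, which is where most of the saving comes from in an interpreted language.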


Adherence to standards

Implementation of protocols and message structures closely follows the applicable standards, such as RFC 5322, RFC 5321, RFC 2033, RFC 3207, RFC 2045, RFC 2046, RFC 2047, RFC 3461, RFC 3462, RFC 3463, RFC 3464, RFC 4155, RFC 5965, RFC 6376, RFC 5451, RFC 6008, and RFC 4291. In several cases functionality was re-implemented in the ''Amavis'' code even though a public (CPAN) Perl module exists, because the module lacks attention to detail in following a standard or lacks sufficient checking and handling of errors.


License

Amavis is licensed under the GPLv2 license. This applies to the current code as well as to historical branches. An exception are some of the supporting programs (such as monitoring and statistics reporting), which are covered by the New BSD License.


The project

The project started in 1997 as a Unix shell script to detect and block e-mail messages containing a virus. It was intended to block viruses at the MTA (mail transfer agent) or LDA (local delivery agent) stage, running on a Unix-like platform, complementing other virus protection mechanisms running on end-user personal computers.

Next the tool was re-implemented as a Perl program, which later evolved into a daemonized process. A dozen developers took turns during the first five years of the project, developing several variants while keeping a common goal, the project name and some of the development infrastructure.

From December 2008 until 2018-10-09 the only active branch was officially ''amavisd-new'', which had been developed and maintained by Mark Martinec since March 2002. This was agreed in private correspondence between the developers at the time: Christian Bricart, Lars Hecking, Hilko Bengen, Rainer Link and Mark Martinec. The project name ''Amavis'' is largely interchangeable with the name of the ''amavisd-new'' branch.

Much functionality has been added through the years, such as protection against spam and other unwanted content, besides the original virus protection. The focus is kept on reliability, security, adherence to standards and performance.

The domain ''amavis.org'' used by the project was registered in 1998 by Christian Bricart, one of the early developers, who still maintains the domain name registration. The domain is now entirely dedicated to the only active branch. The project mailing list was moved from SourceForge to amavis.org in March 2011, and is hosted by Ralf Hildebrandt and Patrick Ben Koetter. Until the handover in 2018, the project web page and the main distribution site were located at the Jožef Stefan Institute, Ljubljana, Slovenia, where most of the development took place between 2002 and 2018.


Change of Project Leaders Announcement

On October 9, 2018, Mark Martinec announced on the general support and discussion mailing list his retirement from the project, and that Patrick Ben Koetter would continue as the new project leader. Patrick Ben Koetter subsequently announced the migration of the source code to a public GitLab repository and his plans for the next steps in the project's development.


Branches and the project name

Through the history of the project the name of the project and of its branches has varied somewhat. Initially the project name was spelled ''AMaViS'' (A Mail Virus Scanner), introduced by Christian Bricart. With the rewrite to Perl the program was named ''Amavis-perl''. Daemonized versions were initially distributed under the name ''amavisd-snapshot'' and then as ''amavisd''. A modular rewrite by Hilko Bengen was called ''Amavis-ng''.

In March 2002 the ''amavisd-new'' branch was introduced by Mark Martinec, initially as a patch against ''amavisd-snapshot-20020300''. This later evolved into a self-contained project, which is now the only surviving and actively maintained branch. Nowadays the project name is preferably spelled ''Amavis'' (while the name of the program itself is ''amavisd''). The name ''Amavis'' is now mostly interchangeable with ''amavisd-new''.


History of the project


shell program

* 1997 (original code by Mogens Kjær - Carlsberg Laboratory, modified by Jürgen Quade) initial, not released officially
* 1998-01-17 AMaViS 0.1 (Christian Bricart) AMaViS, first official release
* 1998-01-28 AMaViS 0.1.1
* 1998-12-08 AMaViS 0.2.0-pre1
* 1999-02-25 AMaViS 0.2.0-pre2
* 1999-03-29 AMaViS 0.2.0-pre3
* 1999-03-31 AMaViS 0.2.0-pre4
* 1999-07-19 AMaViS 0.2.0-pre5
* 1999-07-20 AMaViS 0.2.0-pre6
* 2000-10-31 AMaViS 0.2.1 (Christian Bricart, Rainer Link, Chris Mason)


Perl program

* 2000-01 Amavis-perl (Chris Mason)
* 2000-08 Amavis-perl-8
* 2000-12 Amavis-perl-10
* 2001-04 Amavis-perl-11 (split to amavisd)
* 2003-03-07 Amavis-0.3.12 (Lars Hecking)


Perl daemon: amavisd

* 2001-01 daemonization (Geoff Winkless)
* 2001-04 amavisd-snapshot-20010407 (Lars Hecking)
* 2001-07 amavisd-snapshot-20010714
* 2002-03 amavisd-snapshot-20020300 (split to amavisd-new)
* 2003-03-03 amavisd-0.1


Perl, modular re-design

(Hilko Bengen)
* 2002-03 amavis-ng-0.1
* 2003-03 amavis-ng-0.1.6.2


amavisd-new

(Mark Martinec)
* 2002-03-30 amavisd-new, pre-forked, Net::Server
* 2002-05-17
* 2002-06-30 packages, SQL lookups
* 2002-11-16 integrated - one file
* 2002-12-27
* 2003-03-14 LDAP lookups
* 2003-06-16
* 2003-08-25 p5
* 2003-11-10 p6 @*_maps
* 2004-01-05 p7
* 2004-03-09 p8
* 2004-04-02 p9
* 2004-06-29 p10
* 2004-07-01 2.0 policy banks, IPv6 address formats
* 2004-08-15 2.1.0 amavisd-nanny monitoring utility
* 2004-09-06 2.1.2
* 2004-11-02 2.2.0
* 2004-12-22 2.2.1
* 2005-04-24 2.3.0 @decoders, per-recipient banning rules
* 2005-05-09 2.3.1
* 2005-06-29 2.3.2
* 2005-08-22 2.3.3
* 2006-04-02 2.4.0 DSN in SMTP, %*_by_ccat
* 2006-05-08 2.4.1
* 2006-06-27 2.4.2 pen pals, SQL logging and quarantine
* 2006-09-30 2.4.3
* 2006-11-20 2.4.4
* 2007-01-30 2.4.5
* 2007-04-23 2.5.0 blocking content categories, rewritten SMTP client
* 2007-05-31 2.5.1 amavisd-requeue
* 2007-06-27 2.5.2
* 2007-12-12 2.5.3
* 2008-03-12 2.5.4
* 2008-04-23 2.6.0 DKIM, bounce killer, TLS
* 2008-06-29 2.6.1
* 2008-12-12 Amavis is amavisd-new
* 2008-12-15 2.6.2
* 2009-04-22 2.6.3 support for CRM114 and DSPAM, truncation
* 2009-06-25 2.6.4 monitoring over SNMP
* 2010-04-25 2.7.0-pre4
* 2011-02-03 2.7.0-pre14
* 2011-03-07 mailing list moved from SourceForge to amavis.org
* 2011-04-07 2.6.5
* 2011-05-19 2.6.6
* 2011-06-01 2.7.0 pre-queue improvements, speedup
* 2012-04-29 2.7.1
* 2012-06-30 2.7.2
* 2012-06-30 2.8.0 use ØMQ instead of BDB, performance optimizations
* 2013-04-27 2.8.1-rc1
* 2013-06-28 2.8.1 can use Redis for pen pals storage
* 2013-09-04 2.8.2-rc1 (2.8.2 not released)
* 2014-05-09 2.9.0 structured log in JSON format, IP address auto-reputation
* 2014-06-27 2.9.1
* 2014-10-22 2.10.0 Internationalized Email (RFC 6530, SMTPUTF8, EAI, IDN)
* 2014-10-22 2.10.1
* 2016-04-26 2.11.0
* 2018-10-09 2.11.1 minor updates, just prior to migration to a GitLab repository


See also

* List of antivirus software
* SpamAssassin, a popular open source spam classifier

