Address munging

Address munging is the practice of disguising an e-mail address to prevent it from being automatically collected by senders of unsolicited bulk e-mail. The disguise is meant to keep computer software from seeing the real address, or even any address at all, while still allowing a human reader to reconstruct the original and contact the author: an address such as "no-one@example.com" becomes "no-one at example dot com", for instance.

Any e-mail address posted in public is likely to be automatically collected by computer software used by bulk emailers (a process known as e-mail address scavenging). Addresses posted on webpages, Usenet or chat rooms are particularly vulnerable to this. Private e-mail sent between individuals is highly unlikely to be collected, but e-mail sent to a mailing list that is archived and made available via the web, or passed on to a Usenet news server and made public, may eventually be scanned and collected.
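The "at"/"dot" substitution described above can be sketched in a few lines of Python. This is only an illustration: the function names `munge` and `unmunge` are hypothetical, and the wording " at " and " dot " is taken from the example in the text rather than from any standard.

```python
def munge(address: str) -> str:
    """Spell out '@' and '.' in words so the text no longer looks like an address."""
    return address.replace("@", " at ").replace(".", " dot ")


def unmunge(text: str) -> str:
    """Undo munge(), mimicking what a human reader does mentally."""
    return text.replace(" at ", "@").replace(" dot ", ".")


print(munge("no-one@example.com"))  # no-one at example dot com
```

Because the mapping is fixed and reversible, the same few lines that help a human reader also show how easily software could undo this particular disguise.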


Disadvantages

Disguising addresses makes it more difficult for people to send e-mail to each other. Many see it as an attempt to fix a symptom rather than solve the real problem of e-mail spam, at the expense of causing problems for innocent users. In addition, some e-mail address harvesters have found ways to read munged addresses.

The use of address munging on Usenet is contrary to the recommendations of RFC 1036 governing the format of Usenet posts, which requires that a valid e-mail address be supplied in the From: field of the post. In practice, few people follow this recommendation strictly.

Disguising e-mail addresses in a systematic manner (for example, user[at]domain[dot]com) offers little protection. Any impediment also reduces a potential correspondent's willingness to take the extra trouble to email the user. In contrast, well-maintained e-mail filtering on the user's end does not drive away potential correspondents. No spam filter is 100% immune to false positives, however, and the same potential correspondent who would have been deterred by address munging may instead end up wasting time on long letters that merely disappear into junk mail folders.

For commercial entities, maintaining contact forms on web pages rather than publicizing e-mail addresses may be one way to ensure that incoming messages are relatively spam-free yet do not get lost. In conjunction with CAPTCHA fields, spam on such forms can be reduced to effectively zero, except that the inaccessibility of CAPTCHAs brings the same deterrent problems as address munging itself.
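The following Python sketch suggests why a systematic disguise offers little protection. It is not taken from any real harvester: the patterns `AT`, `DOT` and `ADDRESS` and the function `harvest` are assumptions chosen for illustration, and they cover only a few bracketed and spelled-out variants.

```python
import re

# Hypothetical normalisation rules; real harvesters may use different ones.
AT = re.compile(r"\s*[\[(]\s*at\s*[\])]\s*|\s+at\s+", re.IGNORECASE)
DOT = re.compile(r"\s*[\[(]\s*dot\s*[\])]\s*|\s+dot\s+", re.IGNORECASE)

# Deliberately simple address pattern, not a full RFC-compliant matcher.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def harvest(text: str) -> list[str]:
    """Undo common systematic disguises, then collect anything address-shaped."""
    normalised = DOT.sub(".", AT.sub("@", text))
    return ADDRESS.findall(normalised)


sample = "Write to user[at]domain[dot]com or no-one at example dot com."
print(harvest(sample))
```

A disguise that follows a predictable rule can be reversed by an equally predictable rule, whereas every variation costs a human correspondent extra effort.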


Alternatives

As an alternative to address munging, there are several "transparent" techniques that allow people to post a valid e-mail address but still make automated recognition and collection of the address difficult:
* "Transparent name mangling" involves replacing characters in the address with equivalent HTML references from the list of XML and HTML character entity references, e.g. the '@' gets replaced by either 'U+0040' or '&#64;' and the '.' gets replaced by either 'U+002E' or '&#46;', with the user knowing to take out the dashes.
* Posting all or part of the e-mail address as an image, for example "no-one" and "example.com" separated by an image of the at sign, sometimes with the alternative text specified as "@" to allow copy-and-paste, while still altering the address enough to remain outside the typical regular expressions of spambots.
* Using a client-side form with the e-mail address as a CSS3 animated text logo captcha and shrinking it to normal size using inline CSS.
* Posting an e-mail address with the order of characters jumbled and restoring the order using CSS.
* Building the link by client-side scripting.
* Using client-side scripting to produce a multi-key e-mail address encrypter.
* Using server-side scripting to run a contact form.

With client-side scripting, for instance, an address such as "user@example.com" can be assembled by a script when the page loads rather than appearing verbatim in the page source.

The use of images and scripts for address obfuscation can cause problems for people using screen readers and for users with disabilities, and ignores users of text browsers like lynx and w3m. Being transparent, however, these techniques do not disadvantage non-English speakers, who may be unable to understand the language-specific plain text that forms part of non-transparent munged addresses or the instructions that accompany them.

According to a 2003 study by the Center for Democracy and Technology, even the simplest "transparent name mangling" of e-mail addresses can be effective ("Why Am I Getting All This Spam? Unsolicited Commercial E-mail Research Six Month Report", March 2003).
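Two of the transparent techniques listed above, character references and CSS-restored character order, can be sketched in Python as a page-generation step. The function names are hypothetical, and the CSS properties used for the reversal (`unicode-bidi: bidi-override` with `direction: rtl`) are one possible choice, not one prescribed by the text.

```python
import html


def entity_encode(address: str) -> str:
    """Replace '@' and '.' with decimal HTML character references."""
    return address.replace("@", "&#64;").replace(".", "&#46;")


def reversed_span(address: str) -> str:
    """Store the address reversed in the markup; the CSS flips it back on display."""
    style = "unicode-bidi: bidi-override; direction: rtl;"
    return f'<span style="{style}">{html.escape(address[::-1])}</span>'


encoded = entity_encode("user@example.com")
print(encoded)                 # user&#64;example&#46;com
print(html.unescape(encoded))  # user@example.com, as a browser would render it
print(reversed_span("user@example.com"))
```

In both cases a browser shows the reader the ordinary address, while a harvester that scans the raw markup for the pattern "name@domain" does not find it. A harvester that decodes character references or applies the CSS would still recover the address.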


Examples

Common methods of disguising addresses include appending the reserved top-level domain .invalid (RFC 2606), which ensures that a real e-mail address is not inadvertently generated.
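Appending .invalid can be sketched as follows. The helper names `add_invalid` and `strip_invalid` are hypothetical, and placing the suffix at the very end of the domain is an assumption about how the method is usually applied.

```python
SUFFIX = ".invalid"  # top-level domain reserved by RFC 2606


def add_invalid(address: str) -> str:
    """Append the reserved .invalid top-level domain unless it is already there."""
    if address.lower().endswith(SUFFIX):
        return address
    return address + SUFFIX


def strip_invalid(address: str) -> str:
    """Recover the real address, as a human correspondent would."""
    if address.lower().endswith(SUFFIX):
        return address[: -len(SUFFIX)]
    return address


print(add_invalid("no-one@example.com"))  # no-one@example.com.invalid
```

Because .invalid can never be installed as a real top-level domain, mail sent to the munged address cannot reach some unrelated third party by accident.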


See also

* Internet bot
* Netiquette