Image-based spam, or image spam, is a kind of email spam in which the textual spam message is embedded into images that are then attached to spam emails (Giorgio Fumera, Ignazio Pillai, Fabio Roli, Journal of Machine Learning Research (special issue on Machine Learning in Computer Security), vol. 7, pp. 2699-2720, 12/2006; Battista Biggio, Giorgio Fumera, Ignazio Pillai, Fabio Roli, Volume 32, Issue 10, 15 July 2011, Pages 1436-1446, ISSN 0167-8655). Since most email clients display the image file directly to the user, the spam message is conveyed as soon as the email is opened; there is no need to open the attached image file separately.


Technique

The goal of image spam is to circumvent the analysis of the email’s textual content performed by most spam filters (e.g., SpamAssassin, RadicalSpam, Bogofilter, SpamBayes). For the same reason, spammers often add some “bogus” text to the email together with the attached image, namely a number of words that are most likely to appear in legitimate emails and not in spam. The earlier image spam emails contained spam images in which the text was clean and easily readable, as shown in Fig. 1.
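
The effect described above can be illustrated with a minimal sketch based on Python's standard `email` package. It is not how any of the filters named above is implemented; the helper names `visible_text` and `image_parts`, the subject line, the body text, the file name, and the placeholder image bytes are all hypothetical. The sketch builds a message with harmless-looking body text and an image attachment, then collects only the text parts, which is all a purely textual analysis would get to inspect.

```python
from email.message import EmailMessage


def visible_text(msg: EmailMessage) -> str:
    """Collect the plain-text parts, i.e. what a text-only filter would analyse."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return "\n".join(parts)


def image_parts(msg: EmailMessage) -> list:
    """Return the attached image parts, which a text-only filter never reads."""
    return [p for p in msg.walk() if p.get_content_maintype() == "image"]


# Hypothetical message: "bogus" legitimate-looking text plus an image attachment.
msg = EmailMessage()
msg["Subject"] = "Meeting notes"
msg.set_content("Thanks for the meeting notes, see you on Monday.")

# Placeholder bytes standing in for an image that would carry the real message.
fake_image = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
msg.add_attachment(fake_image, maintype="image", subtype="png", filename="offer.png")

text_seen_by_filter = visible_text(msg)
attached_images = image_parts(msg)
```

Here `text_seen_by_filter` contains only the legitimate-looking sentence, while the attachment listed in `attached_images` is left unexamined unless the filter also analyses images.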


Detection

Consequently, optical character recognition (OCR) tools were used to extract the text embedded into spam images, which could then be processed together with the text in the email’s body by the spam filter or, more generally, by more sophisticated text categorization techniques. Further, signatures (e.g., MD5 hashes) were generated to easily detect and block already known spam images. Spammers in turn reacted by applying obfuscation techniques to spam images, similar to those used in CAPTCHAs, both to prevent the embedded text from being read by OCR tools and to mislead signature-based detection. Some examples are shown in Fig. 2.

This raised the issue of improving image spam detection using computer vision and pattern recognition techniques (Aradhye, H., Myers, G., Herson, J. A., 2005. Image analysis for efficient categorization of image-based spam e-mail. In: Proc. Int. Conf. on Document Analysis and Recognition, pp. 914–918; Dredze, M., Gevaryahu, R., Elias-Bachrach, A., 2007. Learning fast classifiers for image spam. In: Proc. 4th Conf. on Email and Anti-Spam (CEAS)).

In particular, several authors investigated the possibility of recognizing image spam with obfuscated images by using generic low-level image features (such as number of colours, prevalent colour coverage, image aspect ratio, and text area), image metadata, etc. (Wu, C.-T., Cheng, K.-T., Zhu, Q., Wu, Y.-L., 2005. Using visual features for anti-spam filtering. In: Proc. IEEE Int. Conf. on Image Processing, Vol. III, pp. 501–504; Liu, Q., Qin, Z., Cheng, H., Wan, M., 2010. Efficient modeling of spam images. In: Int. Symp. on Intelligent Information Technology and Security Informatics. IEEE Computer Society, pp. 663–666; see Biggio et al., 2011, for a comprehensive survey).

Notably, some authors also tried detecting the presence of text in attached images with artifacts denoting an adversarial attempt to obfuscate it (Battista Biggio, Giorgio Fumera, Ignazio Pillai, Fabio Roli, "Image Spam Filtering Using Visual Information", 14th Int. Conf. on Image Analysis and Processing (ICIAP 2007), Modena, Italy, IEEE Computer Society, pp. 105–110, 10/09/2007; Fabio Roli, Battista Biggio, Giorgio Fumera, Ignazio Pillai, Riccardo Satta, "Image Spam Filtering by Detection of Adversarial Obfuscated Text", Workshop on Neural Information Processing Systems (NIPS), Whistler, British Columbia, Canada, 08/12/2007; Battista Biggio, Giorgio Fumera, Ignazio Pillai, Fabio Roli, "Improving Image Spam Filtering Using Image Text Features", Fifth Conference on Email and Anti-Spam (CEAS 2008), Mountain View, CA, USA, 21/08/2008).
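
Two of the ideas in this section, signature matching and low-level image features, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any of the cited papers: the function names `md5_signature` and `low_level_features`, the example byte strings, and the toy 4x2 pixel grid are hypothetical, and a real system would first decode the image file into pixels.

```python
import hashlib
from collections import Counter


def md5_signature(image_bytes: bytes) -> str:
    """Signature of an image file: the MD5 digest of its raw bytes."""
    return hashlib.md5(image_bytes).hexdigest()


def low_level_features(pixels, width: int, height: int) -> dict:
    """Compute three generic features from a row-major list of (r, g, b) pixels."""
    counts = Counter(pixels)
    _, top_count = counts.most_common(1)[0]
    return {
        "num_colours": len(counts),
        "prevalent_colour_coverage": top_count / len(pixels),
        "aspect_ratio": width / height,
    }


# Signature matching: an exact copy is caught, a one-byte variation is not.
known_spam_image = b"example spam image bytes"  # hypothetical file content
blocklist = {md5_signature(known_spam_image)}

obfuscated_image = known_spam_image + b"\x00"   # trivially altered copy
exact_hit = md5_signature(known_spam_image) in blocklist
obfuscated_hit = md5_signature(obfuscated_image) in blocklist

# Low-level features on a toy 4x2 image: six white pixels, two black pixels.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
pixels = [WHITE] * 6 + [BLACK] * 2
features = low_level_features(pixels, width=4, height=2)
```

The altered copy no longer matches the blocklist, which is why obfuscation can mislead signature-based detection, whereas features such as colour count and prevalent colour coverage change little under small perturbations and can be fed to a classifier.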


History

Image spam started in 2004 and peaked at the end of 2006, when over 50% of spam was image spam. In mid-2007 it started declining, and it practically disappeared in 2008 (IBM X-Force® 2010, Mid-Year Trend and Risk Report, August 2010). The reason behind this phenomenon is not easy to understand. The decline of image spam can probably be attributed both to the improvement of the proposed countermeasures (e.g., fast image spam detectors based on visual features) and to the higher bandwidth requirements of image spam, which force spammers to send a smaller amount of spam over a given time interval. Both factors might have made image spam less convenient for spammers than other kinds of spam. Nevertheless, at the end of 2011 a rebirth of image spam was detected, and image spam reached 8% of all spam traffic, albeit for a short period (IBM X-Force® 2012, Mid-Year Trend and Risk Report, September 2012).
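
The bandwidth argument can be made concrete with a small calculation. All figures below (uplink speed and message sizes) are illustrative assumptions and do not come from the cited reports; the function name `messages_per_hour` is likewise hypothetical. The point is only that, for a fixed uplink, the sending rate falls in proportion to the message size.

```python
def messages_per_hour(uplink_bytes_per_second: float, message_size_bytes: float) -> float:
    """Upper bound on messages sent in one hour over a fixed uplink."""
    return uplink_bytes_per_second * 3600 / message_size_bytes


# Assumed values, for illustration only.
UPLINK = 125_000      # bytes per second, i.e. a 1 Mbit/s uplink
TEXT_SIZE = 5_000     # bytes for a text-only spam email
IMAGE_SIZE = 25_000   # bytes for an email carrying a spam image

text_rate = messages_per_hour(UPLINK, TEXT_SIZE)
image_rate = messages_per_hour(UPLINK, IMAGE_SIZE)
slowdown = text_rate / image_rate
```

Under these assumptions the same uplink delivers five times fewer image spam emails than text-only ones in the same time interval.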


See also

* Anti-spam techniques
* Email spam

