Clustering spam campaigns with fuzzy hashing

Jianxing Chen, Romain Fontugne, Akira Kato, Kensuke Fukuda

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)


Identifying spamming botnets is essential to defeat spammers and reduce the harm caused by spam emails. The first step in uncovering these botnets is the identification of spam campaigns. Simple methods that look for common identifiers in emails, such as URLs or email addresses, are ineffective because of obfuscation techniques such as URL shortening. In this paper, we propose a new method based on fuzzy hashing to cluster spam emails with common goals into the same spam campaign. Fuzzy hashing allows us to identify emails with similar contents even when the usual identifiers are obfuscated. Using the proposed method, we process a three-year-long dataset of 540,000 spam emails. The efficiency of the proposed method is assessed by inspecting the characteristics of the top 100 campaigns found. Finally, we present typical behaviors of the uncovered spam campaigns and the corresponding botnets.
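The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a fuzzy hashing algorithm, so the sketch swaps in a simple stand-in, a digest made of hashed character n-grams compared with Jaccard similarity, in place of a real fuzzy hash. The names `digest`, `similarity`, and `cluster`, the URL masking step, and the parameters `n`, `keep`, and `threshold` are all assumptions made for illustration.

```python
import hashlib
import re


def digest(text, n=5, keep=64):
    """Hypothetical similarity digest: the `keep` smallest hashes of character n-grams."""
    # Assumption: mask URLs and collapse whitespace so shortened links
    # do not make otherwise identical messages look different.
    norm = re.sub(r"https?://\S+", "<url>", text.lower())
    norm = " ".join(norm.split())
    grams = {norm[i:i + n] for i in range(max(1, len(norm) - n + 1))}
    hashes = sorted(
        int(hashlib.sha1(g.encode("utf-8")).hexdigest()[:8], 16) for g in grams
    )
    return frozenset(hashes[:keep])


def similarity(a, b):
    """Jaccard similarity between two digests, a rough score in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(emails, threshold=0.5):
    """Greedy single-pass clustering of email bodies into candidate campaigns.

    Each email joins the first campaign whose representative digest is at
    least `threshold` similar; otherwise it starts a new campaign.
    Returns a list of campaigns, each a list of indices into `emails`.
    """
    campaigns = []  # pairs of (representative digest, member indices)
    for idx, body in enumerate(emails):
        d = digest(body)
        for rep, members in campaigns:
            if similarity(d, rep) >= threshold:
                members.append(idx)
                break
        else:
            campaigns.append((d, [idx]))
    return [members for _, members in campaigns]
```

Under these assumptions, two messages that differ only in their shortened URL produce the same digest and land in the same campaign, while an unrelated message starts its own. A production system would more likely use an established fuzzy hashing library and a clustering strategy that does not depend on input order.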

Original language: English
Title of host publication: Asian Internet Engineering Conference, AINTEC 2014
Publisher: Association for Computing Machinery
Number of pages: 8
ISBN (Electronic): 9781450332514
Publication status: Published - 2014 Nov 26
Event: 10th Asian Internet Engineering Conference, AINTEC 2014 - Chiang Mai, Thailand
Duration: 2014 Nov 26 to 2014 Nov 28

Publication series

Name: Asian Internet Engineering Conference, AINTEC 2014


Other: 10th Asian Internet Engineering Conference, AINTEC 2014
City: Chiang Mai


Keywords

  • Botnet
  • Clustering
  • Spam

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Software

