Hashcash is a

proof-of-work system Proof of work (PoW) is a form of cryptographic proof in which one party (the ''prover'') proves to others (the ''verifiers'') that a certain amount of a specific computational effort has been expended. Verifiers can subsequently confirm this ex ...

used to limit

email spam Email spam, also referred to as junk email, spam mail, or simply spam, is unsolicited messages sent in bulk by email (spamming). The name comes from a Monty Python sketch in which the name of the canned pork product Spam is ubiquitous, unavoida ...

and

denial-of-service attack In computing, a denial-of-service attack (DoS attack) is a cyber-attack in which the perpetrator seeks to make a machine or network resource unavailable to its intended users by temporarily or indefinitely disrupting services of a host connect ...

s, and more recently has become known for its use in

bitcoin Bitcoin ( abbreviation: BTC; sign: ₿) is a decentralized digital currency that can be transferred on the peer-to-peer bitcoin network. Bitcoin transactions are verified by network nodes through cryptography and recorded in a public distr ...

(and other

cryptocurrencies A cryptocurrency, crypto-currency, or crypto is a digital currency designed to work as a medium of exchange through a computer network that is not reliant on any central authority, such as a government or bank A bank is a financial i ...

) as part of the mining algorithm. Hashcash was proposed in 1997 by

Adam Back Adam Back (born July 1970) is a British cryptographer and cypherpunk. He is the CEO of Blockstream, which he co-founded in 2014. He invented Hashcash, which is used in the Bitcoin mining process. Life Back was born in London, England, in July ...

and described more formally in Back's 2002 paper "Hashcash - A Denial of Service Counter-Measure".

Background

The idea "...to require a user to compute a moderately hard, but not intractable function..." was proposed by

Cynthia Dwork Cynthia Dwork (born June 27, 1958) is an American computer scientist at Harvard University, where she is Gordon McKay Professor of Computer Science, Radcliffe Alumnae Professor at the Radcliffe Institute for Advanced Study, and Affiliated Professo ...

and

Moni Naor Moni Naor ( he, מוני נאור) is an Israeli computer scientist, currently a professor at the Weizmann Institute of Science. Naor received his Ph.D. in 1989 at the University of California, Berkeley. His advisor was Manuel Blum. He works i ...

in their 1992 paper "Pricing via Processing or Combatting Junk Mail".

How it works

Hashcash is a cryptographic hash-based proof-of-work algorithm that requires a selectable amount of work to compute, but the proof can be verified efficiently. For email uses, a textual encoding of a hashcash stamp is added to the header of an email to prove the sender has expended a modest amount of CPU time calculating the stamp prior to sending the email. In other words, as the sender has taken a certain amount of time to generate the stamp and send the email, it is unlikely that they are a spammer. The receiver can, at negligible computational cost, verify that the stamp is valid. However, the only known way to find a header with the necessary properties is brute force, trying random values until the answer is found; though testing an individual string is easy, satisfactory answers are rare enough that it will require a substantial number of tries to find the answer. The hypothesis is that spammers, whose business model relies on their ability to send large numbers of emails with very little cost per message, will cease to be profitable if there is even a small cost for each spam they send. Receivers can verify whether a sender made such an investment and use the results to help filter email.

Technical details

The header line looks something like this: X-Hashcash: 1:20:1303030600:anni@cypherspace.org::McMybZIhxKXu57jd:ckvi The header contains: * ''ver'': Hashcash format version, 1 (which supersedes version 0). * ''bits'': Number of "partial pre-image" (zero) bits in the hashed code. * ''date'': The time that the message was sent, in the format . * ''resource'': Resource data string being transmitted, e.g., an IP address or email address. * ''ext'': Extension (optional; ignored in version 1). * ''rand'': String of random characters, encoded in base-64 format. * ''counter'': Binary counter, encoded in base-64 format. The header contains the recipient's email address, the date of the message, and information proving that the required computation has been performed. The presence of the recipient's email address requires that a different header be computed for each recipient. The date allows the recipient to record headers received recently and to ensure that the header is unique to the email message.

Sender's side

The sender prepares a header and appends a counter value initialized to a random number. It then computes the 160-bit

SHA-1 In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecima ...

hash of the header. If the first 20 bits (i.e. the 5 most significant hex digits) of the hash are all zeros, then this is an acceptable header. If not, then the sender increments the counter and tries the hash again. Out of 2¹⁶⁰ possible hash values, there are 2¹⁴⁰ hash values that satisfy this criterion. Thus the chance of randomly selecting a header that will have 20 zeros as the beginning of the hash is 1 in 2²⁰ (approx. 10⁶, or about one in a million). The number of times that the sender needs to try to get a valid hash value is modeled by

geometric distribution In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions: * The probability distribution of the number ''X'' of Bernoulli trials needed to get one success, supported on the set \; * ...

. Hence the sender will on average have to try 2²⁰ values to find a valid header. Given reasonable estimates of the time needed to compute the hash, this would take about one second to find. No more efficient method than this brute force approach is known to find a valid header. A normal user on a desktop PC would not be significantly inconvenienced by the processing time required to generate the Hashcash string. However, spammers would suffer significantly due to the large number of spam messages sent by them.

Recipient's side

Technically the system is implemented with the following steps: * The recipient's computer calculates the 160-bit

hash of the entire string (e.g., ). This takes about two microseconds on a 1 GHz machine, far less time than the time it takes for the rest of the e-mail to be received. If the first 20 bits are not all zero, the hash is invalid. (Later versions may require more bits to be zero as machine processing speeds increase.) * The recipient's computer checks the date in the header (e.g., , which represents the date 8 Apr 2006). If it is not within two days of the current date, it is invalid. (The two-day window compensates for clock skew and network routing time between different systems.) * The recipient's computer checks whether the e-mail address in the hash string matches any of the valid e-mail addresses registered by the recipient, or matches any of the mailing lists to which the recipient is subscribed. If a match is not found, the hash string is invalid. * The recipient's computer inserts the hash string into a database. If the string is already in the database (indicating that an attempt is being made to re-use the hash string), it is invalid. If the hash string passes all of these tests, it is considered a valid hash string. All of these tests take far less time and disk space than receiving the body content of the e-mail.

Required effort

The time needed to compute such a hash collision is

exponential Exponential may refer to any of several mathematical topics related to exponentiation, including: *Exponential function, also: **Matrix exponential, the matrix analogue to the above * Exponential decay, decrease at a rate proportional to value *Exp ...

with the number of zero bits. So additional zero bits can be added (doubling the amount of time needed to compute a hash with each additional zero bit) until it is too expensive for spammers to generate valid header lines. Confirming that the header is valid is much faster and always takes the same amount of time, no matter how many zero bits are required for a valid header, since this requires only a single hashing operation.

Advantages and disadvantages

The Hashcash system has the advantage over

micropayment A micropayment is a financial transaction involving a very small sum of money and usually one that occurs online. A number of micropayment systems were proposed and developed in the mid-to-late 1990s, all of which were ultimately unsuccessful. A s ...

proposals applying to legitimate e-mail that no real money is involved. Neither the sender nor recipient need to pay, thus the administrative issues involved with any micropayment system and moral issues related to charging for e-mail are entirely avoided. On the other hand, as Hashcash requires potentially significant computational resources to be expended on each e-mail being sent, it is somewhat difficult to tune the ideal amount of average time one wishes clients to expend computing a valid header. This can mean sacrificing accessibility from low-end

embedded systems An embedded system is a computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is ''embedded'' as ...

or else running the risk of hostile hosts not being challenged enough to provide an effective filter from spam. Hashcash is also fairly simple to implement in mail user agents and spam filters. No central server is needed. Hashcash can be incrementally deployed—the extra Hashcash header is ignored when it is received by mail clients that do not understand it. One plausible analysis concluded that only one of the following cases is likely: either non-spam e-mail will get stuck due to lack of processing power of the sender, or spam e-mail is bound to still get through. Examples of each include, respectively, a centralized e-mail topology (like a

mailing list A mailing list is a collection of names and addresses used by an individual or an organization to send material to multiple recipients. The term is often extended to include the people subscribed to such a list, so the group of subscribers is re ...

), in which some server is to send an enormous amount of ''legitimate'' e-mails, and

botnet A botnet is a group of Internet-connected devices, each of which runs one or more bots. Botnets can be used to perform Distributed Denial-of-Service (DDoS) attacks, steal data, send spam, and allow the attacker to access the device and its conn ...

s or cluster farms with which spammers can increase their processing power enormously. Most of these issues may be addressed. E.g., botnets may expire faster because users notice the high CPU load and take counter-measures, and mailing list servers can be registered in white lists on the subscribers' hosts and thus be relieved from the hashcash challenges. Another projected problem is that computers continue to get faster according to

Moore's law Moore's law is the observation that the number of transistors in a dense integrated circuit (IC) doubles about every two years. Moore's law is an observation and projection of a historical trend. Rather than a law of physics, it is an empir ...

. So the difficulty of the calculations required must be increased over time. However, developing countries can be expected to use older hardware, which means that they will find it increasingly difficult to participate in the e-mail system. This also applies to lower-income individuals in developed countries who cannot afford the latest hardware. Like hashcash,

use a hash function as their proof-of-work system. The rise of cryptocurrency has created a demand for ASIC-based mining machines. Although most cryptocurrencies use the

SHA-256 SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA) and first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression ...

hash function, the same ASIC technology could be used to create hashcash solvers that are three orders of magnitude faster than a consumer CPU, reducing the computational hurdle for spammers.

Applications

Bitcoin mining

In contrast to hashcash in mail applications that relies on recipients to set manually an amount of work intended to ''deter'' malicious senders, the Bitcoin cryptocurrency network employs a different hash-based

proof-of-work Proof of work (PoW) is a form of cryptographic proof in which one party (the ''prover'') proves to others (the ''verifiers'') that a certain amount of a specific computational effort has been expended. Verifiers can subsequently confirm this expe ...

challenge to ''enable'' competitive

Bitcoin mining The bitcoin network is a peer-to-peer payment network that operates on a cryptographic protocol. Users send and receive bitcoins, the units of currency, by broadcasting digitally-signed messages to the network using bitcoin cryptocurrency ...

. A Bitcoin miner runs a computer program that collects unconfirmed transactions from users on the network. Together, these can form a "block" and earn a payment to the miner, but a block is only accepted by the network if its hash meets the network's difficulty target. Thus, as in hashcash, miners must discover by brute force the "nonce" that, when included in the block, results in an acceptable hash. Unlike hashcash, Bitcoin's difficulty target does ''not'' specify a minimum number of leading zeros in the hash. Instead, the hash is interpreted as a (very large) integer, and this integer must be less than the target integer. This is necessary because the Bitcoin network must periodically adjust its difficulty level to maintain an average time of 10 minutes between successive blocks. If only leading zeros were considered, then the difficulty could only be doubled or halved, causing the adjustment to greatly overshoot or undershoot in response to small changes in the average block time. Still, the number of leading zeros in the target serves as a good approximation of the current difficulty. In January 2020
block #614525
had 74 leading zeros.

Spam filters

Hashcash is used as a potential solution for

false positives A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test result ...

with automated spam filtering systems, as legitimate users will rarely be inconvenienced by the extra time it takes to mine a stamp.

SpamAssassin Apache SpamAssassin is a computer program used for e-mail spam filtering. It uses a variety of spam-detection techniques, including DNS and fuzzy checksum techniques, Bayesian filtering, external programs, blacklists and online databases. It i ...

has been able to check for Hashcash stamps since version 2.70, assigning a negative score (i.e. less likely to be spam) for valid, unspent Hashcash stamps. However, although the hashcash plugin is on by default, it still needs to be configured with a list of address patterns that must match against the Hashcash resource field before it will be used.

Email clients

The Penny Post software project on

SourceForge SourceForge is a web service that offers software consumers a centralized online location to control and manage open-source software projects and research business software. It provides source code repository hosting, bug tracking, mirrorin ...

implements Hashcash in the

Mozilla Thunderbird Mozilla Thunderbird is a free and open-source cross-platform email client, personal information manager, news client, RSS and chat client developed by the Mozilla Foundation and operated by subsidiary MZLA Technologies Corporation. The project s ...

email client. The project is named for the historical availability of conventional mailing services that cost the sender just one penny; see

Penny Post The Penny Post is any one of several postal systems in which normal letters could be sent for one penny. Five such schemes existed in the United Kingdom while the United States initiated at least three such simple fixed rate postal arrangements. U ...

for information about such mailing services in history.

Email Postmark

Microsoft also designed and implemented a now deprecated open spec, similar to and yet incompatible with Hashcash, Email Postmark, as part of their Coordinated Spam Reduction Initiative (CSRI). The Microsoft email postmark variant of Hashcash is implemented in the Microsoft mail infrastructure components Exchange, Outlook and Hotmail. The format differences between Hashcash and Microsoft's email postmark is that postmark hashes the body in addition to the recipient, and uses a modified

as the hash function and uses multiple sub-puzzles to reduce proof of work variance.

Blogs

Like e-mail,

blog A blog (a truncation of "weblog") is a discussion or informational website published on the World Wide Web consisting of discrete, often informal diary-style text entries (posts). Posts are typically displayed in reverse chronological order ...

s often fall victim to comment spam. Some blog owners have used hashcash scripts written in the

JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of Website, websites use JavaScript on the Client (computing), client side ...

language to slow down comment spammers. Some scripts (such as wp-hashcash) claim to implement hashcash but instead depend on JavaScript obfuscation to force the client to generate a matching key; while this does require some processing power, it does not use the hashcash algorithm or hashcash stamps.

Reputation

In a digital marketplace, service providers can use hashcash to build reputation to attract clients. To build reputation, a service provider first selects a public key as its ID, and then discovers by brute force a nonce that, when concatenated to the ID, results in a hash digest with several leading zeros. The more zeros, the higher the reputation.

Intellectual property

Hashcash is not patented, and the reference implementation and most of the other implementations are free software. Hashcash is included or available for many

Linux distribution A Linux distribution (often abbreviated as distro) is an operating system made from a software collection that includes the Linux kernel and, often, a package management system. Linux users usually obtain their operating system by downloading one ...

s. RSA has made IPR statements to the IETF about client-puzzles in the context of an RFC that described client-puzzles (not hashcash). The RFC included hashcash in the title and referenced hashcash, but the mechanism described in it is a known-solution interactive challenge which is more akin to Client-Puzzles; hashcash is non-interactive and therefore does not have a known solution. In any case RSA's IPR statement can not apply to hashcash because hashcash predates (March 1997) the client-puzzles publication (February 1999) and the client-puzzles patent filing US7197639 (February 2000).

Notes

References

* Adam Back, "Hashcash - A Denial of Service Counter-Measure", technical report, August 200
(PDF)
* Ben Laurie and Richard Clayton, "'Proof-of-Work' Proves Not to Work", WEIS 04
(PDF)
* Dwork, C. and Naor, M. (1992) "Pricing via Processing or Combating Junk Mail", Crypto '92, pp. 139–147
(PDF)

External links

Hashcash homepage

] * ttp://www.ietf.org/ietf/IPR/rsa-ipr-draft-jennings-sip-hashcash-00.txt RSA IPR note to the IETF about hashcash(2004) {{Use dmy dates, date=December 2019 Cryptographic protocols Spam filtering Email authentication