It's too easy to introduce a worm that gives a spammer access to many teraflops of unwittingly collaborative computing resources. I can't imagine a compute-intensive puzzle scheme is going to do much more than the average iteration of a rule-based anti-spam filter. They'll just provide a temporary dent in the total spam flow. A reliance on new puzzles to provide obstacles to such spammers will end up being very close to homomorphic to rule-based filter iterations. Perhaps even a little less useful, as the spammers will not need to analyze and change each individual bit of spam, but merely need to reload the distributed sending cluster with the new solvers.

Microsoft could indeed wipe out spam, in the short and long run. And they can do so without schemes that are likely to end up building upon the substantial plaque that already clogs the arteries of the net.

Doug

On Fri, 26 Dec 2003, Steven M. Bellovin wrote:
In message <20031226163658.DE74E10DAD@gateway.wvi.com>, "Jeff Shultz" writes:
I'm sure I've heard this one before, so it's not even a new idea... hope whoever came up with it originally patented it. 8-) Then again, maybe it was MS that I heard about the first time, and the Beeb is simply late to the game here.
Yes, puzzles have been suggested before as defenses against SYN floods and SSL DoS attacks, and many other things as well.
Has anyone calculated the increased server load, the extra storage needed for the now lengthened outgoing mail queue, and the extra bandwidth required to handle all this extra back and forth puzzle thing? YahooGroups and the like would definitely be impacted. It would be interesting to see what protections will be built into the puzzle thing as well... I can see some joker setting up his server to require that the sending computer calculate PI to some ridiculous number of decimals... although that might make a good honeypot. Or, if the puzzle is open source (which would be a good thing), how soon before the spammers (or even legit MTA authors) hardcode the answers into the server software? I suppose there would have to be some random elements.
The usual way this is done is to pick a puzzle that's hard to compute but easy to verify. For example, the server could pick a random number, take the top N bits, and challenge the client to find *any* number whose SHA1 hash has the same high-order N bits *and* includes some other random string as the high-order bits of the answer. There are no known short cuts; the only feasible strategy is to calculate lots of SHA1 hashes for different input values. (The server sends some other random number to avoid precomputation attacks.)
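As a rough illustration of the hard-to-compute, easy-to-verify idea described above, here is a minimal Python sketch. It is not taken from the article or from any deployed scheme: the function names, the 16-byte nonce, and the 8-byte counter encoding are all assumptions made for the example. The server's nonce is required as the prefix of the answer, which is one way to read "includes some other random string as the high-order bits of the answer".

```python
import hashlib
import os

NONCE_LEN = 16  # assumed nonce size in bytes, chosen only for illustration


def leading_bits(digest: bytes, n_bits: int) -> int:
    """Return the top n_bits of a digest as an integer."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - n_bits)


def make_challenge(n_bits: int):
    """Server side: pick a fresh nonce and a random n_bits-wide target (n_bits <= 32)."""
    nonce = os.urandom(NONCE_LEN)
    target = int.from_bytes(os.urandom(4), "big") >> (32 - n_bits)
    return nonce, target


def solve(nonce: bytes, target: int, n_bits: int) -> bytes:
    """Client side: brute-force a counter until the SHA1 prefix matches.

    Expected work is about 2**n_bits hash computations.
    """
    counter = 0
    while True:
        answer = nonce + counter.to_bytes(8, "big")
        if leading_bits(hashlib.sha1(answer).digest(), n_bits) == target:
            return answer
        counter += 1


def verify(nonce: bytes, target: int, n_bits: int, answer: bytes) -> bool:
    """Server side: one hash and one prefix check."""
    if not answer.startswith(nonce):
        return False  # answer was not built from this server's nonce
    return leading_bits(hashlib.sha1(answer).digest(), n_bits) == target
```

The asymmetry is the point of the sketch: solve() loops over roughly 2**n_bits hashes on average, while verify() computes a single hash. Because the nonce is fresh for each challenge, answers cannot be precomputed or hardcoded.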
Bandwidth is probably not an issue; it's one extra round trip, and it's not very much text. Mail sender queues are more of an issue, as is the load on the sender; if I were doing this, I'd make it adaptive, with a high cost being required for unknown senders, or those that have sent suspected spam. For example, start with a 12-bit puzzle, i.e., one of client difficulty 4096. For each piece of non-spam, subtract some small value from the difficulty. For each piece of spam, double the difficulty rating for that client. There are lots of ways to do things like this; it will take more than back-of-the-envelope calculations to understand all the knobs, let alone what countermeasures the spammers will deploy.
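The adaptive rule above could be sketched roughly as follows. This is only an illustration: the class name, the per-message credit of 64, and the floor of 1 are hypothetical knobs, not values from the article. Only the starting difficulty of 4096 (12 bits) and the "subtract on non-spam, double on spam" rule come from the text.

```python
INITIAL_DIFFICULTY = 4096  # 12-bit puzzle for unknown senders (from the text)
GOOD_MAIL_CREDIT = 64      # hypothetical "small value" subtracted per non-spam
MIN_DIFFICULTY = 1         # hypothetical floor for well-behaved senders


class SenderReputation:
    """Tracks a per-sender puzzle difficulty (expected number of hashes)."""

    def __init__(self):
        self.difficulty = {}

    def current(self, sender: str) -> int:
        # Senders we have never seen start at the initial difficulty.
        return self.difficulty.get(sender, INITIAL_DIFFICULTY)

    def record(self, sender: str, was_spam: bool) -> int:
        """Update the sender's difficulty after classifying one message."""
        d = self.current(sender)
        if was_spam:
            d *= 2  # double the cost for each piece of spam
        else:
            d = max(MIN_DIFFICULTY, d - GOOD_MAIL_CREDIT)
        self.difficulty[sender] = d
        return d

    def puzzle_bits(self, sender: str) -> int:
        """Smallest bit count N such that 2**N >= current difficulty."""
        return (self.current(sender) - 1).bit_length()
```

With these assumed values, one piece of spam moves a new sender from a 12-bit to a 13-bit puzzle, while 64 clean messages bring the difficulty down to the floor. Whether those rates are sensible is exactly the kind of knob that would need real measurement.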
For an introduction to schemes like this, see Stubblefield, A., and D. Dean, "Using Client Puzzles to Protect TLS," Proceedings of the Tenth USENIX Security Symposium, Washington, DC, August 2001, available at http://www.csl.sri.com/users/ddean/papers/usenix01b.pdf .
It is interesting.... as an extension it might be nice to be able to set up a "whitelist" of trusted servers that don't have to go through the computational gyrations to send you mail - that way it would, hopefully, eventually impact the spammers more than it would impact legitimate e-mail servers.
According to the article, that is indeed part of the scheme.
--Steve Bellovin, http://www.research.att.com/~smb