Paul Vixie wrote:
therefore before you use whole-domain spamtrapping, i recommend looking VERY carefully at the flows so that you can be sure that "i" isn't adjacent to "o" on the qwerty keyboard, or some other such problem.
Agreed. But I'll mention a situation where it's very valuable, and show some of the pitfalls we found through intimate experience with doing it.

We decommissioned some of our domains about 3 years ago, as we transitioned to our current one. At the time these domains were decommissioned (de-MX'd), we were catching and tossing 60,000-70,000 spams per day. 18 months later, as an experiment, I re-enabled the domains. First day: 600,000 spams. In the months since then it has grown to 2.5 million per day.

We use this as a spamtrap. The immediate temptation is to directly feed blacklists of some sort. But:

1) A significant fraction (varies from 5-30%) is NDRs from innocent sites for spam forged with return addresses in our spamtrap. [If <user@domain> is being spammed, chances are that it's being forged in spam too.]

2) A significant fraction is virus/worm attacks from people with very old addresses in their address books.

3) A significant fraction is email from otherwise legitimate sites that have a spam problem, so at the very least you have to compare each site's traffic levels between spamtrap and non-spamtrap. Case in point: MSN's DAV servers while they were scriptable. The ratios were absurdly bad (like 5000:1), and we did end up blacklisting the DAV sites on-and-off over the past 3 months or so.

4) There is a growing fraction that turns out to be RCPT TO verifications from sendmail configurations (of spam forged with spamtrap domain names). I think this is dangerous (it can trip harvesting detectors, for example), but it's a fairly effective heuristic in its own right.

5) There is a significant fraction of "autoresponders" responding to forged spam.

While much of this is detectable and you can remove it from the analysis, you generally need to apply additional heuristics beyond just the spamtrap. We try to filter out bounces and viruses, compute ratios of bad:good, and look for corroboration from other third-party blacklists that we're unwilling or unable to use directly.
The spamtrap traffic is also fodder for additional analysis that detects open relays, proxies, and trojaned boxes.
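
To make the "filter out bounces, then ratio bad:good" idea concrete, here is a rough Python sketch. It is not our actual code. The Delivery record, its field names, and both thresholds (100 trap hits, 50:1) are made up for illustration, and it assumes something upstream has already tagged viruses and autoresponses. It only shows the shape of the heuristic: drop the innocent-bystander traffic first, then flag a source only when its spamtrap volume is both large and far out of proportion to its non-spamtrap mail.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One delivery attempt. All field names are hypothetical."""
    source_ip: str            # connecting IP address
    mail_from: str            # envelope sender; "" is the null sender <> used by NDRs
    to_spamtrap: bool         # True if the recipient is in a spamtrap domain
    auto_reply: bool = False  # assumed to be set upstream for autoresponders
    is_virus: bool = False    # assumed to be set upstream by a virus scanner


def is_collateral(d):
    """Bounces, autoresponses and viruses say little about the sending site."""
    return d.mail_from == "" or d.auto_reply or d.is_virus


def suspicious_sources(deliveries, min_trap_hits=100, min_ratio=50.0):
    """Return {source_ip: bad:good ratio} for sources worth a closer look.

    Both thresholds are illustrative defaults, not tuned values.
    """
    trap = Counter()
    legit = Counter()
    for d in deliveries:
        if d.to_spamtrap:
            if not is_collateral(d):
                trap[d.source_ip] += 1
        else:
            legit[d.source_ip] += 1

    flagged = {}
    for ip, hits in trap.items():
        if hits < min_trap_hits:
            continue  # too little evidence to judge this source
        # Treat "no legitimate mail seen" as 1 to avoid dividing by zero.
        ratio = hits / max(legit[ip], 1)
        if ratio >= min_ratio:
            flagged[ip] = ratio
    return flagged
```

Anything this flags is still only a candidate. As described above, you would want to check it against other sources before blacklisting.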