On Fri, 31 Oct 1997, Patrick Lynch wrote:
> wpoison: Traps e-mail web crawlers, but what is to stop it from trapping the other web crawlers that AltaVista, WebCrawler, Excite, Yahoo, and others use?
It uses an anti-robot meta tag: <meta name="ROBOTS" content="NOINDEX, NOFOLLOW">. The idea is that genuine, well-written robots will stop after hitting it once, while the address harvesters hopefully have no concept of the above tag and keep hitting it. I downloaded it last night and really like the idea.
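To make "well-written robots will stop" concrete, here is a rough Python sketch of the check a polite crawler might do before following links off a page. This is not wpoison's code or any real crawler's; the names RobotsMetaParser and may_follow_links are made up for illustration, and it only looks at the meta tag, not robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def may_follow_links(html):
    """Return False if the page asks robots not to follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return not ({"nofollow", "none"} & parser.directives)


if __name__ == "__main__":
    page = '<head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head>'
    print(may_follow_links(page))
```

A harvester that skips this check and just greps the page for mailto: links and hrefs will keep following the generated links, which is the behavior the trap relies on.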
> It boasts that it can provide an almost infinite number of bogus e-mail addresses as well as hyperlinks (these hyperlinks point directly back to the same page). Why would you want to trap a web crawler on your site, using your bandwidth and resources almost indefinitely?
I thought about this almost immediately. The first thing I did was hack in a delay. If they're going to get caught in an infinite loop of bogus addresses, I don't want them "benchmarking" my web server by pounding on it. I also added a further wrinkle to make the URLs it gives out look much more varied, so it doesn't appear to be just sending you right back to the same site and script. Have a look at http://fdt.net/cgi-bin/wpoison
Note: for real use, it's probably a good idea not to call it wpoison, lest the collectors clue in and ignore URLs with wpoison in them. I have a number of hard links to it, so it can be called by other names. Now that I think about it, it might be nice to shuffle those as well.
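A rough Python sketch of the delay and the varied-URL idea described above, for illustration only. It is not the wpoison script or my modified copy of it. The alias names, the .example address domain, the 5-second default delay, and the trailing random path are all assumptions made up for the example; the trailing path assumes the web server hands extra path components to the CGI script rather than returning a 404.

```python
import random
import string
import time

# Hypothetical names the same script might be hard-linked under.
SCRIPT_ALIASES = ["members", "staff", "contacts", "directory"]


def random_word(rng, min_len=4, max_len=9):
    """Return a random lowercase word."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def bogus_address(rng):
    """Return a fake address in the reserved .example domain."""
    return f"{random_word(rng)}@{random_word(rng)}.example"


def bogus_link(rng, base="/cgi-bin"):
    """Return a link back to the script under a shuffled alias and path."""
    alias = rng.choice(SCRIPT_ALIASES)
    return f"{base}/{alias}/{random_word(rng)}/{random_word(rng)}.html"


def poison_page(rng, n_addresses=10, n_links=5, delay_seconds=5.0):
    """Build one page of bogus addresses and links after a throttling delay."""
    # Sleep first so a looping harvester cannot hammer the server.
    time.sleep(delay_seconds)
    lines = [
        "<html><head>",
        '<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">',
        "</head><body>",
    ]
    for _ in range(n_addresses):
        addr = bogus_address(rng)
        lines.append(f'<a href="mailto:{addr}">{addr}</a><br>')
    for _ in range(n_links):
        lines.append(f'<a href="{bogus_link(rng)}">{random_word(rng)}</a><br>')
    lines.append("</body></html>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(poison_page(random.Random(), delay_seconds=1.0))
```

The delay caps how fast any single client can pull pages, and picking the alias at random per link covers the "shuffle those as well" idea.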
> Deadbolt(tm): This filters out known e-mail spammers using an automatically updatable list provided by E-scrub Technologies. What happens when a majority of ISPs are using a filter like this and a legitimate e-mail address is accidentally put on the list? That e-mail address would then be denied by a majority of the ISPs.
I looked at this several months ago. It seemed slow and clunky, and a bit more complicated than the average user could handle. I like the idea though.

------------------------------------------------------------------
 Jon Lewis <jlewis@fdt.net>  |  Unsolicited commercial e-mail will
 Network Administrator       |  be proof-read for $199/message.
 Florida Digital Turnpike    |
______http://inorganic5.fdt.net/~jlewis/pgp for PGP public key____