Steven Champeon wrote:
on Wed, Dec 01, 2004 at 03:34:43PM -0500, Valdis.Kletnieks@vt.edu wrote:
On Wed, 01 Dec 2004 15:02:19 EST, Steven Champeon said:
Connect:dhcp.vt.edu ERROR:5.7.1:"550 go away, dynamic user"
Given the number of options available at our end, I can hardly blame other sites for considering this a reasonable rule - I can't think of a scenario we can't fix at our end, as long as the user bothers calling our help desk and asks for help fixing it...
Exactly. That's why rDNS has been so useful for us. We can either whitelist exceptions (such as customers of ISPs who have sucky customer service and technical support) or try to educate them. It's (generally) easy to change, it requires static assignment in order to work properly, as an indication of the purpose(s) to which a given IP is put, etc.
Instead of having 6936 regexp patterns to match and parse one gazillion different reverse DNS encodings you could simply mark the reverse DNS entries of IP addresses that are actually *supposed* to be mail servers.

Reverse zone file for 10.0.0.0/24:

  1.0.0.10.in-addr.arpa.                   IN PTR mail.example.com.
  _send._smtp._srv.1.0.0.10.in-addr.arpa.  IN TXT "1"

About as simple as it gets. And much easier than figuring out for 99% of all IP addresses that they are not supposed to send mail directly. Just turn the tables and tag those that are mail servers. And it allows for a nice and graceful transition too.

Nicely described here:
ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-stumpf-dns-mtamark-03.txt

--
Andre
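For illustration, here is a rough Python sketch of what a receiving server would have to compute under this tagging scheme: the TXT query name for a connecting IP, and a reading of the answer. The function names are made up for this example, the DNS lookup itself is left out, and treating only the value "1" as "marked as a mail server" is an assumption based on the record shown above, not a full reading of the draft.

```python
import ipaddress
from typing import Optional


def mtamark_query_name(ip: str) -> str:
    """Build the TXT query name for an address, following the record
    layout shown in the zone file example (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    # reverse_pointer yields e.g. '1.0.0.10.in-addr.arpa' for 10.0.0.1
    return f"_send._smtp._srv.{addr.reverse_pointer}."


def is_marked_sender(txt_value: Optional[str]) -> bool:
    """Assumption: only a TXT value of "1" tags the address as a sender.
    A missing record (None) or any other value is treated as untagged."""
    return txt_value is not None and txt_value.strip() == "1"


if __name__ == "__main__":
    print(mtamark_query_name("10.0.0.1"))
```

The name this produces would be handed to whatever resolver the mail server already uses; an untagged address falls back to the existing checks, which is what makes a gradual transition possible.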
(On the other hand, anybody who's filtering certain address blocks because they're our DHCP blocks deserves to be shot, for all the usual reasons and then some..)
Sure, but I can certainly understand why, for example, someone might block port 25 traffic from all of AOL's dynamic blocks, at least. Or Charter's. Or Cox's, or any of the other sources of massive and constant abuse.
Wouldn't catch 1.2.3.4.dhcp.vt.edu.example.com anyway.
Yeah, but that has 'dhcp' at something other than the 3rd level.. ;)
Fair enough :)
I was more interested in whether a rule like '*.dhcp.*.{com|net|org|edu}' (blindly looking at the 3rd level domain and/or the 4th level for the two-letter TLDs) did any better/worse than having to maintain a list of 7K or so - are there enough variant forms that it's worth enumerating, or is it just that enumerating is easier than doing a wildcard?
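To make the wildcard idea concrete, a minimal Python sketch of such a positional rule might look like this. The function name, the keyword parameter, and the exact handling of two-letter TLDs (checking both the 3rd and 4th level) are assumptions for illustration, not anyone's production rule.

```python
GENERIC_TLDS = {"com", "net", "org", "edu"}


def looks_dynamic(hostname: str, keyword: str = "dhcp") -> bool:
    """Return True if `keyword` is the 3rd-level label under a generic TLD,
    or the 3rd- or 4th-level label under a two-letter TLD."""
    labels = hostname.lower().rstrip(".").split(".")
    tld = labels[-1]
    if tld in GENERIC_TLDS:
        return len(labels) >= 3 and labels[-3] == keyword
    if len(tld) == 2:
        # labels[-4:-2] covers the 4th and 3rd levels when they exist
        return keyword in labels[-4:-2]
    return False


if __name__ == "__main__":
    for name in ("1.2.3.4.dhcp.vt.edu",
                 "1.2.3.4.dhcp.vt.edu.example.com",
                 "host1.dhcp.example.co.uk"):
        print(name, looks_dynamic(name))
```

As noted earlier in the thread, a rule of this shape catches 1.2.3.4.dhcp.vt.edu but not 1.2.3.4.dhcp.vt.edu.example.com, because 'dhcp' is no longer at the 3rd level there.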
Ah, I see what you're getting at. Well, I started maintaining my long list of patterns because of the insane complexity of trying to construct simple rules like the above. At one point, I had five or six of them, but it got easier to just run the vetted "generic" hostnames through a quick perl script to generate a regex for each, and then check them all. Surprisingly, on a reasonably fast system with a moderate mail load it runs through the entire set pretty quickly, and it doesn't take up as much RAM as I'd expected it would. I could probably get better stats if you're interested.
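The perl script itself isn't shown here, so the following Python sketch is only a guess at the general idea: take a vetted generic hostname, replace each run of digits with [0-9]+, escape everything else, and anchor the result. The function name and the sample hostname are hypothetical.

```python
import re


def hostname_to_pattern(hostname: str) -> str:
    """Generalize one vetted hostname into an anchored regex by replacing
    digit runs with [0-9]+ and escaping the remaining literal text."""
    parts = []
    # Split into alternating runs of digits and non-digits
    for token in re.findall(r"[0-9]+|[^0-9]+", hostname.lower()):
        if token.isdigit():
            parts.append("[0-9]+")
        else:
            parts.append(re.escape(token))
    return "^" + "".join(parts) + "$"


if __name__ == "__main__":
    pattern = hostname_to_pattern("host123.dsl.example.net")
    print(pattern)
    print(bool(re.match(pattern, "host456.dsl.example.net")))
```

Each generated pattern is then checked in turn against the connecting host's rDNS name; the real list also contains hand-tuned entries (hex runs, wildcards) that a mechanical pass like this would not produce.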
Quick example, though: of 6936 patterns currently in my list, if you just run a cut on \\ (which catches either '.' or '-' as the next char, for the most part) you get (matches of 20 or more):
count  first left-hand pattern part
-----  ----------------------------
 1572  ^[0-9]+
  206  ^.+
  200  ^host[0-9]+
  179  ^host
  145  ^adsl
  140  ^ip
  121  ^ip[0-9]+
  121  ^.*[0-9]+
   89  ^dsl
   83  ^ppp[0-9]+
   74  ^pc[0-9]+
   64  ^ppp
   54  ^h[0-9]+
   52  ^dialup
   48  ^dhcp
   46  ^d[0-9]+
   45  ^dial
   43  ^dhcp[0-9]+
   42  ^dsl[0-9]+
   40  ^user[0-9]+
   40  ^[a-z]+[0-9]+
   40  ^[0-f]+
   37  ^.+[0-9]+
   36  ^p[0-9]+
   36  ^[a-z]+
   36  ^.*
   32  ^c[0-9]+
   32  ^adsl[0-9]+
   28  ^m[0-9]+
   28  ^cable
   25  ^dyn
   23  ^dial[0-9]+
   23  ^cable[0-9]+
   23  ^a[0-9]+
   22  ^user
   22  ^s[0-9]+
   22  ^[a-z][0-9]+
   21  ^mail[0-9]+
   20  ^u[0-9]+
   20  ^pc
   20  ^client
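A tally like this can be reproduced with something along these lines in Python: cut each pattern at its first backslash and count the left-hand parts. The sample patterns are invented for the example; the real list is not reproduced here.

```python
from collections import Counter


def first_part(pattern: str) -> str:
    """Return everything before the first backslash, i.e. before the
    first escaped '.' or '-' in the pattern."""
    return pattern.split("\\", 1)[0]


def tally_first_parts(patterns):
    """Count how often each left-hand part occurs across the patterns."""
    return Counter(first_part(p) for p in patterns)


if __name__ == "__main__":
    # Hypothetical patterns, for illustration only
    sample = [
        r"^host[0-9]+\.dsl\.example\.net$",
        r"^host[0-9]+\-pool\.example\.org$",
        r"^adsl\.example\.com$",
    ]
    for part, count in tally_first_parts(sample).most_common():
        print(f"{count:5d}  {part}")
```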
It's really not as simple as just blocking .*(dsl|cable|dialup).*; the zombie botnets are sophisticated and they're /everywhere/. So you can't just block the top 25% of likely sources, as the spammers just rotate through until they find another one you aren't testing for.
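A quick way to see the gap is to run the simple keyword rule against names built from other prefixes in the tally. This is a small Python sketch with made-up hostnames; Python's regex engine stands in for sendmail's here, so it says nothing about sendmail-specific matching behavior.

```python
import re

# The simple keyword rule quoted above
SIMPLE_RULE = re.compile(r".*(dsl|cable|dialup).*")


def missed_by_simple_rule(hostnames):
    """Return the hostnames the keyword rule fails to flag."""
    return [h for h in hostnames if not SIMPLE_RULE.match(h)]


if __name__ == "__main__":
    # Hypothetical generic-looking rDNS names
    samples = [
        "adsl-1-2-3-4.example.net",
        "ppp123.example.net",
        "dyn-10-0-0-1.example.org",
        "client42.example.com",
        "1-2-3-4.example.net",
    ]
    print(missed_by_simple_rule(samples))
```

Only the first sample is caught; the ppp, dyn, client, and purely numeric forms all slip through, which is why the list keeps growing.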
Throw in minor variations within a given ISP, language differences worldwide in naming conventions, and peculiarities in how sendmail's regex support works ('.' isn't picked up by '.+') and you've got a need for at least a few thousand patterns even if you strip off the domain part and try to match on the host part alone.