Re: Reasons why BIND isn't being upgraded
Apologies for the delay - but I read the list in digest and this is my first post.

I have done a slightly bigger survey of BIND version strings as of 16:00 GMT Thursday February 1st, taking the nameservers for the Keynote40 Business to Consumer sites and a selection of the domain names from the front page of www.persiankitty.com.

Of 440 DNS servers surveyed:

35% are reporting strings that clearly indicate use of 8.2.3 or better (hardly anyone is revealed to be using 9.0 or 9.1).

35% are reporting strings that strongly suggest they are susceptible to known BIND bugs (some rather old versions of BIND are out there).

These figures charitably assume that:

People who don't reveal version strings are not afflicted by BIND bugs.

BIND variants that have source code modifications aren't affected - they probably are, but I said we were being charitable.

So in answer to the question - are people too busy to upgrade BIND - clearly 35% of admins found time to do it already. Those who had upgraded to 8.2.2-P7 to avoid the previous DoS bugs should have had nothing more to do than replace the binaries and test at 8.2.3, so I think the time excuse is a bit thin.

Interestingly the sexsites were only marginally behind the Keynote40 sites in terms of DNS versions. So perhaps your credit card number is nearly as safe as your share portfolio.

Of the strings returned the most humorous was "SIN" from a sexsite. The most constructive administrator entry was the admin e-mail address (but since this is in the SOA resource record and RFC2142 (you do have all the RFC2142 mail addresses don't you?!) I'm not awarding any prizes).

The ISC.ORG web site recommends leaving the BIND version string unchanged to assist in troubleshooting. I remain unconvinced that showing the version string helps much.
Certainly any site with 7x24 hour technical support should consider putting a relevant contact phone number there IMHO - such that anyone trying to diagnose a genuine problem can get help, whilst nosey crackers can revert to social engineering. (E-mail addresses may not be a good choice if your DNS or their DNS is playing up?!)

Whilst you can add a "bind" zone of type Chaos and overwrite the default behaviour, as given in an earlier post, later versions of BIND let you set the version string in the options section, which is a lot less typing, and a lot easier to understand.

An extract of my BIND 9.1rc1 named.conf is given below. Clearly my server is only listening to the loopback address, but you'd be amazed how many times it has been queried in the last couple of weeks whilst I got the scripts sorted.

    options {
        directory "/etc/namedb";                // Working directory
        pid-file "named.pid";                   // Put pid file in working dir
        listen-on { 127.0.0.1; };               // private server for performance
        listen-on-v6 { none; } ;                // No version 6 IP
        version "Contact +44(0)1395 232769";    // Don't release version number
    };

version ""; doesn't do what I expected at 9.1rc1.....

The survey mentioned above will form part of a bigger report, which was inspired by the Microsoft debacle (aren't those Akamai DNS servers weird, oh and all Microsoft's e-mail relays have sequential IP addresses - what planet are they on), so I will be checking which domains have all their DNS servers (for which glue records exist) in the same network (based on 'NetName' - better ideas welcome from routing experts - can I easily get IP to ASN?).

If I have time I may include mail servers and relays as well as DNS servers. Of course if all your DNS servers are on one network, having off-network mail relays won't help much when you muck up the routing.

HTH

Simon
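The version strings discussed above are what a server returns for a TXT query for the name version.bind in the CHAOS class. The following is a minimal sketch of building such a query by hand, assuming Python and no resolver library; encode_name and build_version_query are hypothetical helper names, and this is not necessarily how the survey itself was run.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_version_query(query_id=0x1234):
    """Build a DNS query packet for version.bind, type TXT (16), class CH (3)."""
    # Header: id, flags (0 = standard query), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    question = encode_name("version.bind") + struct.pack("!HH", 16, 3)
    return header + question
```

The resulting bytes could be sent over UDP to port 53 of the server being checked; parsing the reply is left out of this sketch.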
On Fri, 2 Feb 2001, Simon Waters wrote:
So in answer to the question - are people too busy to upgrade BIND - clearly 35% of admins found time to do it already. Those who had upgraded to 8.2.2-P7 to avoid the previous DoS bugs should have had nothing more to do than replace the binaries and test at 8.2.3, so I think the time excuse is a bit thin.
This is untrue. I expected this same thing. Then I ran into these gems of bogosity while updating 8.2.2-P7 to 8.2.3:

(1) 8.2.3 doesn't accept the "(" in the SOA string to be on the next line after the IN SOA. Our script-generated zonefiles, about 45000 of them, all had this.

(2) 8.2.3 changed the meaning of the last field of the SOA record and needs a $TTL directive to cover the default TTL. This also affected all of our zones (86400 seconds timeout on negative caching is, you must agree, way over the top so not a value you want to propagate).

(3) 8.2.3 is unforgiving against errors in zonefiles. Where previously individual records were rejected (or served as-is), bind now insists on dropping the entire zone if something went wrong. Needless to say in a reload of 45K domains it takes a bit of time to fish out the bad ones.

When downloading I expected a security upgrade, not a service pack. The extra traffic that new serial numbers for this amount of domains generate is probably well-measurable.

Neither the webpage nor any of the obvious documentation (README, CHANGES) mentions any of these problems and I've been bitten by them. Yes we're running 8.2.3-REL fine now, but it took a couple of _expensive_ reloads to get everything right. If ISC wants my trust in the future of their codebase, they will have to work on seeing the difference between an "architecture upgrade" and a "security patch".

Yes, the information was out there, but in the context it was presented (update now or the internet will die) I think they should make a stronger point out of the pitfalls. And you can certainly expect more people to run into these problems, then shrug and roll back to 8.2.2P7 because they bumped into the same wall.

Cheers,
Pi
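For script-generated zone files with the "(" on the line after IN SOA, as in item (1), a one-off rewrite is one way out. The following is a minimal sketch in Python, assuming the "(" starts the line directly below the SOA line and that the SOA line carries no trailing comment; hoist_open_paren is a hypothetical helper, not a tool shipped with BIND.

```python
import re


def hoist_open_paren(text):
    """Move a '(' that starts a line up to the end of the preceding SOA line."""
    out = []
    for line in text.split("\n"):
        stripped = line.lstrip()
        prev = out[-1] if out else ""
        # Only touch a line-leading "(" directly below an SOA line that has
        # no "(" yet and no ";" comment (assumption: those are left alone).
        if (stripped.startswith("(") and re.search(r"\bSOA\b", prev)
                and "(" not in prev and ";" not in prev):
            out[-1] = prev.rstrip() + " ("
            rest = stripped[1:].lstrip()
            if rest:
                out.append("\t" + rest)
        else:
            out.append(line)
    return "\n".join(out)
```

Running the rewritten files through the new named on a test machine before reloading production would catch anything this simple pattern misses.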
Pim van Riezen (pi@vuurwerk.nl) said, on [010201 17:29]:
This is untrue. I expected this same thing. Then I ran into these gems of bogosity while updating 8.2.2-P7 to 8.2.3:
(1) 8.2.3 Doesn't accept the "(" in the SOA string to be on the next line after the IN SOA. Our script-generated zonefiles, about 45000 of them, all had this.
Not accepting a bogus zone file is hardly classifiable as "bogosity".
documentation (README, CHANGES) mentions any of these problems and I've been bitten by them. Yes we're running 8.2.3-REL fine now, but it took a couple of _expensive_ reloads to get everything right. If ISC wants my trust in the future of their codebase, they will have to work on seeing the difference between an "architecture upgrade" and a "security patch".
So, you deployed a new version of bind to a non-trivial set of production servers without doing any testing on development or QA systems, and you're blaming your production problems on the isc? I'm fairly certain that I'm glad you're not running my network, thankyewverymuch. -P.
On Thu, 1 Feb 2001, Pete Ehlke wrote:
Pim van Riezen (pi@vuurwerk.nl) said, on [010201 17:29]:
This is untrue. I expected this same thing. Then I ran into these gems of bogosity while updating 8.2.2-P7 to 8.2.3:
(1) 8.2.3 Doesn't accept the "(" in the SOA string to be on the next line after the IN SOA. Our script-generated zonefiles, about 45000 of them, all had this.
Not accepting a bogus zone file is hardly classifiable as "bogosity".
Parsing human input isn't hard, you know. Robustness doesn't come from being anal. If there's a bogus entry, reject the entry not the entire zone. The rejection as such doesn't even classify as bogosity, it's the fact that this rejection is _introduced_ in a 0.0.1 upgrade that is advertised as an Urgent Security Fix and is being discussed right here on this list in wonder considering why some people haven't upgraded yet. I'm telling you that if I run into these problems (and manage to eventually fix them) others will too and it will be for these reasons. I also seriously counter your claim that having this bracket on the next line is in any way bogus. It's visually superior to the now enforced option of having it on the same line. There is nothing in the parser not to understand it. Spreading data across lines is commonly accepted in a lot of configuration languages and bind has been among this in all versions I previously ran. Why is that now suddenly bogus?
documentation (README, CHANGES) mentions any of these problems and I've been bitten by them. Yes we're running 8.2.3-REL fine now, but it took a couple of _expensive_ reloads to get everything right. If ISC wants my trust in the future of their codebase, they will have to work on seeing the difference between an "architecture upgrade" and a "security patch".
So, you deployed a new version of bind to a non-trivial set of production servers without doing any testing on development or QA systems, and you're blaming your production problems on the isc? I'm fairly certain that I'm glad you're not running my network, thankyewverymuch.
I followed all-out instructions to immediately upgrade to the new bind because of alleged gaping security holes. We are short on staff on a major scale[1] and our secondary was coping just fine. My complaint is not that "ISC broke my network", no one did. My complaint was that I had to spend a lot of time figuring out what should be blisteringly obvious.

I thank you for your character judgement anyway. May I add that I'm glad I'm not working on your network, too? Your noc-list must be really cozy. Hope you do get some work done besides exercising quick wit and situational prejudice on other people.

Cheers,
Pi

[1] I'm the head of development, that's how short we are on admin staff ok? And this situation is not unique. Welcome to the real world. There are more Wanters in this world than there are Makers.
At 3:54 AM +0100 2/2/01, Pim van Riezen wrote:
I also seriously counter your claim that having this bracket on the next line is in any way bogus. It's visually superior to the now enforced option of having it on the same line. There is nothing in the parser not to understand it. Spreading data across lines is commonly accepted in a lot of configuration languages and bind has been among this in all versions I previously ran. Why is that now suddenly bogus?
According to RFC1035, it's ALWAYS been bogus. It's just been a bug in BIND's parser that it accepted the bogus data.

It's funny that when I reported this as a bug when bind9 testing was going on, ISC was like "it accepts paren on the next line?" ... since bind9 was written from scratch, it didn't inherit the legacy bogosity of accepting \n(, and crapped out, just as 8.2.3 did.

It appears that my (and others', I'm sure) bug report on 9.0.0beta-whatever caused the re-evaluation of the 8.2.x codebase to fix that bug in 8.x. So I guess I'll apologize for opening my trap. ;-)

D

--
+---------------------+-----------------------------------------+
| dredd@megacity.org  |  "Conan! What is best in life?"         |
| Derek J. Balling    |  "To crush your enemies, see them       |
|                     |   driven before you, and to hear the    |
|                     |   lamentation of their women!"          |
+---------------------+-----------------------------------------+
On Thu, 1 Feb 2001, Derek J. Balling wrote:
According to RFC1035, it's ALWAYS been bogus. It's just been a bug that BIND's parser that accepted the bogus data.
Out of curiosity, why? Okay, we'll accept that that's the way it is, and I'm not going to rail about how it shouldn't be, but let's try to understand Mr. Mockapetris' reasoning in this.

I see this from the RFC (that lists P. Mockapetris as author, for those who don't place the name):

    The format of these files is a sequence of entries. Entries are
    predominantly line-oriented, though parentheses can be used to continue
    a list of items across a line boundary, and text literals can contain
    CRLF within the text.

also:

    ( )   Parentheses are used to group data that crosses a line
          boundary. In effect, line terminations are not
          recognized within parentheses.

The only clue I can find to his reasoning is:

    Because these files are text files several special encodings are
    necessary to allow arbitrary data to be loaded.

...which is patently untrue, given that BIND didn't need those special encodings; it worked just fine without the particular one that we're all beyond bored of hearing about.

So: Why? The fact that pre-8.3 versions of BIND will accept @ IN SOA ( ns.whomever.com and etc. does not seem like a huge issue to me, though fixing that in thousands of issues does. What's the big deal with leaving it non-compliant?

Matthew Devney
[ On Thursday, February 1, 2001 at 22:16:13 ( -0800), mdevney@teamsphere.com wrote: ]
Subject: Re: [NANOG] Re: Reasons why BIND isn't being upgraded
Out of curiosity, why?
Because (as you quoted from RFC 1035):
The format of these files is a sequence of entries. Entries are predominantly line-oriented, though parentheses can be used to continue a list of items across a line boundary, and text literals can contain CRLF within the text.
I.e. I thought it was self-obvious. Records are one per line unless an open parenthesis `(' or double quote `"' is encountered, in which case all fields up to the closing `)' or `"' are part of the current record even if there are intervening newlines.

What more justification do you need? The parentheses aren't part of the record -- they're just a way of getting the additional values onto separate lines. If you put all those numbers on the same line with the rest of the record then you don't need the parentheses at all.

This was one of the very first lessons I learned when I wrote my very first zone file and tried to do something with parentheses that didn't work like I naively expected it to.

Yes the master file format goes against the grain of modern data file interpretations, but, well, nobody's written a new RFC and proposed that it replace 1035 yet, so we're still stuck with it. Personally I'd like to at least do away with the semi-colon as the comment character, but I'm not going to write a new RFC and push it through the IETF just for that! ;-)
The only clue I can find to his reasoning is: Because these files are text files several special encodings are necessary to allow arbitrary data to be loaded.
Well, you might want to look at the date on RFC 1035, and even more particularly at the dates of the RFCs it supersedes. Back in 1983 the tools available to deal with this kind of data were very different and possibly much more restricted. All the world was not yet a VAX back then, it was still a PDP-11! :-)

However the real clue is (again from RFC 1035):

    5.1. Format
    [[ ... ]]
    The following entries are defined:
    [[ ... ]]
        <domain-name><rr> [<comment>]
        <blank><rr> [<comment>]
    [[ ... ]]
    The last two forms represent RRs. If an entry for an RR begins with a
    blank, then the RR is assumed to be owned by the last stated owner. If
    an RR entry begins with a <domain-name>, then the owner name is reset.

I.e. if a line begins with whitespace then it is a new record that defaults to having the same domain name (and class) as the previous record.

So, a modified example like the one you've used for the SOA:

    @ IN SOA master contact
            ( 1 2 3 4 5 )

is interpreted as this:

    @ IN SOA master contact
    @ IN 1 2 3 4 5

Now what? Both records are broken, with missing fields for the SOA, and an invalid type (`1') and probably too many fields for the next line!
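The line-oriented rule described above can be illustrated with a small sketch in Python. It assumes a simplified reader that ignores quoted strings, $-directives and owner inheritance; split_entries is a hypothetical helper and not BIND's actual parser.

```python
def split_entries(text):
    """Split master-file text into entries; newlines inside ( ) do not end an entry."""
    entries, current, depth = [], [], 0
    for line in text.splitlines():
        line = line.split(";", 1)[0]  # drop the comment (quoted ";" not handled)
        depth += line.count("(") - line.count(")")
        current.append(line.replace("(", " ").replace(")", " "))
        if depth == 0:
            # Outside parentheses, the end of the line ends the entry.
            entry = " ".join(" ".join(current).split())
            if entry:
                entries.append(entry)
            current = []
    return entries
```

With the "(" on the SOA line the whole record comes back as one entry; with the "(" on the following line the reader has already closed the SOA entry at the first newline, so it returns two broken entries.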
...which is patently untrue, given that BIND didn't need those special encodings; it worked just fine without the particular one that we're all beyond bored of hearing about.
Sure, BIND was OK when handling this one very special case, but since the master file format is intended to be portable to other implementations, the question to ask is whether some other conforming implementation could mis-interpret this kind of error and cause the zone to be rejected, and the answer is most certainly YES!

Sure BIND could do away with support for RFC 1035 standard master file formats, but where would that get us with so many other tools generating and parsing these files?
So: Why? The fact that pre-8.3 versions of BIND will accept @ IN SOA ( ns.whomever.com
I'm not so sure they all can. You might want to check the pre-8.x versions before asserting this. IIRC this sloppiness is a relatively recent aberration in BIND.

--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
Pim van Riezen (pi@vuurwerk.nl) said, on [010201 18:58]:
Parsing human input isn't hard, you know. Robustness doesn't come from being anal. If there's a bogus entry, reject the entry not the entire zone. The rejection as such doesn't even classify as bogosity, it's the
I fail to understand this. You seem to suggest that a name server should reject the SOA record, but accept and attempt to serve the zone. Precisely how would that work?
I also seriously counter your claim that having this bracket on the next line is in any way bogus. It's visually superior to the now enforced option of having it on the same line. There is nothing in the parser not to understand it. Spreading data across lines is commonly accepted in a lot of configuration languages and bind has been among this in all versions I previously ran. Why is that now suddenly bogus?
Because rfc1035 has always defined it as bogus. The parenthesis is, as you are now no doubt aware, a line continuation character:

    5.1. Format

    The format of these files is a sequence of entries. Entries are
    predominantly line-oriented, though parentheses can be used to continue
    a list of items across a line boundary, and text literals can contain
    CRLF within the text. Any combination of tabs and spaces act as a
    delimiter between the separate items that make up an entry. The end of
    any line in the master file can end with a comment. The comment starts
    with a ";" (semicolon).

-P.
[ On Friday, February 2, 2001 at 02:22:56 (+0100), Pim van Riezen wrote: ]
Subject: Re: [NANOG] Re: Reasons why BIND isn't being upgraded
This is untrue. I expected this same thing. Then I ran into these gems of bogosity while updating 8.2.2-P7 to 8.2.3:
(1) 8.2.3 Doesn't accept the "(" in the SOA string to be on the next line after the IN SOA. Our script-generated zonefiles, about 45000 of them, all had this.
I seem to recall having that problem with much older versions (maybe even 4.9.x) and I've used this format for a very long time now:

    @   IN  SOA localhost. postmaster.localhost. (
                2000041800  ; Serial number (yyyymmddhh)
                8h          ; Refresh
                2h          ; Retry
                1w          ; Expire
                3h )        ; Minimum TTL

(except of course for the more sensible units of time now allowed)
(2) 8.2.3 Changed the meaning of the last field of the SOA record and needs a $TTL directive to cover the default TTL. This also affected all of our zones (86400 seconds timeout on negative caching is, you must agree, way over the top so not a value you want to propagate).
The use of $TTL was well documented since at least 8.2.1, and indeed recent versions also syslog'ed complaints if it was missing. This is from doc/html/master.html:

    $TTL

    Syntax: $TTL <default-ttl> [<comment>]

    Set the default Time To Live (TTL) for subsequent records with
    undefined TTL's. Valid TTL's are of the range 0-2147483647.

    $TTL is defined in RFC 2308.

(any specification of units of time can be used in place of a plain integer number of seconds).

And of course RFC 2308 is from 1998, so not very new.

I've been slowly adding $TTL to manually edited zone files for the past couple of years and didn't have really all that many to fix in the past few days.
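Adding the directive to a large set of generated zone files can be scripted. The following is a minimal sketch in Python, assuming the directive belongs at the very top of each file; ensure_default_ttl is a hypothetical helper, and the TTL value is whatever the zone owner chooses, not a recommendation.

```python
def ensure_default_ttl(text, ttl):
    """Prepend a $TTL directive unless the zone text already has one."""
    for line in text.splitlines():
        # Ignore anything after ";" so a commented-out $TTL does not count.
        if line.split(";", 1)[0].strip().upper().startswith("$TTL"):
            return text
    return "$TTL {}\n".format(ttl) + text
```

For example, ensure_default_ttl(zone_text, "3h") leaves a file that already starts with a $TTL line untouched and adds one to a file that lacks it.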
(3) 8.2.3 Is unforgiving against errors in zonefiles. Where previously individual records were rejected (or served as-is), bind now insists on dropping the entire zone if something went wrong. Needless to say in a reload of 45K domains it takes a bit of time to fish out the bad ones.
So far as I saw 8.2.2-P7 logged all of these errors and so could have given you ample time to fix up all of the problems just like it did for me. Of course I've had "check-names master fail;" in my options clause for a couple of years too, so I don't notice any difference with 8.2.3.

In any case it is far better to drop the entire zone than it is to load a borked one, especially if you don't catch it in time before a half dozen secondaries all transfer the damage.

First off it gives you a much more direct indication of the presence of a problem (your server is suddenly lame for the zone so even if you forget to read the log messages after a reload you'll still find out about the problem).

Furthermore your secondaries will still continue to answer authoritatively with the previous revision, at least until they expire, thus preventing unexpected damage (the presence of the old version is expected, at least for some limited amount of time, after all).
When downloading I expected a security upgrade, not a service pack.
As the version number indicated you installed a new release, not a patch (though in the past even BIND patch releases with "-PN" suffixes on their identifiers have often included more than just pure fixes). Any presumptions about what a +0.0.1 version number increment means are likely to be incorrect unless you have intimate knowledge of the project producing the release.

Unfortunately there was no patch this time.... I suspect a patch could have been possible, and it seems one may soon be on its way, but given the relative history of the deployment of new releases of BIND, or lack thereof, it does in some ways make sense to only make a new release available. As a developer you no doubt know full well that developing patch releases in conjunction with full releases more than doubles the amount of effort necessary for SCM and QA tasks.

Unfortunately BIND has never come with the equivalent of a GNU "NEWS" file to mention explicitly all of the user-visible differences, and with all new releases it is sometimes a bit of an adventure to discover all the new features and any incompatibilities.

--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
On Thu, 1 Feb 2001, Greg A. Woods wrote:
So far as I saw 8.2.2-P7 logged all of these errors and so could have given you ample time to fix up all of the problems just like it did for me.
In an ideal world, I agree with you. The base of all of this discussion was how it can be that so many people take a while upgrading bind. I think the situation I ran into is a realistic real world scenario. Understaffed organizations will run into their own walls pretty easily when trying to tackle the upgrade.
[snip serving borken zones is bad]
I am near to agreeing with you if it were about not picking up a zone-change when the zonefile has turned bogus. However, the effect of a zone no longer being authoritative on the primary is not really what I'd define as fun either :).
Unfortunately BIND has never come with the equivalent of a GNU "NEWS" file to mention explicitly all of the user-visible differences and with all new releases it sometimes a bit of an adventure to discover all the new features and any incompatibilities.
Voila, I think that this is what my problem was. Like I said, the information _was_ out there, it was just not intuitively available. So the upgrade will scare some people off, if they don't manage to find it. Cheers, Pi
Voila, I think that this is what my problem was. Like I said, the information _was_ out there, it was just not intuitively available. So the upgrade will scare some people off, if they don't manage to find it.
You are in an unenviable position. But in support of ISC, this problem is listed in the FAQ, which is reachable through www.isc.org.
[ On Friday, February 2, 2001 at 12:48:50 (+0100), Pim van Riezen wrote: ]
Subject: Re: [NANOG] Re: Reasons why BIND isn't being upgraded
[snip serving borken zones is bad]
I am near to agreeing with you if it were about not picking up a zone-change when the zonefile has turned bogus. However, the effect of a zone no longer being authoritative on the primary is not really what I'd define as fun either :).
Well, strictly speaking, not dropping the zone when any error is encountered during its load is contrary to the requirements of RFC 1035 (section 5.2, which gives very much the reasons I did, but without mentioning zone transfers explicitly since of course any errant record, or missing record, can be propagated for its TTL or negative TTL).

It might not be fun to have your primary be lame for one or zillions of zones (even if it's an unadvertised primary), but it's not dangerous (at least not unless you're already violating dozens of other DNS requirements). The "non-fun" should merely be incentive to get you to correct your procedures and process so that future errors are caught before they're loaded. :-)

--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
participants (7)
- Derek J. Balling
- J Bacher
- mdevney@teamsphere.com
- Pete Ehlke
- Pim van Riezen
- Simon Waters
- woods@weird.com