Exploits start against flaw that could hamstring huge swaths of Internet | Ars Technica
Everyone got BIND updated? http://arstechnica.com/security/2015/08/exploits-start-against-flaw-that-cou... -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
On Tue, Aug 4, 2015 at 10:17 AM, Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote:
On Tue, Aug 04, 2015 at 10:03:33AM -0400, Jay Ashworth <jra@baylink.com> wrote a message of 6 lines which said:
Everyone got BIND updated?
For instance by replacing it with NSD or Unbound?
always great to jump ship from one platform to another ... under stress and without knowing scaling, management, etc properties. Also, it's not like the alternatives have clean shorts when it comes to code mistakes, right?
On Tue, Aug 04, 2015 at 10:03:33AM -0400, Jay Ashworth <jra@baylink.com> wrote a message of 6 lines which said:
Everyone got BIND updated?
For instance by replacing it with NSD or Unbound?
Or doing something better like not just replacing one evil with another, and instead moving to a heterogeneous environment where possible. ... JG -- Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net "We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN) With 24 million small businesses in the US alone, that's way too many apples.
So, you guys recommend replacing BIND with another option?
On Tue, Aug 4, 2015 at 11:06 AM, Leonardo Oliveira Ortiz <leonardo.ortiz@marisolsa.com> wrote:
So, you guys recommend replace Bind for another option ?
The humorous thing is that the security researcher who showed the recent bind9 error (note: it isn't a vulnerability or a hack, it's just a way to remotely crash named), well, he criticized bind9 for doing more than simple basic name services. So, it's very easy to find bind9 alternatives if you are only looking for basic minimal DNS functionality. But once you start looking for features... well there aren't many options. -Jim P.
On Tue, 04 Aug 2015 15:06:36 -0000, Leonardo Oliveira Ortiz said:
So, you guys recommend replace Bind for another option ?
The *good* recommendation is to get some onboard security clue, and learn procedures to mitigate the inevitable exploits against flaws in infrastructure software.
So, you guys recommend replace Bind for another option ?
No. Replacing one occasionally faulty product with another occasionally faulty product is foolish. There's no particular reason to think that another product will be impervious to code bugs. What I was suggesting was to use several different devices, much as some networks prefer to buy some Cisco gear and some Juniper gear and make them redundant, or as a well-built ZFS storage array consists of drives from different manufacturers.

Heterogeneous environments tend to be more resilient because they are less likely to all suffer the same defect at once. Problems still result in some pain and trouble, but it usually doesn't result in a service outage.

This doesn't seem like a horribly catastrophic bug in any case. Anyone who is reliant on a critical bit like a DNS server probably has it set up to automatically restart if it doesn't exit cleanly. If you don't, you should!

So if it matters to you, I suggest that you instead use a combination of different products, and you'll be more resilient. If you have two recursers for your customers, one can be BIND and one can be Unbound. And when some critical vuln comes along and knocks out Unbound, you'll still be resolving names. Ditto BIND. You're not likely to see both happen at the same time.

However, at least here, we actually *use* TSIG updates, and other functionality that'd be hard to replace (BIND9 is pretty much THE only option for some functionality).

... JG
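A minimal sketch of the two-recurser idea above, assuming a client that simply tries each recurser in turn. The host names, the `ResolverDown` exception, and the `fake_query` stand-in are all hypothetical; no real DNS traffic is sent, so the failover logic can be read in isolation.

```python
from typing import Callable, Optional, Sequence


class ResolverDown(Exception):
    """Raised by a query function when a resolver does not answer."""


def resolve_with_failover(
    name: str,
    resolvers: Sequence[str],
    query: Callable[[str, str], str],
) -> Optional[str]:
    """Try each resolver in order; return the first answer, or None."""
    for resolver in resolvers:
        try:
            return query(resolver, name)
        except ResolverDown:
            continue  # this implementation is down; try the next one
    return None


# Hypothetical stand-in for a real DNS query, for illustration only.
def fake_query(resolver: str, name: str) -> str:
    if resolver == "bind.example.net":
        raise ResolverDown(resolver)  # pretend a bug crashed this one
    return "192.0.2.10"


answer = resolve_with_failover(
    "www.example.com",
    ["bind.example.net", "unbound.example.net"],
    fake_query,
)
print(answer)
```

With one implementation knocked out, the second still answers; only when every entry in the list fails does the caller get `None`.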
With the (large) caveat that heterogenous networks are more subject to human error in many cases.
On Tue, Aug 4, 2015 at 11:29 AM, Scott Helms <khelms@zcorum.com> wrote:
With the (large) caveat that heterogenous networks are more subject to human error in many cases.
<cough>automate!</cough>
Automation just means your mistake goes many more places more quickly.
On Tue, Aug 4, 2015 at 11:46 AM, Scott Helms <khelms@zcorum.com> wrote:
Automation just means your mistake goes many more places more quickly.
and letting people keep poking at things that computers should be doing is... much worse. people do not have reliability and repeat-ability over time. If you fear 'many more places' problems, improve your testing.
I don't disagree, but automation usually protects against typing errors; it doesn't protect against incorrect configurations. Using multiple vendors or server software means that your people have to know all of the systems. There are many cases where, for example, a Cisco-like CLI will make a network engineer think that a command works exactly the same way on another vendor's system when in fact the under-the-hood implementation is very different. It's not always feasible to have people with the needed skill levels, and automation does not help with that at all.
On 4 Aug 2015, at 23:21, Christopher Morrow wrote:
and letting people keep poking at things that computers should be doing is... much worse. people do not have reliability and repeat-ability over time.
I've personally never come across an accidental route hijack (among those whose actual details I learned) that wasn't the result of someone manually typing at the enable prompt. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>
hi ya
On Tue, Aug 4, 2015 at 11:29 AM, Scott Helms <khelms@zcorum.com> wrote:
With the (large) caveat that heterogenous networks are more subject to human error in many cases.
<cough>automate!</cough>
...
On 08/04/15 at 12:21pm, Christopher Morrow wrote:
On Tue, Aug 4, 2015 at 11:46 AM, Scott Helms <khelms@zcorum.com> wrote:
Automation just means your mistake goes many more places more quickly.
and letting people keep poking at things that computers should be doing is... much worse. people do not have reliability and repeat-ability over time.
ditto ... computers are experts at listening and repeatedly doing what they're told to do ..
If you fear 'many more places' problems, improve your testing.
i prefer automation .. even if it's wrong, you can look at the script and see what bad things it did, and you should know what to do to fix the problem and fix the script to prevent it from spreading that mistake again

<person's standard excuse>
if you ask a person(s), what did you do to create this mess, "duh... i donno"
btw, it's my kid's birthday, i needed to be home an hr ago with the cake :-)
hummm... :-)
</standard>

-----
<fwiw> for automation to work:
- folks updating the scripts should be required to know all platforms being used and how they differ from each other
- folks testing the scripts/update processes/procedures should be paid bonuses, even free pizza/beer, for finding bugs before release to your internal world of automated machines
- always have 3 co-development boxes for the script development and to back each other up
- always have 2 or more test bed boxes for initial releases of new scripts, where those boxes can also be downgraded back to the previous release from before the new patches were applied
- if nothing went wrong, there should be minimal issue with releasing a patch where it doesn't propagate "problems automatically to everywhere"
  the trick is "how good are the eyes/brains" that are looking for potential problems in the new releases/patches/updates/etc
- i also say always let clients pull down patches vs pushing them to systems that seem unresponsive, to avoid having to wait for dead boxes
-----
all apps, not just bind, have occasional problems .. changing to something else doesn't necessarily solve the original "bug" problem

pixie dust
alvin
# ddos-mitigator.net
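One way to read the test-bed and downgrade points above is as a staged rollout: patch a small group first, check it, and roll everything back if the check fails. The sketch below assumes that reading. The host names, the callback signatures, and the health check are hypothetical and stand in for whatever a real deployment tool provides.

```python
from typing import Callable, Dict, List


def staged_rollout(
    stages: List[List[str]],
    apply_patch: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Patch one stage at a time; stop and roll back on the first failure."""
    patched: List[str] = []
    for stage in stages:
        for host in stage:
            apply_patch(host)
            patched.append(host)
        if not all(is_healthy(host) for host in stage):
            # Undo every host touched so far, newest first.
            for host in reversed(patched):
                rollback(host)
            return False
    return True


# Hypothetical in-memory fleet: two test-bed boxes, then production.
versions: Dict[str, str] = {h: "old" for h in ["test1", "test2", "prod1", "prod2"]}
touched: List[str] = []


def apply_patch(host: str) -> None:
    versions[host] = "new"
    touched.append(host)


def rollback(host: str) -> None:
    versions[host] = "old"


def is_healthy(host: str) -> bool:
    return host != "test2"  # pretend the new release breaks test2


ok = staged_rollout(
    [["test1", "test2"], ["prod1", "prod2"]],
    apply_patch,
    is_healthy,
    rollback,
)
print(ok, touched, versions)
```

In this run the bad release is caught on the test-bed stage, so the production hosts are never touched and the test boxes end up back on the previous release.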
Wow this thread went off-track in nanoseconds. So which bind versions are ok? -b
On 4 Aug 2015, at 15:54, Barry Shein wrote:
Wow this thread went off-track in nanoseconds.
So which bind versions are ok?
9.10.2-P3 is marked "current stable", and 9.9.7-P2 is marked "current-stable ESV" at:

https://www.isc.org/downloads/

The bind-users list is probably a place where this kind of thread would at least go off-track in a different set of ways:

https://lists.isc.org/mailman/listinfo/bind-users

Joe
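A rough sketch of checking a version string against the two releases listed above, assuming they are the minimum acceptable release on their respective branches. The function names are hypothetical, and only plain `X.Y.Z` or `X.Y.Z-Pn` strings are understood. Anything else, including distribution builds that backport fixes under an older version number, is reported as unknown rather than guessed at.

```python
import re
from typing import Optional, Tuple

# Baselines taken from the message above, keyed by (major, minor) branch.
LISTED = {
    (9, 9): (9, 9, 7, 2),    # 9.9.7-P2
    (9, 10): (9, 10, 2, 3),  # 9.10.2-P3
}

_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-P(\d+))?$")


def parse(version: str) -> Optional[Tuple[int, int, int, int]]:
    """Parse 'X.Y.Z' or 'X.Y.Z-Pn' into a tuple; None if not understood."""
    match = _PATTERN.match(version)
    if match is None:
        return None
    major, minor, patch, plevel = match.groups()
    return (int(major), int(minor), int(patch), int(plevel or 0))


def at_least_listed(version: str) -> Optional[bool]:
    """True/False against the branch baseline; None when we cannot tell."""
    parsed = parse(version)
    if parsed is None:
        return None  # e.g. vendor builds: consult the vendor's advisory
    baseline = LISTED.get(parsed[:2])
    if baseline is None:
        return None  # branch not covered by the message above
    return parsed >= baseline


for candidate in ["9.10.2-P3", "9.10.2-P2", "9.9.7", "9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.2"]:
    print(candidate, at_least_listed(candidate))
```

Returning `None` instead of `False` for unrecognised strings is deliberate: a packaged build may carry the fix even though its upstream version number looks old.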
Automation just means your mistake goes many more places more quickly. and letting people keep poking at things that computers should be doing is... much worse. people do not have reliability and repeat-ability over time.
i love the devops movement; operators discover that those computers can be programmed. wowzers! maybe in a decade or two, we will discover mathematics. nah. randy
On Tue, Aug 4, 2015 at 4:53 PM, Randy Bush <randy@psg.com> wrote:
i love the devops movement; operators discover that those computers can be programmed. wowzers!
Maybe we can give them a new title. I'm thinking, "System Programmer."
Guys, Red Hat has a release with the patch in the CR repository. Should we update using the RPM from CR or using the source provided by ISC? The release in CR is: 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.2
----- Original Message -----
From: "Scott Helms" <khelms@zcorum.com>
On Aug 4, 2015 9:38 AM, "Christopher Morrow" <morrowc.lists@gmail.com> wrote:
On Tue, Aug 4, 2015 at 11:29 AM, Scott Helms <khelms@zcorum.com> wrote:
With the (large) caveat that heterogenous networks are more subject to human error in many cases.
<cough>automate!</cough>
Automation just means your mistake goes many more places more quickly.
Not necessarily. The sort of failure you're talking about, Scott, is "user did the wrong thing", and sure, automation makes it easier for that to spread. Chris was, though, I think, suggesting automating around "user tries to do the right thing on disjoint devices, and fails *because they're disjoint*"; that is, clearly, a problem automation can help with. Cheers, -- jra -- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://www.bcp38.info 2000 Land Rover DII St Petersburg FL USA BCP38: Ask For It By Name! +1 727 647 1274
With the (large) caveat that heterogenous networks are more subject to human error in many cases.
Indeed. Everything comes with tradeoffs. More intimate familiarity with the product and a uniformity of deployment strategy has made it more practical here to stick with BIND; an update is a simple matter of a tarball and running a script that manages the dirty work.

However, the original point was that switching from BIND to Unbound or other options is silly, because you're just trading one codebase for another, and they all have bugs.

However, collectively, two different products cooperatively providing a service are likely to have a higher uptime in a well-designed environment.

... JG
On 4 August 2015 at 18:48, Joe Greco <jgreco@ns.sol.net> wrote:
However, the original point was that switching from BIND to Unbound or other options is silly, because you're just trading one codebase for another, and they all have bugs.
It is equally silly to assume that all codebases are of the same quality and have equally many bugs. Maybe we should be looking at the track record of those two products and maybe we should let someone do a code review. And then choose based on that. Regards, Baldur
On Tue, Aug 4, 2015 at 12:51 PM, Baldur Norddahl <baldur.norddahl@gmail.com> wrote:
On 4 August 2015 at 18:48, Joe Greco <jgreco@ns.sol.net> wrote:
However, the original point was that switching from BIND to Unbound or other options is silly, because you're just trading one codebase for another, and they all have bugs.
It is equally silly to assume that all codebase are the same quality and have equally many bugs. Maybe we should be looking at the track record of those two products and maybe we should let someone do a code review. And then choose based on that.
because:
1) historical results matter here? (who looked at which products over what period of time, with what attention to detail(s) and which sets of goals?)
2) the single person doing a code review is likely to see all of the problems in each of the products selected?

nothing against any of the software in question here, but really this is all quite a crapshoot and past transgression research doesn't make for a great tool to plan for the future.

Joe's right: "all software has bugs, find the software and strategy that makes sense for your organization"

that MIGHT mean 2 platforms (seems sensible to me!) and it might mean automation for management of configs (from an abstraction so you can generate the right data to each target implementation) or it might mean more monkeys on keyboards if you don't believe in automation.

-chris
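The "generate the right data to each target implementation" idea can be sketched as one abstract policy rendered into two config dialects. This is an illustration only: the `policy` layout and function names are hypothetical, the addresses are documentation prefixes, and the output is a fragment rather than a complete working configuration for either server.

```python
from typing import Dict, List

# Hypothetical abstract policy: where to listen, who may recurse (IPv4 only).
policy: Dict[str, List[str]] = {
    "listen": ["192.0.2.53"],
    "allow": ["192.0.2.0/24", "198.51.100.0/24"],
}


def render_bind(policy: Dict[str, List[str]]) -> str:
    """Render the policy as a BIND 9 named.conf options fragment."""
    listen = " ".join(f"{addr};" for addr in policy["listen"])
    allow = " ".join(f"{net};" for net in policy["allow"])
    return (
        "options {\n"
        f"    listen-on {{ {listen} }};\n"
        "    recursion yes;\n"
        f"    allow-recursion {{ {allow} }};\n"
        "};\n"
    )


def render_unbound(policy: Dict[str, List[str]]) -> str:
    """Render the same policy as an unbound.conf server fragment."""
    lines = ["server:"]
    lines += [f"    interface: {addr}" for addr in policy["listen"]]
    lines += [f"    access-control: {net} allow" for net in policy["allow"]]
    return "\n".join(lines) + "\n"


bind_conf = render_bind(policy)
unbound_conf = render_unbound(policy)
print(bind_conf)
print(unbound_conf)
```

The operator edits only `policy`; the per-implementation syntax lives in one renderer each, which is where knowledge of how the platforms differ has to be encoded and tested.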
On 04/08/2015 at 19:18, "Christopher Morrow" <morrowc.lists@gmail.com> wrote:
because: 1) historical results matter here? (who looked at which products over what period of time, with what attention to detail(s) and which sets of goals?) 2) the single person doing a code review is likely to see all of the problems in each of the products selected?
Maybe not, but a code review can tell what methods are used to safeguard against security bugs, the general quality of the code, the level of automated testing, etc. History can give hints to the same. If it had a lot of bugs discovered, it is likely not of good quality from a security perspective, and more bugs can be expected. It is called due diligence. The aim is not to find the bugs but to evaluate the product. Regards Baldur
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside. I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF Jared Mauch
In message <9C2ACA5A-755D-4FCF-8491-745A1F9111BA@puck.nether.net>, Jared Mauch writes:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF
Jared Mauch
Unless the same client hits the same server all the time this is a bad idea. Resolvers actually track capabilities of servers, as it is the only way to get answers given firewalls that drop legitimate packets and protocol misimplementations. Add to that different vendors / versions supporting different extensions, and randomly flipping between vendors / versions is fraught with danger unless you take extreme care.
On Aug 4, 2015, at 10:03 AM, Jay Ashworth <jra@baylink.com> wrote:
Everyone got BIND updated?
http://arstechnica.com/security/2015/08/exploits-start-against-flaw-that-could-hamstring-huge-swaths-of-internet/
-- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
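The "same client hits the same server" condition can be illustrated by choosing the backend from a hash of the client address, so the choice is repeatable rather than random. This is a generic sketch, not a description of how DNSDIST or any other balancer is implemented; the backend names and the `pick_backend` function are hypothetical.

```python
import hashlib
import ipaddress
from typing import Sequence


def pick_backend(client_ip: str, backends: Sequence[str]) -> str:
    """Map a client address to one backend, the same one on every call."""
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical pool mixing implementations behind one service address.
backends = ["bind-a.example.net", "nsd-b.example.net"]

first = pick_backend("203.0.113.7", backends)
second = pick_backend("203.0.113.7", backends)
print(first, second)
```

Two caveats apply to this simple modulo scheme: changing the length of the backend list remaps many clients at once, and every client pinned to a given backend shares that backend's failures.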
On Tue, Aug 4, 2015 at 9:39 AM, Mark Andrews <marka@isc.org> wrote:
In message <9C2ACA5A-755D-4FCF-8491-745A1F9111BA@puck.nether.net>, Jared Mauch writes:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF
Unless the same client hits the same server all the time this is a bad idea.
But tying a set of clients to the same backend puts them all in the same failure domain....
Resolvers actually track capabilities of servers as it is the only way to get answers due to firewalls dropping legitimate packet and protocol misimplementations. Add to that different vendors / versions supporting different extensions randomly flipping between vendors / versions is frought with danger unless you take extreme care.
Out of curiosity, do any resolvers other than BIND do this? I ask because BIND has a reputation for having "too many" features, and I wonder if this is one of them. Damian
On Wed, Aug 05, 2015 at 02:39:18AM +1000, Mark Andrews wrote:
In message <9C2ACA5A-755D-4FCF-8491-745A1F9111BA@puck.nether.net>, Jared Mauch writes:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF
Jared Mauch
Unless the same client hits the same server all the time this is a bad idea.
Software that can't handle the remote side having an upgrade/downgrade/capability change is broken.
Resolvers actually track capabilities of servers as it is the only way to get answers due to firewalls dropping legitimate packet and protocol misimplementations. Add to that different vendors / versions supporting different extensions randomly flipping between vendors / versions is frought with danger unless you take extreme care.
I've come to use DNSDist to work around the problems that BIND has with outstanding queries which don't get a response. You might be surprised how poorly BIND performs if you use something else to take a look at it from the exterior. http://puck.nether.net/~jared/dnsdist.png The first two are BIND, the 3rd is not, and the 4th is BIND. The last 3 get the same types of queries; notice how BIND drops lots of queries. I don't have time to report all the DNS related issues on bind-users/dev but you may find it helpful to use a tool like this to at least identify what is going on. The last 3 servers get only domains like arpa and a few well known domains, eg: gmail. - Jared
-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
Hi Jared, On 4 Aug 2015, at 12:00, Jared Mauch wrote:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF
As someone who once hosted TLD zones in a way that a query to a particular nameserver could be answered by either NSD or BIND9, my advice would be "don't do that". You're setting yourself up for troubleshooting hell. You can include different nameservers in the set for a single zone. Using different software for different nameservers can be sensible. Using different software for the same nameserver can be a nightmare. Joe
On Tue, Aug 04, 2015 at 01:48:56PM -0400, Joe Abley wrote:
Hi Jared,
On 4 Aug 2015, at 12:00, Jared Mauch wrote:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
I can send an example config out later for people. You can balance to bind NSD and others all at the same time :-) just move your SPoF
As someone who once hosted TLD zones in a way that a query to a particular nameserver could be answered by either NSD or BIND9, my advice would be "don't do that". You're setting yourself up for troubleshooting hell.
I'm not suggesting you have an unpredictable set of things you route queries to. I have a very simple config I'll share with you off-list. One should route things in a predictable manner. This is why people want operators who can code and operate a service vs just operate it, or just code. Those are the people in the highest demand in my narrow experience.
You can include different nameservers in the set for a single zone. Using different software for different nameservers can be sensible. Using different software for the same nameserver can be a nightmare.
Proper logging and instrumentation are essential. DNSDIST can be configured to fail over to something else while one server or daemon is offline and being serviced or restarted. This can also be done with other tools like "stupid routing tricks", aka anycast. For a resolver that I want to "just work" for servers that need to do e-mail etc., this works well for me. The fact that I can have it point to a BIND process on localhost on a different port, or nsd, etc. provides flexibility that others don't do as easily. - Jared -- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
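As a rough illustration of the failover behaviour described above (not dnsdist's actual implementation), here is a minimal Python sketch: backends are tried in a fixed order, and one that has failed several consecutive health checks is skipped until it answers again. The Backend class, the function name, and the three-failure threshold are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    name: str               # hypothetical label, e.g. a daemon on localhost
    max_failures: int = 3   # assumed threshold, not a dnsdist default
    failures: int = 0       # consecutive failed health checks so far

    def record_check(self, ok: bool) -> None:
        """Reset the counter on success, bump it on failure."""
        self.failures = 0 if ok else self.failures + 1

    @property
    def up(self) -> bool:
        """A backend counts as up until it hits the failure threshold."""
        return self.failures < self.max_failures


def first_available(backends: List[Backend]) -> Optional[Backend]:
    """Return the first backend, in list order, that is currently up."""
    for backend in backends:
        if backend.up:
            return backend
    return None  # everything is down; the caller decides what to do
```

Because the order is fixed, the choice stays predictable: queries only move to the second daemon while the first is marked down, and move back once it passes a health check.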
As someone who once hosted TLD zones in a way that a query to a particular nameserver could be answered by either NSD or BIND9, my advice would be "don't do that". You're setting yourself up for troubleshooting hell.
for some folk, complexity is a career. i worked for circuitzilla for 15 months; it's embedded in their culture. randy
On Tue, Aug 04, 2015 at 12:00:32PM -0400, Jared Mauch wrote:
I recommend using DNSDIST to balance traffic at a protocol level as you can have implementation diversity on the backside.
Here's an example dnsdist config you might find helpful:

This sends queries to the first two servers unless they are for domains in the "nether" pool list. They go to other servers. You can restrict access based on the Acl.

newServer("x.x.223.10")
newServer("x.x.223.20")

;setServerPolicy(firstAvailable) -- first server within its QPS limit
setServerPolicy(leastOutstanding)

webserver("0.0.0.0:8083", "AskMe")

addACL("192.168.0.0/22")
addACL("10.0.0.0/16")
addACL("172.16.22.0/24")

setKey("AskMe")
controlSocket("127.0.0.1:1099")

newServer{address="129.250.35.250", pool="nether"}
newServer{address="129.250.35.251", pool="nether"}
newServer{address="8.8.8.8", pool="nether"}

addPoolRule({"ntt.net.", "nether.net."}, "nether")
addPoolRule({"arpa.", "google.", "gmail.com.", "google.com.", "googlemail.com."}, "nether")

-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
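To make the routing in that config concrete, the following Python sketch models the two decisions it describes: suffix rules pick a pool, and within the pool the server with the fewest outstanding queries wins. It is a toy model, not dnsdist code; the Backend class, the function names, and the 192.0.2.x addresses are placeholders invented for the example, and the empty string stands in for the default pool.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[Sequence[str], str]  # (domain suffixes, pool name)


@dataclass
class Backend:
    address: str
    pool: str = ""        # "" models the default pool (an assumption)
    outstanding: int = 0  # queries sent but not yet answered


def pool_for(qname: str, rules: List[Rule]) -> str:
    """Return the pool of the first rule whose suffix matches qname."""
    name = qname.lower().rstrip(".") + "."  # normalise to a trailing dot
    for suffixes, pool in rules:
        for suffix in suffixes:
            s = suffix.lower()
            # Match on whole labels only: "notgmail.com." must not hit "gmail.com."
            if name == s or name.endswith("." + s):
                return pool
    return ""


def pick_backend(qname: str, backends: List[Backend],
                 rules: List[Rule]) -> Optional[Backend]:
    """Choose the least-loaded backend in the pool selected for qname."""
    pool = pool_for(qname, rules)
    candidates = [b for b in backends if b.pool == pool]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.outstanding)


# Rules copied from the config above.
RULES: List[Rule] = [
    (["ntt.net.", "nether.net."], "nether"),
    (["arpa.", "google.", "gmail.com.", "google.com.", "googlemail.com."], "nether"),
]
```

The pool decision depends only on the query name, so a given name always lands in the same pool; only the choice of server inside that pool varies with load.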
participants (19)
- alvin nanog
- Baldur Norddahl
- Barry Shein
- Christopher Morrow
- Damian Menscher
- Jared Mauch
- Jared Mauch
- Jay Ashworth
- Jim Popovitch
- Joe Abley
- Joe Greco
- Joel Maslak
- Leonardo Oliveira Ortiz
- Mark Andrews
- Randy Bush
- Roland Dobbins
- Scott Helms
- Stephane Bortzmeyer
- Valdis.Kletnieks@vt.edu