DNS problems to RoadRunner - tcp vs udp
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.

I have seen intermittent problems on some client Windows servers sending to rr.com recently. For example, the MX hosts for triad.rr.com are:

# dig -t mx triad.rr.com
;; QUESTION SECTION:
;triad.rr.com.                 IN      MX
;; ANSWER SECTION:
triad.rr.com.  1609  IN  MX  10 hrndva-smtpin01.mail.rr.com.
triad.rr.com.  1609  IN  MX  20 hrndva-smtpin02.mail.rr.com.

The authoritative nameservers for mail.rr.com:

# dig -t ns mail.rr.com
;; QUESTION SECTION:
;mail.rr.com.                  IN      NS
;; ANSWER SECTION:
mail.rr.com.  14204  IN  NS  cdptpa-admin02.mail.rr.com.
mail.rr.com.  14204  IN  NS  hrndva-admin01.mail.rr.com.
mail.rr.com.  14204  IN  NS  hrndva-admin02.mail.rr.com.
mail.rr.com.  14204  IN  NS  cdptpa-admin01.mail.rr.com.

All 4 of those servers will answer a UDP DNS query for the host record hrndva-smtpin01.mail.rr.com. However, the hrndva-admin01.mail.rr.com and hrndva-admin02.mail.rr.com servers do not respond to TCP queries at all. Example:

# dig hrndva-smtpin01.mail.rr.com @hrndva-admin01.mail.rr.com +tcp
; <<>> DiG 9.3.3rc2 <<>> hrndva-smtpin01.mail.rr.com @hrndva-admin01.mail.rr.com +tcp
; (1 server found)
;; global options: printcmd
;; connection timed out; no servers could be reached

Any clues would be appreciated.

Mark
--
Mark Price
Tranquil Hosting
www.tqhosting.com
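Not part of the original thread: a minimal Python sketch (standard library only; function and variable names are my own) of the wire-format query that dig builds for a lookup like the ones above. The point is that the message bytes are identical for UDP and TCP; only the transport differs, so a server that answers one but not the other has a transport or filtering problem, not a protocol one.

```python
import struct

def encode_qname(name):
    """Encode a domain name as DNS wire-format labels (RFC 1035, s3.1)."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += struct.pack("B", len(label)) + label.encode("ascii")
    return out + b"\x00"  # zero-length root label terminates the name

def build_query(name, qtype, qid=0x1234):
    """Build a one-question DNS query: 12-byte header, then the question."""
    # flags 0x0100 = RD (recursion desired); counts: 1 question, 0 answers
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    question = encode_qname(name) + struct.pack("!HH", qtype, 1)  # class IN
    return header + question

# The same bytes go out over UDP or (with a 2-byte length prefix) TCP.
query = build_query("triad.rr.com", 15)  # qtype 15 = MX
```

The sketch stops before any network I/O; a real resolver would send `query` in a UDP datagram to port 53, or over a TCP connection with the length prefix discussed later in the thread.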
Mark Price wrote: <SNIP>
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries. TCP is used for zone transfers.

If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.

Jon Kibler
--
Jon R. Kibler
Chief Technical Officer
Advanced Systems Engineering Technology, Inc.
Charleston, SC USA
o: 843-849-8214 c: 843-224-2494 s: 843-564-4224
My PGP Fingerprint is: BAA2 1F2C 5543 5D25 4636 A392 515C 5045 CF39 4253

==================================================
Filtered by: TRUSTEM.COM's Email Filtering Service
http://www.trustem.com/
No Spam. No Viruses. Just Good Clean Email.
On Fri, 13 Jun 2008 14:14:55 EDT, Jon Kibler said:
UDP is used for queries.
TCP is used for zone transfers.
It's also sometimes used if a reply doesn't fit in the 512 bytes for a UDP answer and EDNS0 isn't in effect. You get a truncated UDP packet back and re-ask the query over TCP.
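The retry-over-TCP behavior described here hinges on the TC (truncated) bit in the response header. A hedged Python sketch (names are mine, not from any particular resolver library) of how a client might test for it:

```python
import struct

TC_BIT = 0x0200  # "truncated" flag in the DNS header flags word (RFC 1035, s4.1.1)

def is_truncated(response):
    """True if a DNS response's TC bit is set, i.e. the full answer did
    not fit in the UDP reply and the query should be retried over TCP."""
    if len(response) < 12:
        raise ValueError("short DNS message")
    (flags,) = struct.unpack("!H", response[2:4])  # flags are bytes 2-3
    return bool(flags & TC_BIT)
```

A resolver loop would call this on each UDP answer and, when it returns True, repeat the identical query over a TCP connection to the same server.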
Sorry to abuse the list, but aset.com seems to have some mail blocking issues:

<Jon.Kibler@aset.com> (reason: 551 5.7.1 Message undeliverable. Please see: http://bounce.trustem.net/edu.php?id=m5DIJA6U012003.0.1... not accept email from DHCP connections with an academic institution supplied hostname (you.VT.EDU).)

Jon? It's not a DHCP connection. It's a dedicated fixed IP address that hasn't moved in at least 6 years, it's in the DNS as such, and I'm curious what you based the "it's a DHCP" on?

Incidentally, you may want to clean up your message, as the URL http://bounce.trustem.net/edu.php?id=m5DIJA6U012003.0.1... is unclear as to whether it has 0, 1, 2, or 3 trailing dots, and whether any trailing dots are an ellipsis rather than a part of the URL. Actually visiting that with all of 0..3 dots *all* fails with an "Oops link not found" screen, so it isn't like I can sort it out based on info your message provides.
Jon Kibler wrote:
UDP is used for queries.
TCP is used for zone transfers.
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
That is a common, but incorrect, assumption. DNS responses that are larger than will fit in a single UDP packet are sent over TCP. Back in the day (c. 1998) Microsoft had some in-addr.arpa zones in which they felt it necessary to create hundreds of PTR records per entry. Of course, they denied TCP to their nameservers. The end result was that our BIND8 server was crashing on the lookups (it was a crappy port to NT).
Jon Kibler wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Mark Price wrote: <SNIP>
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries.
TCP is used for zone transfers.
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
I see long TXT records from some DNSBLs that won't fit in a UDP packet on a daily basis. Certainly nothing to be concerned about. ~Seth
Date: Fri, 13 Jun 2008 14:14:55 -0400 From: Jon Kibler <Jon.Kibler@aset.com>
Mark Price wrote: <SNIP>
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries.
Sometimes.
TCP is used for zone transfers.
Yes.
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
If it does not, you should be very concerned. The RFCs (several, but I'll point first to good old 1122) allow either TCP or UDP to be used for any operation that will fit in a 512-byte transfer. (EDNS0 allows larger UDP.)

TCP is to be used any time the truncated bit is set in a reply. If you ever send a large reply that won't fit in 512 bytes, the request will be repeated using a TCP connection. If you ignore these, your DNS is broken. It is even allowed under the spec to start out with TCP, as AXFR queries typically do.

Yes, I realize that this is fairly common and it does not break much, but, should DNSSEC catch on, you might just find the breakage a bit worse than it is today, and there is no reason to have even the slight breakage that is there now.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net   Phone: +1 510 486-8634
Key fingerprint: 059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
Kevin Oberman wrote:
If it does not, you should be very concerned. The RFCs (several, but I'll point first to good old 1122) allow either TCP or UDP to be used for any operation that will fit in a 512 byte transfer. (EDNS0 allows larger UDP.)
TCP is to be used any time the truncated bit is set in a reply. If you ever send a large reply that won't fit in 512 bytes, the request will be repeated using a TCP connection. If you ignore these, your DNS is broken. It is even allowed under the spec to start out with TCP, as AXFR queries typically do.
Yes, I realize that this is fairly common and it does not break much, but, should DNSSEC catch on, you might just find the breakage a bit worse than it is today and there is no reason to have even the slight breakage that is there now.
Okay, I stand corrected. I was approaching this from a security perspective only, and apparently based on incorrect information. But this leaves me with a couple of questions:

Various hardening documents for Cisco routers specify that the best practice is to only allow 53/tcp connections to/from secondary name servers. Plus, from all I can tell, Cisco's 'ip inspect dns' CBAC appears to only handle UDP data connections, and anything TCP would be denied. From what you are saying, the hardening recommendations are wrong and CBAC may break some DNS responses. Is this correct?

Also, other than "That's what the RFCs call for," why use TCP for data exchange instead of larger UDP packets?

Jon Kibler
First: if you don't allow TCP queries, then you're going to break lots of recent applications for DNS.

Second: unless your server and resolver support EDNS0, there is no way to increase the size of a UDP response, and even then, it's not large enough for many applications (ENUM, TXT, APL, etc.). TCP responses to queries have been specified since RFC 1035. The maximum message size is limited to 65535 bytes (due to the 16-bit message size field before the header).

Re the Cisco questions: this would not be the first time Cisco lagged in supporting enhanced services on the network.
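The 16-bit length field mentioned here is what caps DNS-over-TCP messages at 65535 bytes. A small Python sketch (function names are my own) of that framing, as described in RFC 1035, s4.2.2:

```python
import struct

MAX_TCP_MESSAGE = 0xFFFF  # 65535: the length prefix is only 16 bits wide

def tcp_frame(msg):
    """Prefix a DNS message with its 16-bit big-endian length, as
    required when carrying DNS over a TCP stream."""
    if len(msg) > MAX_TCP_MESSAGE:
        raise ValueError("message too large for DNS-over-TCP framing")
    return struct.pack("!H", len(msg)) + msg

def tcp_unframe(stream):
    """Pull one length-prefixed DNS message out of a raw TCP byte stream."""
    (length,) = struct.unpack("!H", stream[:2])
    return stream[2:2 + length]
```

The prefix exists because TCP is a byte stream with no message boundaries of its own; UDP needs no such framing since each datagram is one message.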
Jon Kibler wrote:
Various hardening documents for Cisco routers specify the best practices are to only allow 53/tcp connections to/from secondary name servers. Plus, from all I can tell, Cisco's 'ip inspect dns' CBAC appears to only handle UDP data connections and anything TCP would be denied. From what you are saying, the hardening recommendations are wrong and that CBAC may break some DNS responses. Is this correct?
A number of Cisco defaults from years gone by would break DNS today in its current form, such as how PIXes and ASAs with fixup/DPI would block udp/53 packets larger than 512 bytes, not permitting EDNS0 packets through.
Also, other than "That's what the RFCs call for," why use TCP for data exchange instead of larger UDP packets?
Justin Shore wrote:
Jon Kibler wrote:
Various hardening documents for Cisco routers specify the best practices are to only allow 53/tcp connections to/from secondary name servers. Plus, from all I can tell, Cisco's 'ip inspect dns' CBAC appears to only handle UDP data connections and anything TCP would be denied. From what you are saying, the hardening recommendations are wrong and that CBAC may break some DNS responses. Is this correct?
A number of Cisco defaults from years gone by would break DNS today in its current form, such as how PIXes and ASAs with fixup/DPI would block udp/53 packets larger than 512 bytes, not permitting EDNS0 packets through.
Thunderbird apparently thought that I was ready to send my message before I did. I was going to add some ASA config as an example:

policy-map type inspect dns migrated_dns_map_1
 parameters
  message-length maximum 2048

I don't have an IOS CBAC example but there's surely something similar.

Justin
Jon Kibler <Jon.Kibler@aset.com> writes:
Okay, I stand corrected. I was approaching this from a security perspective only, and apparently based on incorrect information.
It always puzzles me when people say things like that - it's as if they've lost sight of the *whole point* of security being to protect services, data, etc. and ensure that they continue uninterrupted and uncorrupted. If one is approaching problems "from a security perspective only" with no concern to breaking services... wouldn't the even more secure solution involve just reaching for the power switch?
But this leaves me with a couple of questions:
Various hardening documents for Cisco routers specify the best practices are to only allow 53/tcp connections to/from secondary name servers. Plus, from all I can tell, Cisco's 'ip inspect dns' CBAC appears to only handle UDP data connections and anything TCP would be denied. From what you are saying, the hardening recommendations are wrong and that CBAC may break some DNS responses. Is this correct?
I bet if you look in these same hardening documents you'll find suggestions for static bogon filters (which become stale over time), filtering all ICMP (breaks PMTUD) and all sorts of other jack moves.
Also, other than "That's what the RFCs call for," why use TCP for data exchange instead of larger UDP packets?
Well, the inheritance is from the "default IP maximum datagram size" of 576 bytes, RFC 879, which a host must observe absent specific knowledge that the far end can do better. In the vast majority of cases (assuming your resolver and caching nameserver won't puke) I suspect there would be no problem sending a somewhat-bigger-than-this-size DNS reply... right up till you go over 1500 bytes for your datagram, at which point you're back to square one. With cryptographically signed replies containing a lot of AAAA records and other data, bigger than 1500 bytes is certainly not outside the realm of possibility. Trying to fragment and reassemble DNS queries is a step away from goodness.

With TCP, you have a virtual stream service rather than a datagram service, and in exchange for the overhead of setting up and tearing down the connection, one gets the ability to do transactions that are much larger.

---Rob
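The size thresholds in this argument (512 bytes for classic UDP, the client's advertised EDNS0 buffer, TCP for everything else) can be captured in a tiny decision sketch. This is my own illustrative helper, not code from the thread:

```python
def pick_transport(answer_len, edns_bufsize=None):
    """Which transport can carry a DNS answer of answer_len bytes?
    Classic UDP tops out at 512 bytes; EDNS0 raises that to whatever
    buffer size the client advertised; anything bigger needs TCP."""
    limit = edns_bufsize if edns_bufsize is not None else 512
    if answer_len <= limit:
        # Per the caveat above: an EDNS0 datagram over ~1500 bytes will
        # still be IP-fragmented on a typical Ethernet path.
        return "udp"
    return "tcp"  # server sets the TC bit; client retries over TCP
```

Note this models only the size rule; a real server also has to honor clients that open with TCP directly, as AXFR does.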
Jon Kibler writes:
Also, other than "That's what the RFCs call for," why use TCP for data exchange instead of larger UDP packets?
TCP is more robust for large (>Path MTU) data transfers, and less prone to spoofing. A few months ago I sent a message to SwiNOG (like NANOG only less North American and more Swiss) about this topic, trying to explain some of the tradeoffs: http://www.mail-archive.com/swinog@lists.swinog.ch/msg02612.html Mostly I think that people "approaching this from a security perspective only" often forget that by fencing in the(ir idea of the) current status quo, they often prevent beneficial evolution of protocols as well, contributing to the Internet's "ossification". -- Simon.
Mostly I think that people "approaching this from a security perspective only" often forget that by fencing in the(ir idea of the) current status quo, they often prevent beneficial evolution of protocols as well, contributing to the Internet's "ossification".
folk do not always get the implications of the internet being a 'disruptive technology,' and that this is a good thing which needs to be preserved and even enhanced. they use skype and want to block ports. it's rampant. the old silliness of blocking tcp/53 is just one of the corner cases that keeps popping up publicly. try using this year's crop of innovative apps from behind some corporate firewall. packet/port xenophobia overrides the users' desire to be productive every time. it departments are paid to minimize cost and risk, not maximize workers' productivity. randy
On Fri, Jun 13, 2008 at 02:14:55PM -0400, Jon Kibler wrote:
Mark Price wrote: <SNIP>
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries.
TCP is used for zone transfers.
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
Red alert:

[cookiemonster:~] owens% dig +tcp aset.com @209.190.93.130 soa

; <<>> DiG 9.4.2 <<>> +tcp aset.com @209.190.93.130 soa
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5864
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;aset.com.                     IN      SOA

;; ANSWER SECTION:
aset.com.  14400  IN  SOA  ns1.sims.net. hostmaster.aset.com. 2006111001 10800 3600 3600000 86400

;; AUTHORITY SECTION:
aset.com.  14400  IN  NS  ns3.trustns.net.
aset.com.  14400  IN  NS  ns1.sims.net.
aset.com.  14400  IN  NS  ns1.trustns.net.
aset.com.  14400  IN  NS  ns2.sims.net.
aset.com.  14400  IN  NS  ns2.trustns.net.

;; ADDITIONAL SECTION:
ns1.sims.net.  86400  IN  A  209.190.93.130
ns2.sims.net.  86400  IN  A  209.190.93.132

;; Query time: 31 msec
;; SERVER: 209.190.93.130#53(209.190.93.130)
;; WHEN: Fri Jun 13 14:31:13 2008
;; MSG SIZE rcvd: 211

Bill.
Bill Owens wrote:
On Fri, Jun 13, 2008 at 02:14:55PM -0400, Jon Kibler wrote:
Mark Price wrote: <SNIP>
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries.
TCP is used for zone transfers.
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
Red alert:
[cookiemonster:~] owens% dig +tcp aset.com @209.190.93.130 soa
; <<>> DiG 9.4.2 <<>> +tcp aset.com @209.190.93.130 soa
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5864
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;aset.com.                     IN      SOA

;; ANSWER SECTION:
aset.com.  14400  IN  SOA  ns1.sims.net. hostmaster.aset.com. 2006111001 10800 3600 3600000 86400

;; AUTHORITY SECTION:
aset.com.  14400  IN  NS  ns3.trustns.net.
aset.com.  14400  IN  NS  ns1.sims.net.
aset.com.  14400  IN  NS  ns1.trustns.net.
aset.com.  14400  IN  NS  ns2.sims.net.
aset.com.  14400  IN  NS  ns2.trustns.net.

;; ADDITIONAL SECTION:
ns1.sims.net.  86400  IN  A  209.190.93.130
ns2.sims.net.  86400  IN  A  209.190.93.132

;; Query time: 31 msec
;; SERVER: 209.190.93.130#53(209.190.93.130)
;; WHEN: Fri Jun 13 14:31:13 2008
;; MSG SIZE rcvd: 211
UGH. Apparently the hosting provider messed with iptables on that system. Thanks for the heads up. (Open mouth, insert foot.)

Jon Kibler
On Fri, 13 Jun 2008 14:14:55 -0400 Jon Kibler <Jon.Kibler@aset.com> wrote:
TCP is used for zone transfers. If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
I wouldn't be, unless it looked like a DDoS - and it might for some that are seeing the results of a DNS-based DDoS mitigation device you or an upstream put in for the first time. These boxes force clients to switch over from UDP to TCP for queries when a well-formed UDP DNS attack hits. John
Not to toss flammables onto the pyre, BUT there is a large difference between what the RFCs allow and common practice. In our shop TCP is blocked to all but authoritative secondaries, as TCP is simply too easy to DoS a DNS server with. We simply don't need a few thousand drones clogging the TCP connection table all trying to do zone transfers (yes, it happened, and logs show drones are still trying).

For a long time there has been an effective practice of:

UDP == resolution requests
TCP == zone transfers

It would have been better if a separate port had been defined for zone transfers, as that would obviate the need for an application layer gateway to allow TCP transfers, so that zone transfers could be blocked and resolution requests allowed; for now, all TCP is blocked.

Now, just because someone has a bright idea, they drag out a 20-year-old RFC and say SEE, SEE, you must allow this because the RFC says so, all the while ignoring the 20 years of operational discipline since. That RFC was written when the internet was like the quad at college: everyone knew one another and we were all working towards a common goal of interoperability and open systems. These days the net is more like a seedy waterfront after midnight, where criminal gangs are waiting to ambush the unwary, and consequently networks need to be operated from that standpoint.

At the university networking level it is extremely difficult, as we need to maintain an open network as much as possible but protect our infrastructure services so that they have five nines of availability. Back in the day a few small hosts would serve DNS nicely, and we did not have people trying to take them down and/or infecting local hosts and attempting DHCP starvation attacks. And no, we are not at the five nines level, but we are working on it.

- Scott

Randy Bush wrote:
If my server responded to TCP queries from anyone other than a secondary server, I would be VERY concerned.
you may want to read the specs
randy
Scott McGrath wrote: [..]
For a long time there has been a effective practice of
UDP == resolution requests TCP == zone transfers
WRONG. TCP is there as a fallback when the answer to the question is too large. Zone transfers you can limit in your software. If you can't configure your DNS servers properly then don't run DNS. Also note that botnets have much more effective ways of taking you out.

And sometimes domains actually require TCP because there are too many records for a label, e.g. http://stupid.domain.name/node/651 If you are thus blocking TCP for DNS resolution you are suddenly blocking google, and thus for some people "The Internet".

Also see: http://homepages.tesco.net/J.deBoynePollard/FGA/dns-edns0-and-firewalls.html (which was the second hit for google(EDNS0), after a link to RFC 2671)

Greets,
Jeroen
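The EDNS0 mechanism referenced here works by having the client append an OPT pseudo-RR (RFC 2671) to the additional section of its query, advertising how large a UDP response it can accept. A hedged Python sketch (the function name and default size are my own) of that record's wire format:

```python
import struct

def edns0_opt_rr(udp_payload_size=4096):
    """Build the OPT pseudo-RR (RFC 2671) that a client appends to a
    query's additional section to advertise a UDP buffer larger than
    the classic 512 bytes."""
    # root name (0x00), TYPE=41 (OPT), the CLASS field carries the
    # buffer size, the TTL field carries extended RCODE/flags (zero
    # here), and RDLENGTH=0 (no options)
    return b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)
```

A middlebox that enforces a hard 512-byte limit on udp/53, as the PIX/ASA defaults discussed earlier did, silently defeats this negotiation.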
There is no call for insults on this list - rather, I thought this list was about technical discussions affecting all of us, and keeping DNS alive for the majority of our customers certainly qualifies. We/I are more than aware of the DNS mechanisms and WHY they are there; trouble is, NO DNS server can handle directed TCP attacks - even the root servers crumbled under directed botnet activity - and we have taken the decision to accept some collateral damage in order to keep services available.

We are a well connected university network with multi-gigabit ingress and egress with 10G on Abilene, so we try to protect the internet from attacks originating within our borders AND we really feel the full wrath of botnets, as we do not have a relatively slow WAN link to buffer the effects.

Yes - we are blocking TCP. Too many problems with drone armies, and we started about a year ago when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to simultaneously do zone transfers and failing; we tried active denial systems and shunning with limited effectiveness.

We are well aware of the host based mechanisms to control zone information. Trouble is, with TCP, if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level where we can drop at wire speed.

Open to better ideas, but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'. Also recall we have a commitment to openness, so we would like to make TCP services available, but until we have effective DNS DoS mitigation which can work with 10Gb links it's not going to happen.

- Scott

Jeroen Massar wrote:
Scott McGrath wrote: [..]
For a long time there has been a effective practice of
UDP == resolution requests TCP == zone transfers
WRONG. TCP is there as a fallback when the answer to the question is too large. Zone transfers you can limit in your software. If you can't configure your DNS servers properly then don't run DNS. Also note that botnets have much more effective ways of taking you out.
And sometimes domains actually require TCP because there are too many records for a label, e.g. http://stupid.domain.name/node/651 If you are thus blocking TCP for DNS resolution you are suddenly blocking google, and thus for some people "The Internet".
Also see: http://homepages.tesco.net/J.deBoynePollard/FGA/dns-edns0-and-firewalls.html
(Which was the second hit for google(EDNS0) after a link to RFC2671)
Greets, Jeroen
Scott McGrath wrote:
There is no call for insults on this list
Insults? Where? If you feel insulted by any of the comments made on this list by people, then you probably are indeed on the wrong list. But that is just me.
- Rather, I thought this list was about technical discussions affecting all of us, and keeping DNS alive for the majority of our customers certainly qualifies.
[..blabber about DNS attacks over TCP..]

If I were a botnet herder and I had to take out your site, and I was going to pick TCP for some magical reason, then I would not care about your DNS servers; I would just hit your webservers, hard. I mean, just the 'index.html' (http://www.harvard.edu/) is 24Kb, and that is excluding pictures; there is bound to be larger data there which you are going to send, and the bots only have to say "ACK" once in a while. Multiply that by, say, a small botnet of 1M hosts, each just requesting that 24Kb file. You will have a million flows and won't have any way to rate limit that or control it. Your link was already full trying to send it back to the clients, and next to that your server was probably not able to process it in the first place. Simple, effective, nothing you can do about it, except get way, way more hardware.

If somebody wants to take you out, they will take you out. Just get one other box with 10GE (not too hard to do) or just get a million of them with a little bit of connectivity (which is quite easy apparently)...
We/I are more than aware of the DNS mechanisms and WHY they are there; trouble is, NO DNS server can handle directed TCP attacks - even the root servers crumbled under directed botnet activity - and we have taken the decision to accept some collateral damage in order to keep services available.
"The root servers crumbled" wow, I must have missed somebody taking out all the 13 separate and then individually anycasted root servers. Which btw only do UDP as currently '.' is still small enough. $ dig @a.root-servers.net. . NS +tcp [..] ;; Query time: 95 msec ;; SERVER: 2001:503:ba3e::2:30#53(2001:503:ba3e::2:30) ;; WHEN: Sat Jun 14 23:45:52 2008 ;; MSG SIZE rcvd: 604 That is only 1 packet to 1 packet, still only 500 bytes. While your little webserver would generate 24kb for that same sequence.
We are a well connected university network with multi-gigabit ingress and egress with 10G on Abilene so we try to protect the internet from attacks originating within our borders AND we really feel the full wrath of botnets as we do not have a relatively slow WAN link to buffer the effects.
The whole point generally of botnets is just the Denial of Service (DoS), if that is because your link is full or the upstreams link is full or because the service can't service clients anymore. But clearly, as you are blocking TCP-DNS you are DoSing yourself already, so the botherders win. Also note that Abilene internally might be 10G and in quite some places even 40G, but you still have to hand it off to the rest of the world and those will count as those 'slow WAN' links that you think everybody else on this planet is behind. (Hint: 10GE is kinda the minimum for most reasonably sized ISP's)
Yes - we are blocking TCP. Too many problems with drone armies, and we started about a year ago when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to simultaneously do zone transfers and failing; we tried active denial systems and shunning with limited effectiveness.
How is a failed AXFR going to generate a lot of traffic, unless they are repeating themselves over and over and over again? Thus effectively just packeting you? Also, are you talking about recursive or authoritative DNS servers here? Were those bots on your network, or were they remote?
We are well aware of the host based mechanisms to control zone information. Trouble is, with TCP, if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level where we can drop at wire speed.
Do you mean that the hosts which do TCP are allowed to do transfers or not? In the latter case they can't generate big answers; they just get one packet back and then a FIN. Note also that if they are simply trying to overload your hosts, UDP is much more effective at doing that already, and you apparently have that hole wide open, otherwise you wouldn't have DNS.
Open to better ideas but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'.
It probably has nothing to do with the domain name, it more likely has something to do with certain services that are available or provided on your network.
Also recall we have a commitment to openness, so we would like to make TCP services available, but until we have effective DNS DoS mitigation which can work with 10Gb links it's not going to happen.
You think that 10Gb is a 'fat link', amusing ;) There are various vendors, most likely also reading on this list, who can be more than helpful in providing you with all kinds of bad, but also a couple of good solutions to most networking issues that you are apparently having. But the biggest issue you seem to have is not knowing what the DoS kiddies want to take out and why they want to take it out. Greets, Jeroen PS: You do know that an "NS" record is not allowed to point to a CNAME I hope? (NS3.harvard.edu CNAME ns3.br.harvard.edu. RFC1912 2.4 ;)
On Sat, 14 Jun 2008, Scott McGrath wrote:
Also recall we have a commitment to openness, so we would like to make TCP services available, but until we have effective DNS DoS mitigation which can work with 10Gb links, it's not going to happen.
I feel your pain, but I think there may be a slight mis-analysis of the situation. However, I may be mistaken, given the lack of details. The 10Gb really doesn't have much to do with tcp-state-table problems.

Any network with a large user population probably should have separate DNS servers for their authoritative zones answering the Internet at-large and their recursive resolvers serving their user population.

DNS recursive resolvers may not need to answer unsolicited queries from the Internet at large. It may make sense to keep those servers behind stateful packet gateways, and only allow both UDP and TCP responses from the Internet to UDP and TCP queries made by the local, authorized users. Because you don't know what Answer all the other DNS servers may give, including a Truncated answer, recursive resolvers must be able to use TCP to send queries to the Internet at large, and receive TCP queries from their local, authorized user population. If your own local users are DOSing your own DNS recursive resolvers, hopefully that's your own problem.

A DNS authoritative server may only need to answer unsolicited UDP queries from the Internet at large. Because DNS clients (stubs, resolvers) must send a query as UDP first, and may use TCP if the Answer has the truncated bit set, an authoritative name server which knows all its answers will always fit in the minimum DNS Answer, and which never sets the truncated bit, shouldn't get a TCP DNS query. RFC 1123 says DNS servers should answer unsolicited TCP DNS queries anyway, but it's not a MUST, and a server may rate limit its TCP answers.

Given those constraints, it may make sense for DNS authoritative servers to limit TCP, either with an ACL or by rate-limiting the TCP SYNs. But it's only a medium-term solution. DNS answers are growing. Someday those DNS authoritative servers probably will need to send a large DNS Answer. But that is under the control of the local DNS administrator.
So hopefully he or she will know when the DNS server breaks, and will fix it then. Also, modern TCP/IP stacks and modern name server implementations don't have as many tcp-state-table issues as they did at the beginning of the decade. Any DOS attack based on TCP would disrupt HTTP/Web servers just as much as TCP/DNS servers, so many of the same mitigation techniques (and attacks) for Web servers may be applicable to DNS servers.

So briefly:
1. Separate your authoritative and recursive name servers.
2. Recursive name servers should only get replies to their own DNS queries from the Internet; they can use both UDP and TCP.
3. Recursive name servers should only get queries from their own user population; they can use both UDP and TCP.
4. Authoritative servers may only need to answer UDP queries from the Internet, if they never truncate their Answers. But the DNS administrator should plan what to do when the Answers get too large.

Most DNS servers don't provide good alerts to DNS administrators doing stupid things, like sending big DNS answers while blocking TCP. I tried to capture some of these ideas in some ACLs <http://www.donelan.com/dnsacl.html>
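The split Sean describes might be sketched as two separate BIND instances, each with its own named.conf. The addresses, ACL name, and option choices below are illustrative assumptions, not anyone's actual configuration:

```
// named.conf for the authoritative-only instance:
// answers anyone, never recurses.
options {
    listen-on { 192.0.2.1; };
    recursion no;
    allow-query { any; };
    allow-transfer { none; };   // or an ACL listing your secondaries
};

// named.conf for the recursive instance:
// serves only the local user population.
acl clients { 198.51.100.0/24; };
options {
    listen-on { 192.0.2.2; };
    recursion yes;
    allow-query { clients; };
    allow-recursion { clients; };
};
```

With this arrangement, a TCP flood at the authoritative address never touches the resolver your users depend on, and vice versa.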
Sean Donelan wrote:
1. Separate your authoritative and recursive name servers 2. Recursive name servers should only get replies to their own DNS queries from the Internet, they can use both UDP and TCP
We've just completed a project to separate our authoritative and recursive servers and I have a couple of notes...

1) For the recursive-only servers, we're using a combination of BIND's "query-source address a.b.c.d" and "listen-on e.f.g.h" in the hope of providing some additional measure of protection against cache poisoning. The "listen-on" IPs are ACL'd at the borders so non-clients cannot get ANY packets to them. The "query-source address" itself doesn't appear in the "listen-on" list either and won't respond to queries. I know this isn't foolproof, but it probably raises the bar slightly against off-net poisoning attempts.

2) The biggest drawback to separation after years of service is that customers have come to expect that their DNS changes are propagated instantly when they are on-net. This turns out to be more of an annoyance to us than to our customers, since our zone is probably the most frequently updated.

3) I've gone so far as to remove the root hint zone from our auth-only boxes, again out of paranoia ("recursion no" does the trick; this is just an extra bit of insurance against someone flipping that bit due to a lack of understanding of the architecture). There is one third party for whose zone we have to use an 'also-notify' by IP address in this case.

Mike
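Mike's query-source/listen-on arrangement might look roughly like the fragment below. The addresses are made-up placeholders; this is a sketch of the idea, not his actual configuration:

```
options {
    // Client-facing address, ACL'd at the border so only
    // on-net clients can reach it:
    listen-on { 192.0.2.10; };
    // Upstream queries leave from a different address that is
    // deliberately absent from the listen-on list:
    query-source address 203.0.113.10;
    recursion yes;
};
```

An off-net attacker who can only see the listen-on address has to guess the query-source address (as well as port and query ID) to land a spoofed answer, which is the "raises the bar slightly" part.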
On 15/06/2008, at 12:45 PM, Mike Lewinski wrote:
2) The biggest drawback to separation after years of service is that customers have come to expect their DNS changes are propagated instantly when they are on-net. This turns out to be more of an annoyance to us than our customers, since our zone is probably the most frequently updated.
If you're running BIND for your recursive boxes, `rndc -s <server> flushname <zone>' run against each recursive box should do the trick for BIND 9.3.0 upwards. It will flush the cache for that zone only. There was a bug where it wouldn't flush negative caches, but that might be fixed. YMMV, etc. Usual common sense warnings apply. -- Nathan Ward
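A minimal sketch of running that against a set of resolvers. The server names and zone are invented for illustration, and the command is only echoed here; swap the echo for the real rndc invocation to use it:

```shell
#!/bin/sh
# Flush one name's cached records on each recursive resolver after an
# on-net DNS change (BIND 9.3.0+). Server names are hypothetical.
ZONE="customer-zone.example"
CMDS=""
for server in res1.example.net res2.example.net; do
    cmd="rndc -s $server flushname $ZONE"
    CMDS="$CMDS$cmd;"
    echo "$cmd"      # dry run; replace this echo with: $cmd
done
```

This assumes a shared rndc key that all the resolvers accept; otherwise add `-k` per server.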
In article <48546625.6040301@rockynet.com> you write:
Sean Donelan wrote:
1. Separate your authoritative and recursive name servers 2. Recursive name servers should only get replies to their own DNS queries from the Internet, they can use both UDP and TCP
We've just completed a project to separate our authoritative and recursive servers and I have a couple notes...
1) For the recursive-only, we're using a combination of BIND's "query-source address a.b.c.d" and "listen-on e.f.g.h" in the hopes of providing some additional measure of protection against cache poisoning. The "listen-on" IPs are ACL'd at the borders so non-clients cannot get ANY packets to them. The "query-source address" itself doesn't appear in the "listen-on" list either and won't respond to queries. I know this isn't foolproof, but it probably raises the bar slightly against off-net poisoning attempts.
Named will reject queries on the *-source sockets. It will also drop responses on the listening sockets, provided you haven't set the query-source port to port 53.
2) The biggest drawback to separation after years of service is that customers have come to expect their DNS changes are propagated instantly when they are on-net. This turns out to be more of an annoyance to us than our customers, since our zone is probably the most frequently updated.
Querying for type SOA at the name will prevent named caching negative responses and still allow existence tests to be made. nsupdate makes SOA queries to work out which zone needs to be updated and also to determine which server to send the updates to. We realised a long time ago that we needed a way to find the containing zone that didn't result in caches being filled with the side effects of that discovery mechanism. Named, by default, sets the TTL to zero on negative responses to SOA queries.
3) I've gone so far as to remove the root hint zone from our auth-only boxes, again out of paranoia ("recursion no" does the trick, this is just an extra bit of insurance against someone flipping that bit due to a lack of understanding of the architecture). There is one third party we have to use an 'also-notify' by IP address in this case for their zone.
Authoritative only servers need hints so that NOTIFY will work in the general case. Eventually, they will also need them so we can get rid of IP addresses in masters clauses on slave/stub zones. This will help reduce the costs in renumbering.
Mike
Mark
Mark Andrews wrote:
Authoritative only servers need hints so that NOTIFY will work in the general case.
Presumably that's because the authoritative server will want to look up the RDATA (hostname) of each NS record that serves a zone for which it is authoritative. Could you avoid this if you used something like 'notify explicit' and specified all slave servers by IP address in an also-notify clause?
Eventually, they will also need them so we can get rid of IP addresses in masters clauses on slave/stub zones. This will help reduce the costs in renumbering.
Would an administrator still have the option of specifying masters by IP address if they desire, and therefore remove the need for the hints file? It seems that this would at least give the option of not only forcing recursion off, even if someone turns it on by accident (as Mike notes), but it also should help reduce the potential for reflection attacks from authoritative servers giving upward referrals for out-of-zone queries, no? michael
* Sean Donelan:
Any network with a large user population probably should have separate DNS servers for their authoritative zones answering the Internet at-large and their recursive resolvers serving their user population.
It's not so much a question of network size. You absolutely must use different views if you host DNS for customer domains, because there is a race condition in the delegation provisioning protocol used by most TLDs (you need to add the domain before you receive the delegation).
On 15/06/2008, at 9:18 AM, Scott McGrath wrote:
Yes - we are blocking TCP: too many problems with drone armies. We started about a year ago when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to simultaneously do zone transfers and failing; we tried active denial systems and shunning, with limited effectiveness.
We are well aware of the host-based mechanisms to control zone information. Trouble is, with TCP, if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level, where we can drop at wire speed. Open to better ideas, but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'.
There are really two problems here. One is packet/bit rate causing problems for your network; that's not necessarily an end-system thing. It's not really DNS specific, and blocking 53/TCP doesn't really help here, as people could just send 53/UDP your way and get the same effect.

Connection table overflowing is a bit of a different issue; the obvious way to overcome that is to whack a load balancer in there to share the load around. It's not immediately obvious to me why your connection table would be filling up - what state were connections stuck in?

Anyway, one thought that comes to me would be to split off UDP and TCP services to different servers. If some TCP attack kills your TCP DNS server you: a) don't have to worry about UDP services failing; b) can turn it off for the duration of the attack, and are no worse off than you are right now, then turn it back on when you see the high volume of SYN messages disappear; c) as TCP DNS service recovery isn't super time critical (I'm assuming this, because you're not running it at all right now), you have time to look at the anatomy of the attack and figure out how to filter it more precisely if possible, instead of simply dropping all TCP.

Obviously, you'd want to make sure TCP from your other name servers always goes to the UDP one, etc. etc.

-- Nathan Ward
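One middle ground between Nathan's split and blocking TCP outright is to rate-limit inbound TCP/53 SYNs at the edge. A hypothetical iptables-save style fragment; the choice of Linux iptables and the specific limits are my assumptions, not anything proposed in this thread:

```
# Accept a bounded rate of new TCP DNS connections, drop the excess.
-A INPUT -p tcp --dport 53 --syn -m limit --limit 20/second --limit-burst 40 -j ACCEPT
-A INPUT -p tcp --dport 53 --syn -j DROP
```

A legitimate resolver retrying after a truncated UDP answer would rarely notice the limit, while a SYN flood toward port 53 gets clipped before it can fill the server's connection table.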
All, Thanks for the helpful suggestions. For what it's worth, we use Cisco's CNR, as we operate a MAC registration system which controls access to our network. We allow customers to select hostnames, which are pushed into DDNS when the system acquires a lease. CNR has internal limits (user configurable) which control the TCP state machine, and these are easy to overwhelm: once you hit the high limit, the server process stops accepting new connection requests for any reason until the connections drop below the max limit once again. We have been in constant contact with the development group on defending these machines from DDoS activity. UDP is somewhat easier to rate limit than TCP due to our network structure, and we do operate microflow policers to limit UDP activity from any given host.

We once used BIND, but BIND could not handle the DDNS updates in a reasonable fashion, as we have many short-lived connections as students access the wireless network between classes; hence the move to CNR, which handles DDNS effectively but does not like TCP-based attacks.

Unlike MIT over the river, Harvard only has 2 Class B's available, and we have many more registered clients than we have IP space for, plus a community which requires fixed hostnames for academic reasons. Since we cannot assign static IP assignments except to well-known and fixed services, this becomes problematic; hence DDNS, which as many have pointed out here is painful from an operational standpoint, but in our environment it is a lifesaver.
Unfortunately, we have needed to insert some controlled breakage into the network to keep the services our customers require alive, as TCP SYN attacks are unfortunately still effective in this day and age. We have tried many things; our latest foray into TCP control is creating a Snort infrastructure which is sufficient to monitor all flows ingressing and egressing our network, and from there, based on analysis of the data, applying rules in real time to limit traffic from ill-behaved TCP hosts. Our long-term goal is not to operate a corporate network locked into stupid mode with no understanding of protocol needs.

- Scott

Nathan Ward wrote:
On 15/06/2008, at 9:18 AM, Scott McGrath wrote:
Yes - we are blocking TCP: too many problems with drone armies. We started about a year ago when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to simultaneously do zone transfers and failing; we tried active denial systems and shunning, with limited effectiveness.
We are well aware of the host-based mechanisms to control zone information. Trouble is, with TCP, if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level, where we can drop at wire speed. Open to better ideas, but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'.
There are really two problems here. One is packet/bit rate causing problems for your network; that's not necessarily an end-system thing. It's not really DNS specific, and blocking 53/TCP doesn't really help here, as people could just send 53/UDP your way and get the same effect.
Connection table overflowing is a bit of a different issue; the obvious way to overcome that is to whack a load balancer in there to share the load around. It's not immediately obvious to me why your connection table would be filling up - what state were connections stuck in?
Anyway, one thought that comes to me would be to split off UDP and TCP services to different servers. If some TCP attack kills your TCP DNS server you: a) don't have to worry about UDP services failing; b) can turn it off for the duration of the attack, and are no worse off than you are right now, then turn it back on when you see the high volume of SYN messages disappear; c) as TCP DNS service recovery isn't super time critical (I'm assuming this, because you're not running it at all right now), you have time to look at the anatomy of the attack and figure out how to filter it more precisely if possible, instead of simply dropping all TCP.
Obviously, you'd want to make sure TCP from your other name servers always goes to the UDP one, etc. etc.
-- Nathan Ward
There is no call for insults on this list - I rather thought this list was about technical discussions affecting all of us, and keeping DNS alive for the majority of our customers certainly qualifies.
We/I am more than aware of the DNS mechanisms and WHY they are there. Trouble is, NO DNS server can handle directed TCP attacks; even the root servers crumbled under directed botnet activity, and we have taken the decision to accept some collateral damage in order to keep services available. We are a well-connected university network with multi-gigabit ingress and egress, with 10G on Abilene, so we try to protect the Internet from attacks originating within our borders AND we really feel the full wrath of botnets, as we do not have a relatively slow WAN link to buffer the effects.
Yes - we are blocking TCP: too many problems with drone armies. We started about a year ago when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to simultaneously do zone transfers and failing; we tried active denial systems and shunning, with limited effectiveness.
We are well aware of the host-based mechanisms to control zone information. Trouble is, with TCP, if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level, where we can drop at wire speed. Open to better ideas, but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'.
Also recall we have a commitment to openness, so we would like to make TCP services available, but until we have effective DNS DoS mitigation which can work with 10Gb links, it's not going to happen.
This could be a real problem. Of course, one solution is to take a shotgun, point it at your foot, and pull the trigger. As you noted, a long history of operational experience suggests that most resolver traffic is UDP, which means that you won't have that many problems from doing this. However, you can kiss that "5 nines of availability" you claim to be interested in goodbye.

Nathan Ward and Sean Donelan have covered most of the points I would have made had I written this message earlier. I will also point out, however, that you don't even necessarily need a load balancer to do the tcp/53 screening; there are a lot of neat and clever things that you could do (some depending on whether or not you have one). Given the relatively low level of usage for TCP DNS lookups, plus the fact that it ought to be trivial to automatically identify hosts that are doing TCP things that they shouldn't be doing (one zone transfer request might be a reasonable thing, as many old-timers might do that sort of thing, but they won't try repeatedly if the server refuses), it kind of mystifies me as to why you seem to imply that this is so difficult. For example, it only took a few minutes to come up with this, for FreeBSD, which takes advantage of the radix table support in ipfw:

#! /bin/sh -

grep -i axfr /var/log/messages | grep "zone transfer .* denied" | (
    while read a b c d e f g h i j k l m n; do
        # Field 7 is the client address in a typical named syslog line;
        # strip the "#port" suffix.
        g=`echo "${g}" | sed 's:#.*::'`
        # Maybe do some other processing ... a bad example: log it
        # echo noticed "${g}" "${j}" | logger -p info -t protector
        # Put out the IP's we want to consider filtering
        echo "${g}"
    done
) | sort | uniq -c | (
    while read a b; do
        # More than 3 refused zone transfer attempts: ban the host.
        if [ "${a}" -gt 3 ]; then
            ipfw table 1 add "${b}"/32 2> /dev/null &&
                (echo banned "${b}" "${a}" | logger -p warning -t protector)
        fi
    done
)

Needs to have a rule like this installed in the system firewall:

% ipfw add [nnn] deny tcp from table\(1\) to me 53

I think a real solution would be more sophisticated than this, but it's a starting point.
This sort of thing limits "collateral damage" to hosts that have attempted an unreasonable number of zone transfers, which is still unpleasant, but is less likely to break legitimate uses. ... JG -- Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net "We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN) With 24 million small businesses in the US alone, that's way too many apples.
On Jun 15, 2008, at 8:02 PM, Joe Greco wrote:
I think a real solution would be more sophisticated than this, but it's a starting point.
In addition to the BCPs already mentioned by Sean and Nathan, a good detection/classification/traceback system plus S/RTBH can be helpful, and there are commercial DDoS mitigation services/scrubbers available from various SPs/vendors which have DNS-specific functionality, as well. Blocking TCP/53 is definitely not an optimal solution, as many have already pointed out. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@cisco.com> // +66.83.266.6344 mobile History is a great teacher, but it also lies with impunity. -- John Robb
* Jon Kibler:
From what I have read, public DNS servers should support both UDP and TCP queries. TCP queries are often used when a UDP query fails, or if the answer is over a certain length.
UDP is used for queries.
TCP is used for zone transfers.
I've seen such claims countless times. 8-( However, you can perform zone file transfers over UDP in some cases (using IXFR), and TCP fallback may happen even if you never respond with packets with the TC bit set.
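The fallback Florian mentions hinges on the TC (truncated) bit in the DNS header. A small illustrative sh sketch of checking it, using two hand-built 12-byte headers (the header values are fabricated for the example):

```shell
#!/bin/sh
# The flags field is bytes 3-4 of the 12-byte DNS header; TC is bit
# 0x02 of the high flags byte. Headers below are canned hex words.
normal='1234 8180 0001 0002 0000 0000'       # QR/RD/RA set, TC clear
truncated='1234 8380 0001 0000 0000 0000'    # same flags, plus TC set
tc_set() {
    hi=$(echo "$1" | awk '{ print substr($2, 1, 2) }')  # high flags byte
    [ $(( 0x$hi & 0x02 )) -ne 0 ]
}
tc_set "$truncated" && echo "TC set: client should retry over TCP"
tc_set "$normal"    || echo "TC clear: UDP answer was complete"
```

The point is that the fallback decision belongs to the client: a stub or resolver that retries over TCP for its own reasons does so whether or not the server ever truncated anything.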
participants (24)
- Bill Owens
- Florian Weimer
- Jeroen Massar
- Joe Greco
- John Kristoff
- Jon Kibler
- Justin Shore
- Kevin Oberman
- Mark Andrews
- Mark Price
- Michael Sinatra
- Mike Lewinski
- Nathan Ward
- Randy Bush
- Robert E. Seastrom
- Roland Dobbins
- Scott C. McGrath
- Scott McGrath
- Sean Donelan
- Seth Mattinen
- Simon Leinen
- Tomas L. Byrnes
- Tony Rall
- Valdis.Kletnieks@vt.edu