Everything else remaining equal...is there a standard or expectation for DNS reliability?
98% 99% 99.5% 99.9% 99.99% 99.999%
Measured in queries completed vs. queries lost.
What's the consensus?
-- Phil Fagan Denver, CO 970-480-7618
To me, anything below 99.99% is unacceptable. 100 failures out of 100,000 queries still seems like a lot, especially if it's not network related. So I would say 99.999% is what I would look for. Thanks
On Thu, Sep 12, 2013 at 2:03 PM, Phil Fagan <philfagan@gmail.com> wrote:
Everything else remaining equal...is there a standard or expectation for DNS reliability?
98% 99% 99.5% 99.9% 99.99% 99.999%
Measured in queries completed vs. queries lost.
What's the consensus?
-- -------------------- Bryan Tong Nullivex LLC | eSited LLC (507) 298-1624
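To put numbers on the percentages being discussed, here is a minimal sketch that converts each success-rate target from the original question into the number of lost queries it permits. The function name and the 100,000-query sample size are illustrative choices, not anything specified in the thread.

```python
from fractions import Fraction

# Success-rate targets from the original question, kept as strings so the
# arithmetic below is exact rather than subject to float rounding.
TARGETS = ["98", "99", "99.5", "99.9", "99.99", "99.999"]


def expected_failures(target_percent: str, queries: int) -> Fraction:
    """Return how many lost queries a given success-rate target permits."""
    loss_fraction = 1 - Fraction(target_percent) / 100
    return loss_fraction * queries


if __name__ == "__main__":
    for target in TARGETS:
        lost = expected_failures(target, 100_000)
        print(f"{target:>7}% -> {float(lost):g} lost per 100,000 queries")
```

For 99.9% this gives 100 lost queries per 100,000, which is the figure mentioned above; 99.99% permits 10 and 99.999% permits 1.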
I go with 99.999% given that you have a good number of DNS Servers (anycasted).
On Thu, Sep 12, 2013 at 9:03 PM, Phil Fagan <philfagan@gmail.com> wrote:
Everything else remaining equal...is there a standard or expectation for DNS reliability?
98% 99% 99.5% 99.9% 99.99% 99.999%
Measured in queries completed vs. queries lost.
What's the consensus?
-- () ascii ribbon campaign - against html e-mail /\ www.asciiribbon.org - against proprietary attachments Disclaimer: http://goldmark.org/jeff/stupid-disclaimers/
It's a good point about the anycast; 99.999% should be expected.
On Thu, Sep 12, 2013 at 2:14 PM, Beavis <pfunix@gmail.com> wrote:
I go with 99.999% given that you have a good number of DNS Servers (anycasted).
-- Phil Fagan Denver, CO 970-480-7618
Remember, though, that anycast only solves for availability in one layer of the system, and it is not difficult to create a less available anycast presence if you do silly things with the way you manage your routes. A system is only as available as the least available layer in that system.
For example, if you use an automated system that changes your route advertisements and that system encounters a defect that breaks your announcements, then although a well-built anycast footprint might achieve 99.999%, a poorly implemented management system that is less available and creates an outage would reduce the number.
On Thu, Sep 12, 2013 at 4:25 PM, Phil Fagan <philfagan@gmail.com> wrote:
It's a good point about the anycast; 99.999% should be expected.
-- Glen Wiley KK4SFV "A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away." - Antoine de Saint-Exupery
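The point about layers can be illustrated with a small calculation. This is a sketch under a stated assumption: the layers fail independently and the service needs all of them up at once, so the availabilities multiply. The layer names and figures are hypothetical, not measurements from anyone in the thread.

```python
from math import prod


def serial_availability(layers: dict[str, float]) -> float:
    """Availability of a system that needs every layer up at the same time.

    Assumes layer failures are independent, so the availabilities multiply.
    """
    return prod(layers.values())


def weakest_layer(layers: dict[str, float]) -> str:
    """Name of the layer that caps the overall figure."""
    return min(layers, key=layers.get)


# Hypothetical figures: a five-nines anycast footprint managed by a
# three-nines route automation system.
layers = {"anycast_footprint": 0.99999, "route_management": 0.999}

if __name__ == "__main__":
    print(f"overall: {serial_availability(layers):.8f}")
    print(f"weakest layer: {weakest_layer(layers)}")
```

With these example numbers the overall figure is about 0.99899, slightly below the weakest layer on its own. Under this model the least available layer is an upper bound rather than the exact result.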
Thumbs up on this one; my entire path and the chain of management of that path need to be equally fault tolerant. Awesome.
On Thu, Sep 12, 2013 at 2:40 PM, Glen Wiley <glen.wiley@gmail.com> wrote:
Remember, though, that anycast only solves for availability in one layer of the system, and it is not difficult to create a less available anycast presence if you do silly things with the way you manage your routes. A system is only as available as the least available layer in that system.
For example, if you use an automated system that changes your route advertisements and that system encounters a defect that breaks your announcements, then although a well-built anycast footprint might achieve 99.999%, a poorly implemented management system that is less available and creates an outage would reduce the number.
-- Phil Fagan Denver, CO 970-480-7618
On Thu, Sep 12, 2013 at 5:03 PM, Phil Fagan <philfagan@gmail.com> wrote:
Everything else remaining equal...is there a standard or expectation for DNS reliability?
98% 99% 99.5% 99.9% 99.99% 99.999%
Measured in queries completed vs. queries lost.
What's the consensus?
ICANN new gTLD agreements specified 100% availability for the service, meaning at least 2 DNS IP addresses answered 95% of requests within 500 ms (UDP) or 1500 ms (TCP) for 51+% of the probes, or 99% availability for a single name server, defined as 1 DNS IP address.
Rubens
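One possible reading of the thresholds described above, written as a sketch rather than as the contractual definition. The `Sample` type, the grouping by probe and address, and the way the three thresholds are combined are assumptions made for illustration; only the numbers (95%, 500 ms, 1500 ms, 51%, 2 addresses) come from the message.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

RTT_LIMIT_MS = {"udp": 500.0, "tcp": 1500.0}  # limits quoted in the message
MIN_ANSWERED_FRACTION = 0.95  # share of requests answered in time
MIN_PROBE_FRACTION = 0.51     # share of probes that must see the service up
MIN_GOOD_ADDRESSES = 2        # DNS IP addresses that must pass per probe


@dataclass
class Sample:
    """One measurement (hypothetical record layout)."""
    probe: str                # name of the measuring probe
    address: str              # DNS server IP address that was queried
    transport: str            # "udp" or "tcp"
    rtt_ms: Optional[float]   # None means no answer was received


def address_ok(samples: list[Sample]) -> bool:
    """True if enough samples were answered within the transport's limit."""
    if not samples:
        return False
    in_time = sum(
        1 for s in samples
        if s.rtt_ms is not None and s.rtt_ms <= RTT_LIMIT_MS[s.transport]
    )
    return in_time / len(samples) >= MIN_ANSWERED_FRACTION


def service_available(samples: list[Sample]) -> bool:
    """True if enough probes each saw enough passing addresses."""
    by_probe = defaultdict(lambda: defaultdict(list))
    for s in samples:
        by_probe[s.probe][s.address].append(s)
    if not by_probe:
        return False
    probes_up = sum(
        1 for per_address in by_probe.values()
        if sum(address_ok(group) for group in per_address.values())
        >= MIN_GOOD_ADDRESSES
    )
    return probes_up / len(by_probe) >= MIN_PROBE_FRACTION
```

Where the probes sit and how their samples are aggregated changes the result considerably, so any real evaluation would need the exact definitions from the agreement itself.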
Good reference; thank you.
On Thu, Sep 12, 2013 at 2:39 PM, Rubens Kuhl <rubensk@gmail.com> wrote:
ICANN new gTLD agreements specified 100% availability for the service, meaning at least 2 DNS IP addresses answered 95% of requests within 500 ms (UDP) or 1500 ms (TCP) for 51+% of the probes, or 99% availability for a single name server, defined as 1 DNS IP address.
Rubens
-- Phil Fagan Denver, CO 970-480-7618
On 9/12/13 1:39 PM, Rubens Kuhl wrote:
ICANN new gTLD agreements specified 100% availability for the service, meaning at least 2 DNS IP addresses answered 95% of requests within 500 ms (UDP) or 1500 ms (TCP) for 51+% of the probes, or 99% availability for a single name server, defined as 1 DNS IP address.
unless phil happens to be building out (or spec'ing out $provider's offered sla) for one of the happy thousand or so celebrants of 2014, a surprisingly large fraction of which are tenant plays on existing infrastructure, the bogie above, uninterpreted, is not a controlling authority.
additionally, was phil asking for a metric for an authoritative server, serving a zone delegated directly from the iana root? was he asking for a metric for a caching server? and if the metric is "queries completed vs. queries lost", from where to where? (that is the "uninterpreted" bit from the bogie rubens quotes, as we did have to correct some assumptions of the requirement author -- where is the measurement being performed?)
i'm with randy on this, dns is a service, the better question is what fails as query response degrades, in the presence of hierarchical caching and the protocol being used as designed under best effort of infrastructure and application.
eric
On Sep 12, 2013, at 2:35 PM, Randy Bush <randy@psg.com> wrote:
Everything else remaining equal...is there a standard or expectation for DNS reliability? ... Measured in queries completed vs. queries lost.
this is the wrong question. the protocol is designed assuming query failures.
randy
I think it's part of the right answer. Capacity and server connectivity issues, what this metric will mostly measure, do matter.
The other part, more likely to get you on CNN and Reddit and the front pages of the NY Times and WSJ, is the area represented by MTBF / MTTR / etc.: how often is DNS for your domain DOWN - or WRONG - and how fast did you recover.
The other subthread about routability plays into that. For BIGPLACE environments, you should be considering how many AS numbers independently host DNS instances for you, in how many geographical regions, and do you have a backup registrar available spun up...
-george william herbert
Sent from Kangphone
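For the MTBF / MTTR side of the question, the usual steady-state relationship is availability = MTBF / (MTBF + MTTR). The sketch below applies it and converts an availability target into a yearly downtime budget; the function names and example figures are illustrative, not numbers from the thread.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a service is up, given mean time between failures
    and mean time to repair (both in the same unit)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target, over a 365-day year."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical example: one failure every 999 hours, one hour to recover.
    print(steady_state_availability(999.0, 1.0))
    # Downtime allowed per year by a five-nines target.
    print(downtime_minutes_per_year(0.99999))
```

A five-nines target leaves roughly 5.26 minutes of downtime per year, which is why recovery time tends to dominate the outcome once failures are rare.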
we're already outside our operating envelope, if these community expectation figures are believable.
a wise man once said to me that when setting formal conformance targets it's a good idea to only set ones you can honestly achieve, otherwise you're setting yourself up to be measured to fail.
I don't think that necessarily competes with 'aim high' ('be all you can be') but...
On Fri, Sep 13, 2013 at 8:26 AM, George William Herbert <george.herbert@gmail.com> wrote:
I think it's part of the right answer. Capacity and server connectivity issues, what this metric will mostly measure, do matter.
The other part, more likely to get you on CNN and Reddit and the front pages of the NY Times and WSJ, is the area represented by MTBF / MTTR / etc.: how often is DNS for your domain DOWN - or WRONG - and how fast did you recover.
The other subthread about routability plays into that. For BIGPLACE environments, you should be considering how many AS numbers independently host DNS instances for you, in how many geographical regions, and do you have a backup registrar available spun up...
On Sep 12, 2013, at 3:39 PM, Randy Bush <randy@psg.com> wrote:
we're already outside our operating envelope
not really. just some folk seem not to understand things such as udp datagrams and the dns protocols.
randy
Statistically, UDP sometimes arrives after an Internet-wide round trip. Honest!
The worry is bimodal. Most small sites, two or three servers, stop worrying. Most medium sites, watch your server load and run external monitoring. Most big sites are not sufficiently paranoid / redundant here.
-george william herbert
Sent from Kangphone
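The remark that the protocol is designed assuming query failures can be made concrete. Resolvers retry and move on to other name servers, so what a user sees depends on every attempt failing, not on any single datagram. The sketch below assumes each attempt is lost independently with the same probability, which is a simplification (real losses are often correlated); the function names are illustrative.

```python
def resolution_failure_probability(per_query_loss: float, attempts: int) -> float:
    """Probability that every one of `attempts` independent tries is lost."""
    if not 0.0 <= per_query_loss <= 1.0:
        raise ValueError("per_query_loss must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return per_query_loss ** attempts


def attempts_needed(per_query_loss: float, target_failure: float) -> int:
    """Smallest number of independent tries that meets a failure target."""
    if not 0.0 <= per_query_loss < 1.0:
        raise ValueError("per_query_loss must be in [0, 1)")
    if target_failure <= 0.0:
        raise ValueError("target_failure must be positive")
    attempts = 1
    while per_query_loss ** attempts > target_failure:
        attempts += 1
    return attempts


if __name__ == "__main__":
    # A 1% per-query loss with three tries, under the independence assumption.
    print(resolution_failure_probability(0.01, 3))
    # Tries needed to push the failure rate to 1 in 100,000 or better.
    print(attempts_needed(0.01, 1e-5))
```

Under these assumptions a 1% per-query loss with three tries leaves about a one-in-a-million chance that resolution fails, so a per-query loss ratio says little about user-visible reliability by itself.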
you removed a clause in that sentence randy:
"we're already outside our operating envelope, if these community expectation figures are believable"
there is a point to that clause. it's the same as your answer in some respects.
On Fri, Sep 13, 2013 at 8:39 AM, Randy Bush <randy@psg.com> wrote:
we're already outside our operating envelope
not really. just some folk seem not to understand things such as udp datagrams and the dns protocols.
randy
On Thu, Sep 12, 2013 at 6:26 PM, George William Herbert <george.herbert@gmail.com> wrote:
The other subthread about routability plays into that. For BIGPLACE environments, you should be considering how many AS numbers independently host DNS instances for you, in how many geographical regions, and do you have a backup registrar available spun up...
here's an interesting point... if you are a BIGPLACE, do you want to trust your fate to some third party hosting your dns for you? What about how your internal name service stuff is managed? say you have a practice of using rsh to effect updates across your 4 main dns nodes; adding a 5th or Nth outside, where rsh is not possible/desired, means adding additional processes and cruft to your update process. is this acceptable?
Take, for instance, the FBI.gov domain: 3 days ago, some set of updates happened; their ipv4 servers were answering with a consistent response, while their ipv6 nodes were answering with a variety of incorrect answers :(
In the case of the FBI.gov domain, all of it is handled outside 'fbi.gov hands' (all servers hosted externally) but...
-chris
On Thu, 12 Sep 2013 14:03:44 -0600, Phil Fagan said:
Everything else remaining equal...is there a standard or expectation for DNS reliability?
98% 99% 99.5% 99.9% 99.99% 99.999%
Measured in queries completed vs. queries lost.
What's the consensus?
Remember to factor in Duane Wessels' work that showed that something like 98% of the DNS traffic at the root servers was totally bogus?
Maybe you need to factor in "broken queries not answered, and offenders slapped around with a large trout"? Because if it's busted requests you're sending towards the root, they're going to count against your completed/lost ratio in a really bad way.
Anybody know if people have cleaned up their collective acts since Duane did that paper?
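A rough illustration of how bogus traffic skews a completed/lost ratio. This sketch assumes bogus queries can be identified and are never answered; the counts are invented for the example and only the 98% proportion echoes the figure mentioned above.

```python
def completion_ratio(answered: int, counted_queries: int) -> float:
    """Share of counted queries that received an answer."""
    if counted_queries <= 0:
        raise ValueError("counted_queries must be positive")
    return answered / counted_queries


# Hypothetical counts: 98% of the traffic is bogus and goes unanswered.
total_queries = 1_000_000
bogus_queries = 980_000
answered_valid = 19_998

raw_ratio = completion_ratio(answered_valid, total_queries)
valid_only_ratio = completion_ratio(answered_valid, total_queries - bogus_queries)

if __name__ == "__main__":
    print(f"counting bogus queries:  {raw_ratio:.6f}")
    print(f"excluding bogus queries: {valid_only_ratio:.6f}")
```

With these made-up counts the same server scores under 2% when bogus queries are counted and 99.99% when they are excluded, so the metric needs a stated rule for which queries count.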
On 13/09/13 12:45, Valdis.Kletnieks@vt.edu wrote:
Remember to factor in Duane Wessels' work that showed that something like 98% of the DNS traffic at the root servers was totally bogus?
Maybe you need to factor in "broken queries not answered, and offenders slapped around with a large trout"? Because if it's busted requests you're sending towards the root, they're going to count against your completed/lost ratio in a really bad way.
Anybody know if people have cleaned up their collective acts since Duane did that paper?
Wearing a different hat, I had the chance to rerun that analysis with data from 2008 (the original paper is from 2003), and the numbers were still around 98%.
http://www.caida.org/publications/presentations/2008/wide_castro_root_server...
Cheers,
-- Sebastian Castro DNS Specialist .nz Registry Services (New Zealand Domain Name Registry Limited) desk: +64 4 495 2337 mobile: +64 21 400535
participants (12)
- Beavis
- Bryan Tong
- Christopher Morrow
- Eric Brunner-Williams
- George Michaelson
- George William Herbert
- Glen Wiley
- Phil Fagan
- Randy Bush
- Rubens Kuhl
- Sebastian Castro
- Valdis.Kletnieks@vt.edu