Dennis Dayman wrote:
I have a customer having some DNS issues. They have done some research into DNS timeout errors they saw when Verizon's sender verify looked up their MX records. What they have discovered is that their current DNS service has a 1% failure/timeout rate. They are exploring other vendors (UltraDNS for one), but need an estimate of the number of DNS queries for accurate pricing to put together an ROI argument for the switch.
I had some problems with DNS timeouts, and discovered that by doing priority queuing in my Cisco routers I was able to cut the failure rate to my authoritative DNS servers to near zero. The only time my DNS servers don't give a proper response is when a router is being flooded with other outbound data.

Is your customer using BIND? What do the statistics tell you? How many DNS servers are handling the traffic? Are they load-balanced? Have the DNS servers been upgraded to handle more traffic? Does the customer segregate their authoritative servers from their recursive ones? (That one change right there improved my DNS reliability and serviceability by several orders of magnitude!)

From your description, I'd say there is a lot more work to be done first, unless they just don't have the people to do it right.
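
On the query-count estimate needed for pricing: if the servers expose a running query counter (BIND's statistics can provide one), two readings taken some time apart are enough for a rough extrapolation. The Python sketch below is only an illustration under assumptions: the function names are made up for this example, the counter readings are taken by hand, a month is treated as 30 days, and traffic is assumed to be roughly steady over the sampling window. A window of a full day or week will smooth out daily peaks better than a single hour.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400


def estimate_monthly_queries(count_start, count_end, t_start, t_end,
                             days_per_month=30):
    """Extrapolate monthly query volume from two readings of a query counter.

    count_start / count_end are cumulative query totals read at t_start / t_end.
    The 30-day month is an assumption for pricing purposes.
    """
    elapsed = (t_end - t_start).total_seconds()
    if elapsed <= 0:
        raise ValueError("t_end must be later than t_start")
    if count_end < count_start:
        # A counter that goes backwards usually means the server restarted.
        raise ValueError("counter decreased between readings")
    queries_per_second = (count_end - count_start) / elapsed
    return queries_per_second * SECONDS_PER_DAY * days_per_month


def expected_failures(queries, failure_rate):
    """Number of queries expected to fail or time out at a given failure rate."""
    return queries * failure_rate


if __name__ == "__main__":
    # Hypothetical readings: 3,600 queries counted over exactly one hour.
    start = datetime(2024, 1, 1, 12, 0, 0)
    end = datetime(2024, 1, 1, 13, 0, 0)
    monthly = estimate_monthly_queries(0, 3600, start, end)
    print(f"Estimated queries per month: {monthly:,.0f}")
    print(f"Expected failures at 1%: {expected_failures(monthly, 0.01):,.0f}")
```

Feeding the monthly figure into the vendor's price list, and the failure figure into the cost of a missed lookup, gives both sides of the ROI argument.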