(here we are discussing dns protocol details on nanog@ again. must be sunday.)
From: Joe Abley <jabley@ca.afilias.info>
It may be worth clarifying that "not considering TCP mandatory" above is an implementation/operational choice, and not something that seems to be clearly endorsed by RFC 1035, such as it is.
There are a lot of people who insist that TCP transport is used for nothing other than zone transfers in the DNS, and they do so not out of concern over potential TCP state explosion on their servers but instead because "that's what the last guy told me". That kind of reasoning doesn't need a bigger posse.
Joe
4.2. Transport ...
actually, it does (need a bigger posse). a little further on in RFC 1035 we find this gem:

+---
| 4.2.2. TCP usage
| ...
| Several connection management policies are recommended:
|
|    - The server should not block other activities waiting for TCP
|      data.
|
|    - The server should support multiple connections.
|
|    - The server should assume that the client will initiate
|      connection closing, and should delay closing its end of the
|      connection until all outstanding client requests have been
|      satisfied.
|
|    - If the server needs to close a dormant connection to reclaim
|      resources, it should wait until the connection has been idle
|      for a period on the order of two minutes.  In particular, the
|      server should allow the SOA and AXFR request sequence (which
|      begins a refresh operation) to be made on a single connection.
|      Since the server would be unable to answer queries anyway, a
|      unilateral close or reset may be used instead of a graceful
|      close.
+---

in the era of RFC 1035 the philosophy was "be liberal in what you accept and be conservative in what you generate" because the other people on the network weren't trying to spam, ddos, or poison you. the above text is effectively a "please ddos me" bumper sticker worn across the ass of anyone who implements it.

we're having a terrible time with UDP now simply because upstream sockets can now only be used once before they're closed, and any long-running query can tie up a file descriptor for a longish enough time to drain the pool down to the point where new downstream queries can't be accepted because existing upstream queries have not completed. file descriptors are the new "carbon footprint" of DNS.

but at least in the UDP case the shortage is experienced only by the initiator. in the TCP case the shortage is experienced by both the initiator and the responder, and the responder is shackled by [4.2.2].

i don't want to completely dismiss the underlying idea of "fall back to TCP if someone guesses a few QID's wrong".
however, i dismiss the idea that it's a simple universal solution. on a mailing list called namedroppers@ where i would more likely expect to see a discussion of fine points of DNS protocol, i spake thusly about a week ago, and so far have heard no reply, though i've implemented these recommendations on a server that only keeps 64 descriptors open at a time and it's been incredibly resistant to my poisoning attempts.
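
the descriptor shortage described above can be illustrated with a toy model. this is only a sketch under stated assumptions, not the server mentioned in this message: the class UpstreamPool, the function simulate, the tick-based clock, and the query durations are all hypothetical names and numbers chosen for illustration. the only figure borrowed from the text is the 64-descriptor budget. each upstream query holds exactly one descriptor until it completes, and a new downstream query is refused whenever no descriptor is free.

```python
from dataclasses import dataclass, field


@dataclass
class UpstreamPool:
    """Toy model (hypothetical) of a resolver with a fixed descriptor budget."""
    limit: int = 64
    in_flight: dict = field(default_factory=dict)  # query key -> finish tick

    def reap(self, now):
        """Release descriptors whose upstream query has finished by tick `now`."""
        done = [key for key, finish in self.in_flight.items() if finish <= now]
        for key in done:
            del self.in_flight[key]

    def accept(self, key, now, duration):
        """Start an upstream query if a descriptor is free; return True on success."""
        self.reap(now)
        if len(self.in_flight) >= self.limit:
            return False  # pool drained: the downstream query cannot be accepted
        self.in_flight[key] = now + duration
        return True


def simulate(limit, slow_queries, slow_duration, fast_queries, fast_duration):
    """Count refused queries when slow upstream queries arrive first (assumed workload)."""
    pool = UpstreamPool(limit=limit)
    # All slow queries arrive at tick 0 and hold a descriptor for slow_duration ticks.
    for i in range(slow_queries):
        pool.accept(("slow", i), now=0, duration=slow_duration)
    refused = 0
    # One ordinary query arrives per tick, starting at tick 1.
    for i in range(fast_queries):
        if not pool.accept(("fast", i), now=1 + i, duration=fast_duration):
            refused += 1
    return refused


if __name__ == "__main__":
    print(simulate(64, 64, 30, 10, 1))  # every descriptor held by a slow query
    print(simulate(64, 60, 30, 10, 1))  # four descriptors left for ordinary traffic
```

in this model, 64 slow queries of 30 ticks each cause all 10 of the following queries to be refused, while leaving even four descriptors free lets all 10 through. the point is only that a handful of long-running upstream queries is enough to starve the pool; real servers differ in how they time out and recycle sockets.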