On 03/01/2012 06:26 AM, William Herrin wrote:
> On Thu, Mar 1, 2012 at 7:20 AM, Owen DeLong<owen@delong.com> wrote:
>> The simpler approach and perfectly viable without mucking up what is already implemented and working:
>>
>> Don't keep returns from GAI/GNI around longer than it takes to cycle through your connect() loop immediately after the GAI/GNI call.
>
> The even simpler approach: create an AF_NAME with a sockaddr struct that contains a hostname instead of an IPvX address. Then let connect() figure out the details of caching, TTLs, protocol and address selection, etc. Such a connect() could even support a revised TCP stack which is able to retry with the other addresses at the first subsecond timeout rather than camping on each address in sequence for the typical system default of two minutes.
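For readers following along, the "resolve, loop over connect(), throw the results away" pattern described above can be sketched in userspace roughly as follows. This is only an illustration: `connect_by_name()` is a hypothetical helper name, not an existing API and not the AF_NAME interface being proposed, and it still waits out the normal blocking connect() timeout on each address.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a standard API): resolve host/service, try each
 * returned address in order, and free the resolver results before returning.
 * Returns a connected socket descriptor, or -1 if nothing worked. */
int connect_by_name(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 results */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                 /* address family not usable here */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* connected; stop trying */
        close(fd);
        fd = -1;
    }

    /* The results are not kept past the connect loop, so the application
     * never holds an address longer than a single connection attempt. */
    freeaddrinfo(res);
    return fd;
}
```

A caller would use it as `int fd = connect_by_name("www.example.com", "80");` and resolve again on the next connection rather than reusing an old answer.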
The effect of what you're recommending is to move all of this into the kernel, and in the process greatly expand its scope. Also: even if you did this, you'd be saddled with the same problem because nothing existing would use an AF_NAME.
The real issue is that gethostbyxxx has been inadequate for a very long time. Moving it across the kernel boundary solves nothing and most likely causes even more trouble: what if I want, say, asynchronous name resolution? What if I want to use SRV records? What if a new DNS RR comes around -- do I have to recompile the kernel? It's for these reasons, and probably a whole lot more, that pushing this into connect() just confuses the actual issues.
When I was writing the first version of DKIM I used a library that I scraped off the net called ARES. It worked adequately for me, but the most notable thing was the very fact that I had to scrape it off the net at all. As far as I could tell, standard distros don't have libraries with lower-level access to DNS (in my case, it needed to not block). Before positing a super-deluxe gethostbyxxx that does address picking, etc., it would be better to lobby all of the distros to settle on a decomposed resolver library from which that and more could be built.
It's deeper than just that, though. The whole paradigm is messy, from the point of view of someone who just wants to get stuff done. The examples are (almost?) all fatally flawed. The code that actually gets at least some of it right ends up being too complex and too hard for people to understand why things are done the way they are.

Even in the "old days", before IPv6, geez, look at this:

    bcopy(host->h_addr_list[n], (char *)&addr->sin_addr.s_addr, sizeof(addr->sin_addr.s_addr));

That's real comprehensible - and it's essentially the data interface between the resolver library and the system's addressing structures for syscalls.

On one hand, it's "great" that they wanted to abstract the dirty details of DNS away from users, but I'd say they failed pretty much even at that.

... JG
--
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam (CNN)
With 24 million small businesses in the US alone, that's way too many apples.
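For comparison with the bcopy() line quoted above, here is a rough sketch of the same resolver-to-sockaddr handoff using getaddrinfo(), which returns a ready-made sockaddr so no per-field copy is needed. `first_address()` is a hypothetical helper name used only for illustration, and it deliberately ignores every address after the first.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: copy the first resolved address for host/service
 * into *out and its length into *outlen.
 * Returns 0 on success, -1 if resolution fails. */
int first_address(const char *host, const char *service,
                  struct sockaddr_storage *out, socklen_t *outlen)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* ai_addr is already a complete sockaddr (family, port, address), and
     * sockaddr_storage is large enough to hold any address family. */
    memcpy(out, res->ai_addr, res->ai_addrlen);
    *outlen = res->ai_addrlen;

    freeaddrinfo(res);
    return 0;
}
```

The result can be passed straight to connect() as `(struct sockaddr *)&out` with `outlen`, without the caller knowing whether it holds an IPv4 or IPv6 address.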