On 11/30/2012 09:45 AM, Ray Soucy wrote:
I'll see your disagree and raise you another ;-)
I would say you almost never want to store addresses as character data unless the only thing you're using them for is logging (even then it's questionable). I run into people who do this all the time and it's a nightmare.
It's easy to store a v6 address as a string, but when you want to select a range of IPv6 addresses from a database, not having them represented as integers means you can't do efficient numerical comparisons in your SQL statements, it also makes indexing your table slower; to put it simply, it doesn't scale well.
So as a general rule, if you need to do any comparison or calculation on a v6 address, please don't store it as a string.
From an efficiency standpoint, you want to store it in chunks of the largest integer your DBMS supports. If a DBMS supports 128-bit integers and has optimized operations for them, then go for it. Most only support 64-, or even 32-bit. I say 64-bit because that's what the majority of current systems actually support and I don't see anyone coming out with a 128-bit architecture ;(
For convenience I would very much love to see MySQL include inet6_aton and inet6_ntoa, along with a 128-bit data structure that would be implemented as either a pair of 64-bit or 4x 32-bit values depending on the architecture. But from a performance standpoint, I really don't want my DBMS doing that calculation; I want the application server doing it (because it's much easier to scale and distribute the application side than the storage side). Postgresql has an inet data type that handles both ipv4 and ipv6 addresses with a slew of functions to manipulate the data type.
http://www.postgresql.org/docs/8.4/static/functions-net.html
Note that I'm talking about more from a database storage perspective than an internal application perspective.
By all means, you should use the standard data structure for v6. As mentioned below a lot of the internal structures use 8-bit unsigned integers (or char); but that's mainly a hold-over from when we had the reality of 8-bit and 16-bit platforms (for compatibility). With unions, these structs are treated as a collection of 8, 16, 32, 64 or a single 128-bit variable which makes it something the developer doesn't need to worry about once the libraries are written.
On Thu, Nov 29, 2012 at 9:55 AM, William Herrin <bill@herrin.us> wrote:
On Thu, Nov 29, 2012 at 9:01 AM, Ray Soucy <rps@maine.edu> wrote:
You should store IPv6 as a pair of 64-bit integers. While PHP lacks the function set to do this on its own, it's not very difficult to do. Hi Ray,
I have to disagree. In your SQL database you should store addresses as a fixed length character string containing a zero-padded hexadecimal representation of the IPv4 or IPv6 address with A through F forced to the consistent case of your choice. Expand :: and optionally strip the colons entirely. If you want to store a block of addresses, store it as two character strings: start and end of the range.
Bytes are cheap and query simplicity is important. Multi-element indexes are messy and the code to manage an array of integers is messier than managing a character string in most programming languages. memcmp() that integer array for less or greater than? Not on a little endian machine!
Here are a set of functions I wrote a while back to do just that (though I admit I should spend some time to try and make it more elegant and I'm not sure it's completely up to date compared to my local copy ... I would love some eyes on it to make some improvements).
http://soucy.org/project/inet6/ If we're plugging our code, give my public domain libeasyv6 a try. It eases entry into dual stack programming for anyone used to doing gethostbyname followed by a blocking connect(). Just do a connectbyname() with the hostname or textual IP address, the port, a timeout and null options. The library takes care of finding a working IPv4 or IPv6 address for the host and connecting to it in a timely manner.
http://bill.herrin.us/freebies/
Currently Linux only but if you're willing to lose timeout control on the DNS lookup you can replace getaddrinfo_a() with standard getaddrinfo() and the code should run anywhere.
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
-- Stephen Clark *NetWolves* Director of Technology Phone: 813-579-3200 Fax: 813-882-0209 Email: steve.clark@netwolves.com http://www.netwolves.com