Regular Expression for IPv6 addresses
Folks, My company, Dartware, have derived a regex for testing whether an IPv6 address is correct. I've posted it in my blog: http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6 This has links to the regular expression, a (Perl) program that tests various correct and malformed addresses, and a Ruby implementation of the same. Hope it's useful. Rich Brown richard.e.brown@dartware.com Dartware, LLC http://www.dartware.com 66-7 Benning Street Telephone: 603-643-9600 West Lebanon, NH 03784-3407 Fax: 603-643-2289
Richard E. Brown wrote:
Folks,
My company, Dartware, have derived a regex for testing whether an IPv6 address is correct. I've posted it in my blog:
http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6
This has links to the regular expression, a (Perl) program that tests various correct and malformed addresses, and a Ruby implementation of the same.
You know, link local addresses (fe80::/10) are quite useless without specifying the zone of that address. See section 11 of RFC4007. The only proper way of "testing" if an address is a valid IPv6 address is to feed it to getaddrinfo() and then use it through that API. Yes, you can make some assumptions, but it has shown that people assuming that everything stayed under 2001::/16 also got it wrong at one point in time. Thus just feed it to getaddrinfo() if you really need it. Greets, Jeroen
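A minimal Python sketch of the getaddrinfo() approach suggested above. The helper name parse_ipv6_literal is hypothetical, and the sketch assumes the platform resolver honours AI_NUMERICHOST, which restricts getaddrinfo() to address literals and avoids a DNS lookup.

```python
import socket


def parse_ipv6_literal(text):
    """Return the sockaddr tuple for an IPv6 literal, or None if the
    platform resolver rejects it. (Hypothetical helper name.)"""
    try:
        infos = socket.getaddrinfo(
            text, None,
            family=socket.AF_INET6,
            type=socket.SOCK_STREAM,
            flags=socket.AI_NUMERICHOST,  # literals only, no DNS lookup
        )
    except (socket.gaierror, UnicodeError):
        return None
    # For AF_INET6 the sockaddr is (address, port, flowinfo, scope_id);
    # scope_id carries the zone that link-local addresses need.
    return infos[0][4]


if __name__ == "__main__":
    for candidate in ("2001:db8::1", "1::2::3", "example.invalid"):
        print(candidate, "->", parse_ipv6_literal(candidate))
```

Because the answer comes from the platform's own parser, it matches what the same machine will accept when the address is later used through that API.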
In message <4B6B66FF.50108@spaghetti.zurich.ibm.com>, Jeroen Massar writes:
Richard E. Brown wrote:
Folks, My company, Dartware, have derived a regex for testing whether an IPv6 address is correct. I've posted it in my blog: http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6 This has links to the regular expression, a (Perl) program that tests various correct and malformed addresses, and a Ruby implementation of the same.
You know, link local addresses (fe80::/10) are quite useless without specifying the zone of that address. See section 11 of RFC4007.
The only proper way of "testing" if an address is a valid IPv6 address is to feed it to getaddrinfo() and then use it through that API. Yes, you can make some assumptions, but it has shown that people assuming that everything stayed under 2001::/16 also got it wrong at one point in time. Thus just feed it to getaddrinfo() if you really need it.
Greets, Jeroen
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077? Mark -- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
Mark Andrews wrote: [..]
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
::ffff:0:0/96 should never ever be shown to a user, as it is confusing (is it IPv6 or IPv4?) and does not make sense at all. As such, whatever one thinks of it, it is "illegal" in that context. Internally, inside a program, using a 128-bit sequence of memory is of course a great way to store both IPv6 and IPv4 addresses in one structure, and that is what the ::ffff:0:0/96 format is very useful and intended for. Of course, the representation shown to the user for addresses stored that way would still be 77.77.77.77 (and thus an IPv4 address and not IPv6), even though internally it is stored as an IPv6 address. As that usage is internal, you don't need any validation of the format, as the input will be either an IPv6 or IPv4 address without any of the compatibility stuff; thus one does not need to handle it anyway. Of course, there should be only limited places where a user can enter or see IP addresses in the first place. There is this great thing called DNS which is what most people should be using. Greets, Jeroen
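The internal-storage idea above can be sketched in Python with the standard ipaddress module. The function names to_storage and for_display are hypothetical, and the mapped prefix is written as ::ffff:0:0/96, the RFC 4291 notation; this is an illustration of the approach, not anyone's production code.

```python
import ipaddress


def to_storage(text):
    """Pack an IPv4 or IPv6 literal into 16 bytes, placing IPv4
    addresses in the IPv4-mapped range ::ffff:0:0/96."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        addr = ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr.packed


def for_display(raw):
    """Render 16 stored bytes, showing mapped addresses as plain IPv4."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped  # None unless addr is in ::ffff:0:0/96
    return str(mapped) if mapped is not None else str(addr)


if __name__ == "__main__":
    for text in ("77.77.77.77", "2001:db8::1"):
        print(text, "->", to_storage(text).hex(), "->", for_display(to_storage(text)))
```

Both address families share one 16-byte field, and the mapped form never has to reach the user.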
In message <4B6B7185.2080708@spaghetti.zurich.ibm.com>, Jeroen Massar writes:
Mark Andrews wrote: [..]
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
::ffff:0:0/96 should never ever be shown to a user, as it is confusing (is it IPv6 or IPv4?) and does not make sense at all. As such, whatever one thinks of it, it is "illegal" in that context.
Internally, inside a program, using a 128-bit sequence of memory is of course a great way to store both IPv6 and IPv4 addresses in one structure, and that is what the ::ffff:0:0/96 format is very useful and intended for. Of course, the representation shown to the user for addresses stored that way would still be 77.77.77.77 (and thus an IPv4 address and not IPv6), even though internally it is stored as an IPv6 address.
You missed the point: 077 is octal, so 077.077.077.077 is 63.63.63.63 as an IPv4 address, whereas it is decimal dotted quad in a mapped address *if* zero-padded decimal dotted quad is legal in an IPv6 text form.
As that usage is internal, you don't need any validation of the format as the input will be either an IPv6 or IPv4 address without any of the compatibility stuff, thus one does not need to handle it anyway.
Of course, there should be only limited places where a user can enter or see IP addresses in the first place. There is this great thing called DNS which is what most people should be using.
Greets, Jeroen
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
::ffff:0:0/96 should never ever be shown to a user, as it is confusing (is it IPv6 or IPv4?) and does not make sense at all. As such, whatever one thinks of it, it is "illegal" in that context.
Define "user"? Both Cisco and Juniper use these addresses for IPv6 L3VPNs, and the addresses are definitely visible. Cisco and Juniper examples:

B   2001:abcd:60:3::/64 [200/0]
     via ::ffff:172.16.101.204 (nexthop in vrf default), 4d10h
B   2001:abcd:60:4::/64 [200/0]
     via ::ffff:172.16.101.205 (nexthop in vrf default), 4d10h
B   2001:abcd:60:7::/64 [200/0]
     via ::ffff:172.16.1.7 (nexthop in vrf default), 6d13h

::ffff:172.16.1.1/128
    *[LDP/6] 4d 11:01:30, metric 1
    > to 172.16.102.201 via ge-0/3/0.0, Push 313008
::ffff:172.16.1.2/128
    *[LDP/6] 1w0d 20:27:12, metric 1
    > to 172.16.102.201 via ge-0/3/0.0, Push 312240
::ffff:172.16.1.3/128
    *[LDP/6] 4d 11:01:30, metric 1
    > to 172.16.102.201 via ge-0/3/0.0, Push 313024

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
On Fri, Feb 5, 2010 at 12:15 AM, <sthaug@nethelp.no> wrote:
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
Wasn't there an internet draft on that subject, recently? http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04

077.077.077.077 is equivalent to 77.77.77.77, if valid at all. RFC 4038 is very clear that the text representation of a mapped IPv4 address is base 10. http://tools.ietf.org/html/rfc4038#section-5.1

This is a bit like asking if "::ffff:10.1.2" is a valid IP address, though. And is it the same as the IP address "10.1.2"? (Which of course expands to 10.1.0.2 on common implementations of inet_pton, inet_aton, and getaddrinfo.) Or ::ffff:0xA010002?

I would say these are perfectly valid _shorthands_ and abbreviations for entering an IP address, which may be provided by some systems, but that they are non-canonical text representations for displaying, publishing, or sharing IP addresses.

-- -J
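To make the shorthand expansion concrete, here is a small Python sketch of the classic BSD inet_aton() convention, in which the last component fills all remaining bytes. It is a pure-Python illustration with hypothetical function names (classic_part, expand_classic), not the code of any particular libc, and real implementations differ in what they accept.

```python
def classic_part(part):
    """Parse one component with C-style prefixes: 0x means hex,
    a leading 0 means octal, anything else is decimal."""
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif len(part) > 1 and part[0] == "0":
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    allowed = "0123456789abcdef"[:base]
    if not digits or any(c not in allowed for c in digits.lower()):
        raise ValueError("bad component: %r" % part)
    return int(digits, base)


def expand_classic(text):
    """Expand a 1- to 4-part IPv4 shorthand to a four-octet dotted quad."""
    parts = [classic_part(p) for p in text.split(".")]
    if len(parts) > 4:
        raise ValueError("too many components")
    *head, last = parts
    tail_bytes = 4 - len(head)  # the last part fills the remaining bytes
    if any(p > 255 for p in head) or last >= 256 ** tail_bytes:
        raise ValueError("component out of range")
    octets = head + list(last.to_bytes(tail_bytes, "big"))
    return ".".join(str(o) for o in octets)


if __name__ == "__main__":
    for text in ("10.1.2", "0xA010002", "10.1.0.2"):
        print(text, "->", expand_classic(text))
```

All three inputs in the usage example name the same address, which is why only one of them should be treated as the canonical text form.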
In message <6eb799ab1002061452s51f9cf61p303d36130291301@mail.gmail.com>, James Hess writes:
On Fri, Feb 5, 2010 at 12:15 AM, <sthaug@nethelp.no> wrote:
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
Wasn't there an internet draft on that subject, recently? http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04
077.077.077.077 is equivalent to 77.77.77.77, if valid at all. RFC 4038 is very clear that the text representation of a mapped IPv4 address is base 10. http://tools.ietf.org/html/rfc4038#section-5.1
But 077.077.077.077 is octal dotted quad. Decimal dotted quad does *not* have leading zeros. The point of allowing dotted quad is to allow easy mapping between the IPv4 representation and IPv6 representations with an encoded IPv4 address. Accepting an octal representation as decimal is a bad thing and leads to non-obvious failures.

% ping 077.077.077.077
PING 077.077.077.077 (63.63.63.63): 56 data bytes
^C
--- 077.077.077.077 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
%

"ping ::ffff:077.077.077.077" would not get to the same box if my ping accepted that as an address literal, which luckily it doesn't.
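The mismatch can be reproduced without touching the network. The Python sketch below reads the same dotted quad under the two interpretations in dispute; both function names are hypothetical, and neither is meant to mirror a specific system parser.

```python
def octets_decimal(dotted):
    """Read every part as base 10, ignoring leading zeros
    (the reading RFC 4038 is cited for in this thread)."""
    return tuple(int(p, 10) for p in dotted.split("."))


def octets_c_style(dotted):
    """Read a part with a leading zero as octal, as classic IPv4 parsers do."""
    return tuple(
        int(p, 8) if len(p) > 1 and p.startswith("0") else int(p, 10)
        for p in dotted.split(".")
    )


if __name__ == "__main__":
    text = "077.077.077.077"
    print("decimal reading:", octets_decimal(text))
    print("C-style reading:", octets_c_style(text))
    # The same text names two different hosts depending on the reader.
    print("same host?", octets_decimal(text) == octets_c_style(text))
```

In this sketch an input such as 080.080.080.080 raises ValueError under the C-style reading, since 8 is not an octal digit.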
This is a bit like asking if "::ffff:10.1.2" is a valid IP address though.
Except it clearly isn't as there are not 4 components.
And is it the same as the ip address "10.1.2" ?
(Which of course expands to 10.1.0.2, on common implementations of inet_pton, inet_aton, and getaddrinfo) Or ::ffff:0xA010002
inet_pton() did not accept 10.1.2 when it was originally written. This was a *deliberate* decision. Some vendors have changed it to accept it but they are wrong. I can say that because I was involved in making that decision.
I would say these are perfectly valid _shorthands_ and abbreviations for entering an IP address, which may be provided by some systems, but that they are non-canonical text representations for displaying, publishing, or sharing IP addresses.
-- -J
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
I Just Don't Know What To Do With Myself ----- Original Message ---- From: Jeroen Massar <jeroen@unfix.org> To: Mark Andrews <marka@isc.org> Cc: nanog@nanog.org; Richard E. Brown <Richard.E.Brown@dartware.com> Sent: Fri, February 5, 2010 1:16:53 AM Subject: Re: Regular Expression for IPv6 addresses Mark Andrews wrote: [..]
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
::ffff:0:0/96 should never ever be shown to a user, as it is confusing (is it IPv6 or IPv4?) and does not make sense at all. As such, whatever one thinks of it, it is "illegal" in that context. Internally, inside a program, using a 128-bit sequence of memory is of course a great way to store both IPv6 and IPv4 addresses in one structure, and that is what the ::ffff:0:0/96 format is very useful and intended for. Of course, the representation shown to the user for addresses stored that way would still be 77.77.77.77 (and thus an IPv4 address and not IPv6), even though internally it is stored as an IPv6 address. As that usage is internal, you don't need any validation of the format, as the input will be either an IPv6 or IPv4 address without any of the compatibility stuff; thus one does not need to handle it anyway. Of course, there should be only limited places where a user can enter or see IP addresses in the first place. There is this great thing called DNS which is what most people should be using. Greets, Jeroen
On Fri, 5 Feb 2010, Mark Andrews wrote:
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
Forget IPv6. The first question is does 077.077.077.077 match 077.077.077.077 in IPv4?

The answer is a long one full of different answers depending on who's doing the parsing (gethostbyname(), inet_aton(), inet_net_pton(), etc.) and on what OS. And also on many bugs.

And don't count on the documentation being right either, or parsers respecting standards (single unix or RFCs, or which one when they conflict). And don't expect an error code if you feed 080.080.080.080 into a parser, even one that *does* read it as octal.

Don't prefix IP (v4) address octets with zero, whether you expect it to be treated as octal or not. Just don't. World of hurt and all that.

E.g.: http://kerneltrap.org/mailarchive/openbsd-bugs/2009/6/6/5882713/thread

We should all do like one vendor I've seen where you enter the IP (v4) address in binary... and then pad with zeroes to whatever size the HTML form wanted. Yes, this decade.

--------- typedef struct me_s { char name[] = { "Thomas Habets" }; char email[] = { "thomas@habets.pp.se" }; char kernel[] = { "Linux" }; char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" }; char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" }; char coolcmd[] = { "echo '. ./_&. ./_'>_;. ./_" }; } me_t;
In message <alpine.DEB.1.10.1002091548170.25663@red.crap.retrofitta.se>, Thomas Habets writes:
On Fri, 5 Feb 2010, Mark Andrews wrote:
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
Forget IPv6. The first question is does 077.077.077.077 match 077.077.077.077 in IPv4?
I think you meant "does 077.077.077.077 match 77.77.77.77 in IPv4".
The answer is a long one full of different answers depending on who's doing the parsing (gethostbyname(), inet_aton(), inet_net_pton(), etc..) and on what OS. And also on many bugs.
Indeed. It's a minefield out there for application developers that want consistency. Even when you develop your own parser, some OS vendor will go and stuff it up on you.
And don't count on the documentation being right either, or parsers respecting standards (single unix or RFCs, or which one when they conflict). And don't expect an error code if you feed 080.080.080.080 into a parser, even one that *does* read it as octal.
Don't prefix IP (v4) address octets with zero, whether you expect it to be treated as octal or not. Just don't. World of hurt and all that.
E.g.: http://kerneltrap.org/mailarchive/openbsd-bugs/2009/6/6/5882713/thread
We should all do like one vendor I've seen where you enter the IP (v4) address in binary... and then pad with zeroes to whatever size html form wanted. Yes, this decade.
--------- typedef struct me_s { char name[] = { "Thomas Habets" }; char email[] = { "thomas@habets.pp.se" }; char kernel[] = { "Linux" }; char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" }; char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" }; char coolcmd[] = { "echo '. ./_&. ./_'>_;. ./_" }; } me_t;
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Wed, 10 Feb 2010 09:12:11 +1100, Mark Andrews said:
In message <alpine.DEB.1.10.1002091548170.25663@red.crap.retrofitta.se>, Thomas Habets writes:
On Fri, 5 Feb 2010, Mark Andrews wrote:
And now for the trick question. Is ::ffff:077.077.077.077 a legal mapped address and if it is, does it match 077.077.077.077?
Forget IPv6. The first question is does 077.077.077.077 match 077.077.077.077 in IPv4?
I think you meant "does 077.077.077.077 match 77.77.77.77 in IPv4".
No, he had it right, because...
The answer is a long one full of different answers depending on who's doing the parsing (gethostbyname(), inet_aton(), inet_net_pton(), etc..) and on what OS. And also on many bugs.
Indeed. It's a minefield out there for application developers that want consistency. Even when you develop your own parser, some OS vendor will go and stuff it up on you.
There's no guarantee that 2 different binaries on the same box will resolve 077.077.077.077 to the same 32-bit sequence, so it's in fact possible that it's not even equal to itself, much less 77.77.77.77.
On 2010-02-04 at 17:50 -0500, Richard E. Brown wrote:
My company, Dartware, have derived a regex for testing whether an IPv6 address is correct. I've posted it in my blog:
http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6
This has links to the regular expression, a (Perl) program that tests various correct and malformed addresses, and a Ruby implementation of the same.
There's a full grammar in RFC 3986 (URI Generic Syntax) already, which can be translated straight. It too handles the embedded IPv4 addresses. While your code is written in a more condensed manner, those who want to be able to cross-check against the RFC might want to take a look at this one, which emits a PCRE regexp:

http://people.spodhuis.org/phil.pennock/software/emit_ipv6_regexp-0.304
http://people.spodhuis.org/phil.pennock/software/emit_ipv6_regexp-0.304.asc

(Version numbers for repository, not for that one script :) ).

FWIW, the ability to grab a shell variable which contains an RE for IPv6 addresses, which can be used in:

pcregrep "$ipv6_regex" log_file

has proven very useful, especially when debugging newly-added IPv6 support for an app. This is also the most coherent justification I've come up with so far for using a regexp instead of a dedicated parser, other than "because I could".

Regards, -Phil
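For readers who want to see what a straight translation of the RFC 3986 grammar looks like, here is a Python sketch. It is not the script linked above; it transcribes the IPv6address, h16, ls32, IPv4address and dec-octet rules from RFC 3986 section 3.2.2, and the names IPV6_RE and is_ipv6 are made up for this example.

```python
import re

# Building blocks from RFC 3986, section 3.2.2.
H16 = r"[0-9A-Fa-f]{1,4}"
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4 = rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}"
LS32 = rf"(?:{H16}:{H16}|{IPV4})"


def _prefix(n):
    """ABNF [ *n( h16 ":" ) h16 ]: optional groups before the '::'."""
    return rf"(?:(?:{H16}:){{0,{n}}}{H16})?"


# One alternative per line of the IPv6address rule, in RFC order.
ALTERNATIVES = [
    rf"(?:{H16}:){{6}}{LS32}",
    rf"::(?:{H16}:){{5}}{LS32}",
    rf"{_prefix(0)}::(?:{H16}:){{4}}{LS32}",
    rf"{_prefix(1)}::(?:{H16}:){{3}}{LS32}",
    rf"{_prefix(2)}::(?:{H16}:){{2}}{LS32}",
    rf"{_prefix(3)}::{H16}:{LS32}",
    rf"{_prefix(4)}::{LS32}",
    rf"{_prefix(5)}::{H16}",
    rf"{_prefix(6)}::",
]
IPV6_RE = re.compile("(?:" + "|".join(ALTERNATIVES) + ")")


def is_ipv6(text):
    """True if the whole string matches the RFC 3986 IPv6address rule."""
    return IPV6_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ("2001:db8::1", "::ffff:172.16.1.1", "::ffff:077.077.077.077", "1::2::3"):
        print(text, "->", is_ipv6(text))
```

Because dec-octet has no zero-padded form, this grammar rejects ::ffff:077.077.077.077 outright, which sidesteps the octal question discussed earlier in the thread.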
participants (9)
- isabel dias
- James Hess
- Jeroen Massar
- Mark Andrews
- Phil Pennock
- Richard.E.Brown@dartware.com
- sthaug@nethelp.no
- Thomas Habets
- Valdis.Kletnieks@vt.edu