TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?

Folks,

We are troubleshooting a strange issue where some of our customers cannot establish a connection with our HTTP front end. We checked the usual things (routing, interface errors, security policy configuration) and opened support tickets with the load balancer vendor, all to no avail so far, so we took packet captures. In one capture we receive a SYN and reply with a SYN-ACK, but the client never receives that SYN-ACK. In another instance the 3-way handshake completes and the client sends us a TLS Client Hello; we reply with a TLS Server Hello, and that Server Hello never makes it to the client. This affects only a small subset of customers, which suggests it's not the load balancer or the edge routing configuration (in fact we can traceroute fine to the customer's IP).

The only remaining theory is that there are middle boxes out there that do not like IPs ending in .255. The service that the clients can't get to is hosted on two IPs ending in .255. Let's just say they are x.x.121.255 and x.x.125.255. We even stood up a basic "hello world" web server on x.x.124.255 with the same result. Standing up the very same basic web server on x.x.124.250 allows the client to succeed.

We have a friendly customer who has been working with us on troubleshooting the issue, and we have some pcaps from the client's side that somewhat confirm it's not the customer's system either. This customer is in a small five-person office with Spectrum business internet (that's the SYN-ACK case). The same customer tried hopping on his LTE hotspot, which came up as Cellco Partnership DBA Verizon Wireless, with the same result (that's the TLS Server Hello case). When that same customer drives a town over with the same workstation, he can get to the application fine (we are still waiting for the customer to let us know what the source IP is when it does work).

Before you suggest that those .255 addresses are broadcasts on some VLAN, they are not. They are injected as /32s using a routing protocol, while the VLAN addressing is all RFC1918 addressing.

--Andrey
On Mon, Sep 14, 2020 at 5:28 PM Andrey Khomyakov <khomyakov.andrey@gmail.com> wrote:
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
Windows XP/Windows 2003 both had an issue where addresses ending in .255 wouldn't work, regardless of the mask. It seems unlikely that there are middleboxes that are still that old kicking around (largely because they would likely have been 0wned and tossed out), but... There used to be a knowledge base article on this - http://support.microsoft.com/kb/281579 according to my bookmarks, but it has disappeared... W
-- I don't think the execution is relevant when it was obviously a bad idea in the first place. This is like putting rabid weasels in your pants, and later expressing regret at having chosen those particular rabid weasels and that pair of pants. ---maf
On 14/09/2020 22:25, Andrey Khomyakov wrote:
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
Yes, but not for many, MANY years. I would expect that this service might not like addresses ending in .0 either?

It was ca. 2010, when I started receiving an increasing number of complaints that connections from addresses ending in .0 or .255 were failing toward my (at the time) hosted services. This behaviour was eventually* narrowed to iptables rules carelessly included with 'Atomic Secured Linux' that purposely blackholed connections if the source address' most specific octet happened to contain .0 or .255.

I'm sure that 'ASL' wasn't the only piece of software to have shipped with this default behaviour, so should you discover any box of any sort, configuration (or age) blindly hampering the connectivity for addresses with all-1s or all-0s in any of the three most-specific octets, please take this as infallible permission to promptly introduce it to the nearest body of water. :)

* I still have AAISP - my home ISP at the time - to thank for routing me a /30 with a .255 address in it! It wouldn't have been as easy to resolve without that - very few UK consumers were being assigned addresses with .255 in them at the time.

-- Tom
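The kind of rule described above decides from the last octet alone that an address is a network or broadcast address. As a minimal sketch of why that cannot work, the Python below compares that heuristic with a check that takes the prefix length into account. The function names and the sample addresses (RFC 1918 and documentation ranges) are illustrative assumptions, not taken from the ASL ruleset or from anyone's real network.

```python
import ipaddress


def naive_is_edge(addr: str) -> bool:
    """Careless heuristic: flag any address whose last octet is 0 or 255."""
    return addr.rsplit(".", 1)[1] in ("0", "255")


def is_subnet_edge(addr: str, prefix_len: int) -> bool:
    """True only if addr is the network or broadcast address of its subnet."""
    if prefix_len >= 31:
        # /31 point-to-point links and /32 host routes have no
        # network or broadcast address to reserve.
        return False
    net = ipaddress.IPv4Network((addr, prefix_len), strict=False)
    return ipaddress.IPv4Address(addr) in (net.network_address,
                                           net.broadcast_address)


# Hypothetical sample addresses, chosen only to show where the two checks differ.
for addr, plen in [("10.0.1.255", 22), ("10.0.1.255", 32),
                   ("10.0.3.255", 22), ("203.0.113.127", 25)]:
    print(addr, f"/{plen}", naive_is_edge(addr), is_subnet_edge(addr, plen))
```

In this sketch 10.0.1.255 is an ordinary host address both inside a /22 and as a /32 route, yet the heuristic flags it, while 203.0.113.127 really is the broadcast address of its /25 and the heuristic misses it. A box in the middle of the path does not know the prefix length at either end, so it has no basis for making the call at all.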
Peacez On Tue, Sep 15, 2020, 12:26 AM Andrey Khomyakov <khomyakov.andrey@gmail.com> wrote:
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
Also .0 and .1.

Yes, there was some strange behavior with those addresses in the past. We excluded them from rotation back in 2011, when that was really biting us. My impression is that this issue has become much less troublesome over the years, though I haven't had time to investigate.

-- Töma
On Tue, Sep 15, 2020 at 8:26 AM Töma Gavrichenkov <ximaera@gmail.com> wrote:
Also .0 and .1.
Yep, I once had a customer (circa 2013–2014) who couldn't load https://www.stgeorge.com.au/ because they (a PPP-based user, where addressing is point to point, effectively a /32 at each end if you like) had an IP address ending in .0, despite it being in the middle of an otherwise larger pool. Some middlebox was forming opinions about an address it has no business forming an opinion about.
You may want to do traceroute using syn/ack packets to find the offending piece of equipment (may require modifying traceroute to set the syn and ack).
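One way to approximate a SYN/ACK traceroute without patching traceroute itself is to craft the segments by hand. The following is a rough Python sketch under several assumptions: Linux raw sockets, root privileges, and a host that actually owns the source address. The function names, ports and addresses are hypothetical placeholders. It sends a bare SYN-ACK with increasing TTL and reports whichever router answers with ICMP. The ICMP socket here does not check that a reply belongs to our probe, and the final host would normally answer an unsolicited SYN-ACK with a TCP RST rather than ICMP, so treat the output as a hint about where replies stop, not as proof.

```python
import socket
import struct

SYN_ACK = 0x12  # TCP flags: SYN (0x02) | ACK (0x10)


def checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_syn_ack(src_ip, dst_ip, src_port, dst_port, seq=0, ack=0) -> bytes:
    """Return a 20-byte TCP header with SYN+ACK set and a valid checksum."""
    offset_flags = (5 << 12) | SYN_ACK  # data offset of 5 words, no options
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_flags, 65535, 0, 0)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(header)))
    csum = checksum(pseudo + header)
    # The checksum field occupies bytes 16-17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:]


def trace_syn_ack(src_ip, dst_ip, src_port=443, dst_port=50000,
                  max_hops=30, timeout=2.0):
    """Yield (ttl, responding_router_or_None) for TTL 1..max_hops."""
    segment = build_syn_ack(src_ip, dst_ip, src_port, dst_port)
    with socket.socket(socket.AF_INET, socket.SOCK_RAW,
                       socket.IPPROTO_TCP) as tx, \
         socket.socket(socket.AF_INET, socket.SOCK_RAW,
                       socket.IPPROTO_ICMP) as rx:
        tx.bind((src_ip, 0))  # source must match the checksum's pseudo-header
        rx.settimeout(timeout)
        for ttl in range(1, max_hops + 1):
            tx.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
            tx.sendto(segment, (dst_ip, 0))
            try:
                _, (hop, _) = rx.recvfrom(512)
            except socket.timeout:
                hop = None
            yield ttl, hop


# Example (run as root on the server that owns the .255 address;
# both addresses below are placeholders):
#   for ttl, hop in trace_syn_ack("192.0.2.255", "198.51.100.10"):
#       print(ttl, hop or "*")
```

Running the same trace from the .255 address and from a neighbouring address such as .250, and comparing the last hop that answers, should narrow down where the .255-sourced packets disappear.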
On 15 Sep 2020, at 07:25, Andrey Khomyakov <khomyakov.andrey@gmail.com> wrote:
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On 9/14/20 2:25 PM, Andrey Khomyakov wrote:
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
Yes. We'd every so often get random complaints that "my friend can't reach my website but I can", etc., with not enough detail to track it down. The problem would disappear when we moved it to another IP address. Because of this, we stopped allocating customer websites on .0 and .255 IP addresses about 10 years ago, instead using them for internal / controlled access purposes where we could investigate any problems. (Which never occur. <shrug>) -- Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
We started using .0 and .255 again about two years ago. Here is what one NAS shows, 26 .255 users and 21 .0 users:

    asr1006-jn1#sh user | count \.255$
    Number of lines which match regexp = 26
    asr1006-jn1#sh user | count \.0$
    Number of lines which match regexp = 21

We do occasionally have to change an IP, but it is rare, and for the most part things just work. This is very different from 10 years ago, when it was impossible to use them and we needed to exclude them from our pools.

A plus: it is kind of fun when a super consultant calls and says he can't use a broadcast/network address for NAT or a VPN endpoint.

Brian
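For anyone wanting the same count from an exported session list instead of the router CLI, here is a small Python sketch of the equivalent regexp match. The session lines are invented for illustration and do not reflect real "sh user" output; the helper name is also an assumption.

```python
import re


def count_matching(lines, pattern):
    """Count lines matching a regexp, like the CLI's '| count' filter."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line.rstrip()))


# Hypothetical session list: one line per subscriber, address last.
sessions = [
    "Vi2.1  user1@example  PPPoE  00:01:02  100.64.3.255",
    "Vi2.2  user2@example  PPPoE  00:05:10  100.64.4.0",
    "Vi2.3  user3@example  PPPoE  00:00:42  100.64.4.10",
    "Vi2.4  user4@example  PPPoE  01:12:00  100.64.7.255",
]

print(count_matching(sessions, r"\.255$"))
print(count_matching(sessions, r"\.0$"))
```

The escaped dot matters: without it, a pattern like `.0$` would also count addresses ending in .10, .20 and so on.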
You could have them try the AWS EC2 reachability site to confirm whether this is the case: https://ec2-reachability.amazonaws.com/

Many of their test nodes end with .255 or .0. There are a few ending with 255.255 and several that end with 0.0. I'm not sure what the website test actually does (ICMP versus TCP test or something else), but you can also connect to those IPs (at least the two that I just tested) over port 80 to test the full handshake. You mentioned ClientHello/ServerHello; these nodes don't respond over port 443 (I only saw the SYN). Kinda makes sense given they're IP addresses.

-joe

From: NANOG <nanog-bounces+joe.klein=mischoice.com@nanog.org> On Behalf Of Andrey Khomyakov
Sent: Monday, September 14, 2020 16:26
To: Nanog <nanog@nanog.org>
Subject: IP addresses on subnet edge (/24)

TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
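The port 80 handshake test suggested above can be scripted so an affected customer can run it against several addresses in one go. This is a minimal Python sketch; the helper name is an assumption and the two addresses in the usage comment are documentation placeholders, to be replaced with real test node addresses (for example ones listed on the reachability page).

```python
import socket


def tcp_handshake_ok(host, port=80, timeout=5.0):
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, resets, refusals and unreachable networks alike.
        return False


# Example usage with placeholder addresses; compare a .255 target with a
# neighbouring address that does not end in .255:
#   for target in ("192.0.2.255", "192.0.2.250"):
#       print(target, tcp_handshake_ok(target))
```

A completed handshake only shows that the SYN-ACK made it back. The TLS Server Hello case from the original report would need a further test that exchanges data after the connection is established.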
participants (9)
- Andrey Khomyakov
- Brian Turnbow
- Jeremy Visser
- Joe Klein
- Mark Andrews
- Robert L Mathews
- Tom Hill
- Töma Gavrichenkov
- Warren Kumari