Telephone call gapping by the major long distance carriers into the region seemed to be in effect for a while. I don't believe this is one of the five critical Mississippi River fiber crossing points, so Internet traffic appears mostly unaffected.
For four months, dozens of our users who are Comcast subscribers have had difficulty reaching St. Olaf College's and Carleton College's network services.
We have worked through everything we can think of with our Onvoy (regional ISP) network engineers. We have isolated the problem to a couple of Comcast's IP subnets, but need a contact within Comcast to troubleshoot further.
The behavior in a nutshell:
--
User A on Comcast Subnet B browses to www.stolaf.edu (http or https; other web sites on-site and @carleton.edu behave the same). Our access_log shows an initial "GET /" of our homepage, then very slow (if any) subsequent requests (for our stylesheet or homepage images). Pings look fine; traceroutes look reasonable. Telnets to port 80 and other services do seem to respond, albeit very slowly.
User A has the same problem with access @carleton.edu but can access everything else (including other Onvoy customers) without any trouble whatsoever.
If User A then removes his Linksys router and connects his computer directly to the cable modem, he acquires an IP address in Comcast Subnet C. Then everything works fine, including access to www.stolaf.edu and www.carleton.edu. He puts the Linksys router back in (which still has the IP address in Comcast Subnet B), and the problem returns.
The problem IP subnets are completely consistent.
Known WORKING IP Subnets: 75.72.0.0, 24.x
Known NON-WORKING IP Subnets: 71.x, 73.x
--
We have already attempted the usual troubleshooting and have eliminated user problems, computer problems, server problems, cable modem problems, and Linksys router problems. Traceroutes have been somewhat inconclusive since Onvoy blocks ICMP within its network.
So, why just St. Olaf and Carleton services? We are on a shared physical link from Onvoy, though on different VLANs. Onvoy has verified everything they can (routing, packet loss, etc.) between them and us, and I'm not sure what additional questions I can ask of them to test. Suggestions?
Maybe Comcast has a broken transparent proxy on part(s) of their network? But they have told us they have nothing like this anywhere on their network.
Maybe there is some asymmetric routing somewhere, though all the investigation there has come up empty.
A third possibility is some kind of packet loss, but there is little if any evidence of that.
So, we are really at a loss and seek any suggestions you all might have. And a contact in Comcast network engineering would be especially useful to continue our troubleshooting.
With thanks,
Craig
--
Craig D. Rice                      Associate Director of Information Systems
cdr@stolaf.edu                     Information and Instructional Technologies
+1 507 786-3631                    St. Olaf College
+1 507 786-3096 FAX                1510 St. Olaf Avenue
http://www.stolaf.edu/people/cdr   Northfield, MN 55057-1097 USA
I've forwarded your message to the appropriate team within Comcast. - Alain.
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Craig D. Rice Sent: Thursday, August 02, 2007 9:30 AM To: nanog@merit.edu Subject: Seeking Comcast Contact: need to troubleshoot packet loss and/or asymmetric routing issue between Comcast & Onvoy
On 8/2/07, Craig D. Rice <cdr@stolaf.edu> wrote:
We have already attempted the usual troubleshooting and have eliminated user problems, computer problems, server problems, cable modem problems, and Linksys router problems. Traceroutes have been somewhat inconclusive since Onvoy blocks ICMP within its network.
Craig,
This rings a bell. Do they block the mandatory ICMP fragmentation-needed messages?
Try this: on your web server, reduce the MTU from 1500 bytes to 1400 bytes and see if the affected Comcast users can now access your web server.
Regards,
Bill Herrin
--
William D. Herrin   herrin@dirtside.com   bill@herrin.us
3005 Crane Dr.      Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
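The arithmetic behind this suggestion can be sketched roughly as follows. It assumes IPv4 and TCP headers of 20 bytes each with no options (an assumption; options shrink the payload further), and the name `tcp_mss` is only illustrative. A host with a lower interface MTU advertises a smaller MSS, so the segments exchanged on its connections fit through a narrower hop without fragmentation.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    if mtu <= IPV4_HEADER + TCP_HEADER:
        raise ValueError("MTU too small to carry any TCP payload")
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # 1500 is the usual Ethernet MTU; 1400 is the test value suggested above.
    for mtu in (1500, 1400):
        print(f"MTU {mtu} -> MSS {tcp_mss(mtu)}")
```

If the affected users can load pages once the server MTU is lowered, that points toward a hop somewhere on the path with an MTU below 1500 and lost fragmentation-needed messages, rather than a proxy or routing problem.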
At 09:30 AM 8/2/2007, Craig D. Rice wrote:
For four months dozens of our users who are Comcast subscribers have had difficulty reaching St. Olaf College's and Carleton College's network services.
We have worked through everything we can think of with our Onvoy (regional ISP) network engineers. We have isolated the problem to a couple of Comcast's IP subnets, but need a contact within Comcast to further troubleshoot.
(snip)
Either your firewall/router or the customer's firewall/router is blocking PMTUD packets: fragmentation needed, but the don't-fragment bit is set. Look at your ICMP access list and make sure you are allowing the following from any Internet address:
    permit icmp any any unreachable
I suspect an overzealous firewall admin is blocking all ICMP. Read the acronym to him/her and explain that some ICMP is necessary for the Internet to work.
-Robert
Tellurian Networks - Global Hosting Solutions Since 1995
http://www.tellurian.com | 888-TELLURIAN | 973-300-9211
"Well done is better than well said." - Benjamin Franklin
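A toy model may help show why this failure mode matches the reported symptoms (small requests succeed, larger responses stall). The sketch below is not a measurement of the actual path: the hop MTUs, including the 1492-byte hop, are hypothetical, and `deliver` is an illustrative name. It models a sender that sets DF and lowers its packet size only when it hears an ICMP "fragmentation needed" (type 3, code 4) message.

```python
from typing import List, Optional


def deliver(packet_size: int, hop_mtus: List[int], icmp_allowed: bool,
            max_tries: int = 5) -> Optional[int]:
    """Return the packet size that finally crosses the path, or None
    if the packet is silently dropped (a PMTUD black hole)."""
    size = packet_size
    for _ in range(max_tries):
        # First hop whose MTU is too small for a DF packet of this size.
        bottleneck = next((mtu for mtu in hop_mtus if mtu < size), None)
        if bottleneck is None:
            return size  # fits every hop, delivered
        if not icmp_allowed:
            # The router drops the packet, but its ICMP error is filtered,
            # so the sender keeps retransmitting at the same size.
            return None
        size = bottleneck  # sender learns the smaller MTU and resends
    return None


if __name__ == "__main__":
    path = [1500, 1492, 1500]  # hypothetical hop MTUs
    print(deliver(1500, path, icmp_allowed=True))
    print(deliver(1500, path, icmp_allowed=False))
    print(deliver(600, path, icmp_allowed=False))
```

In this model a short request fits every hop regardless of filtering, while a full-size response disappears whenever the ICMP error cannot get back to the sender.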
Robert Boyle wrote:
Either your firewall/router or the customer's firewall/router is blocking PMTUD packets..... I suspect an overzealous firewall admin is blocking all icmp.
Which you can't do anything about if the overzealous firewall admin is at the other end of the connection.
My repeated, first-hand experience has been that several of the better-known web sites out there will happily send out 1500-byte packets with DF set, then ignore the DEST_UNREACH/FRAG_NEEDED ICMP responses they get. If you're on the client end of this, you're sunk unless you initiate the connection specifying a lower MSS.
Linux has a nifty iptables option (clamp-mss-to-pmtu) to rewrite the MSS in TCP SYN packets when forwarding a packet onto a link with a lower MTU than the MSS in the packet. Works like a charm. If every packet forwarding device on the Internet did this, PMTUD would not be needed. As is, PMTUD is simply broken, due to widespread firewall misconfiguration.
As in so many other cases of Internet misbehavior, you can avoid being part of the problem, but you can't be the solution.
Jim Shankland
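To make the MSS rewrite concrete, here is a minimal Python sketch of the idea, not the kernel's implementation. It walks the TCP options of a SYN and lowers the MSS option (kind 2) when it exceeds what the outgoing link can carry. The function name `clamp_mss_option` and the 40-byte header overhead (IPv4 plus TCP, no options) are assumptions made for illustration; a real forwarder would also have to update the TCP checksum.

```python
import struct

TCP_OPT_EOL = 0  # end of option list
TCP_OPT_NOP = 1  # one-byte padding
TCP_OPT_MSS = 2  # maximum segment size: kind, length 4, 16-bit value


def clamp_mss_option(options: bytes, out_mtu: int, overhead: int = 40) -> bytes:
    """Return a copy of a SYN's TCP options with the MSS lowered to fit out_mtu."""
    limit = out_mtu - overhead
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == TCP_OPT_EOL:
            break
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break  # truncated option, leave the rest untouched
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break  # malformed option, leave the rest untouched
        if kind == TCP_OPT_MSS and length == 4:
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > limit:
                struct.pack_into("!H", out, i + 2, limit)
        i += length
    return bytes(out)


if __name__ == "__main__":
    # MSS 1460, NOP, window scale 7
    syn_options = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7])
    print(clamp_mss_option(syn_options, out_mtu=1400).hex())
```

Because both endpoints then agree on a segment size that fits the narrow link, the connection never needs the ICMP feedback that PMTUD depends on. This covers TCP only.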
On Thu, Aug 02, 2007, Jim Shankland wrote:
Linux has a nifty iptables option (clamp-mss-to-pmtu) to rewrite the MSS in TCP SYN packets when forwarding a packet onto a link with a lower MTU than the MSS in the packet. Works like a charm. If every packet forwarding device on the Internet did this, PMTUD would not be needed. As is, PMTUD is simply broken, due to widespread firewall misconfiguration. As in so many other cases of Internet misbehavior, you can avoid being part of the problem, but you can't be the solution.
.. non-TCP traffic?
Adrian
Adrian Chadd wrote:
On Thu, Aug 02, 2007, Jim Shankland wrote:
Linux has a nifty iptables option (clamp-mss-to-pmtu) to rewrite the MSS in TCP SYN packets when forwarding a packet onto a link with a lower MTU than the MSS in the packet. Works like a charm. If every packet forwarding device on the Internet did this, PMTUD would not be needed. As is, PMTUD is simply broken, due to widespread firewall misconfiguration. As in so many other cases of Internet misbehavior, you can avoid being part of the problem, but you can't be the solution.
.. non-TCP traffic?
Hmm; I've never actually heard of anybody doing PMTUD on non-TCP traffic, though it's possible. Does anybody actually do it?
Jim Shankland
On Thu, 02 Aug 2007 18:33:16 PDT, Jim Shankland said:
Hmm; I've never actually heard of anybody doing PMTUD on non-TCP traffic, though it's possible. Does anybody actually do it?
AIX 5.2 and earlier supported it for UDP (we're getting out of the AIX business, so I can't speak to what 5.3 does). Basically, it would send out a gratuitous 64K ICMP Echo Request with DF set, and wait to see what came back. I ended up turning it off all over, simply because we didn't have enough UDP-based services that actually hit frag issues to make a difference.
---
'man no' (Network Options) says:
    no { -a | -d Attribute | -o Attribute [ =NewValue ] }
    udp_pmtu_discover
        Enables or disables path MTU discovery for UDP applications. UDP applications must be specifically written to utilize path MTU discovery. A value of 0 disables the feature, while a value of 1 enables it. This attribute only applies to AIX 4.2.1 or later. udp_pmtu_discover is a runtime attribute. In versions prior to AIX 4.3.3, the default value is 0 (disabled); in AIX 4.3.3 and later versions, the default value is 1 (enabled).
---
The manpage lies: a program has to be specifically written to *benefit from* PMTUD. The system would go ahead and do discovery anyway, and then 98% of the UDP programs wouldn't change their behavior. So all you got was lots of gratuitous ICMP mobygrams.
It *may* have made a small difference during a short window, when NFS-over-TCP support was still rare, and the 4500-octet FDDI MTU was sometimes to be found.
On 8/2/07, Valdis.Kletnieks@vt.edu <Valdis.Kletnieks@vt.edu> wrote:
AIX 5.2 and earlier supported it for UDP (we're getting out of the AIX business, so I can't speak to what 5.3 does). Basically, it would send out a gratuitous 64K ICMP Echo Request with DF set, and wait to see what came back.
AIX 5.3 changes the whole PMTU scheme to work more like the RFCs and not depend on out of band ICMP packets.
From http://www.ibm.com/developerworks/aix/library/au-aix5l-me.html :
PMTU discovery in AIX 5L Version 5.3
The current Path Maximum Transmission Unit (PMTU) discovery implementation uses ICMP Echo Request and ICMP Echo Reply packets to discover PMTU. Some system administrators set up their firewall to drop ICMP Echo packets, resulting in the above method of PMTU discovery to fail. The PMTU discovery mechanism in AIX 5L Version 5.3 is implemented with TCP packets and UDP datagram instead of ICMP Echo packets.
participants (9)
- Adrian Chadd
- Craig D. Rice
- Duane Waddle
- Durand, Alain
- Jim Shankland
- Robert Boyle
- Sean Donelan
- Valdis.Kletnieks@vt.edu
- William Herrin