I have run into a problem that has me completely stumped, so I'm tossing it out to NANOG for some help.

Before I lay out the specifics, I'm not trying to point fingers at any particular ISP or vendor here, but this problem only exhibits itself in very specific configurations. Unfortunately, the configuration is common enough as to get unwanted attention from the higher-ups.

Here's the particulars:

Users that have Verizon DSL and a Linksys cable/DSL router have difficulties accessing sites on my network -- whether they are trying with http, https, smtp, pop3, ssh, ftp, etc., etc. Oh, but pings seem to be fine. Low latency, no loss. This is true even for access to a server brought up in the DMZ, to keep the firewalls out of the equation.

Doing some packet sniffing on the ethernet side of my router, I could see specific http requests never showed up (and the user saw the broken image icon). This was for an mrtg graph page with +/- 30 images. I saw the request for almost all the image files, save for one and the user reported the broken image icon for the one. So this looks and smells like a packet loss issue..... but who/where/how?

Taking the Linksys out of the picture (connecting their PC directly to the Verizon DSL modem) makes the problem go away.

These same users report no trouble whatsoever accessing many other common sites across the internet.

Here's another interesting data point: when one user runs Morpheus (on any machine in his home network) he then has absolutely no problems accessing servers/services on my network.

Other users with Linksys routers and, say cable modem, do not have this problem!

So I'm looking for some pointers. What could I have done to my edge router (a Cisco 3640 if that helps any) that would make it drop packets from Verizon DSL customers with Linksys routers so long as they aren't running Morpheus?

Mark J. Scheller (scheller@u1.net)
Are there sub-1500-byte MTUs anywhere, and is one of the devices (the Linksys?) dropping the relevant ICMP "fragmentation needed" messages? Morpheus might be working because it doesn't set the DF bit.

Just a possibility. Test by removing any filtering of ICMP.

Steve
Could this be a packet size issue? You might try ping -s and see if, say, 1500-byte and 4500-byte packets get through.
Regards Marshall Eubanks T.M. Eubanks Multicast Technologies, Inc 10301 Democracy Lane, Suite 410 Fairfax, Virginia 22030 Phone : 703-293-9624 Fax : 703-293-9609 e-mail : tme@multicasttech.com http://www.multicasttech.com Test your network for multicast : http://www.multicasttech.com/mt/ Status of Multicast on the Web : http://www.multicasttech.com/status/index.html
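A rough sketch of the ping sweep suggested above, written in Python. It only builds command lines and does not run them. The `-s` (payload size), `-c` (count) and `-M do` (set the DF bit) flags are those of Linux iputils ping, which is an assumption about the tester's platform, and `www.example.net` is a placeholder host. The 28 bytes subtracted are the 20-byte IPv4 header (no options) plus the 8-byte ICMP echo header.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def payload_for_packet(packet_size):
    """ICMP payload size that produces an IP packet of packet_size bytes."""
    if packet_size < IPV4_HEADER + ICMP_HEADER:
        raise ValueError("packet too small to hold the IP and ICMP headers")
    return packet_size - IPV4_HEADER - ICMP_HEADER


def ping_command(host, packet_size, dont_fragment=True):
    """Build (not run) a Linux iputils ping command for one packet size."""
    cmd = ["ping", "-c", "3", "-s", str(payload_for_packet(packet_size))]
    if dont_fragment:
        cmd += ["-M", "do"]  # set DF so an oversized packet is not fragmented
    return cmd + [host]


if __name__ == "__main__":
    # Sizes chosen to straddle the 1492-byte PPPoE MTU discussed in the thread.
    for size in (1400, 1492, 1500):
        print(" ".join(ping_command("www.example.net", size)))
```

If the 1492-byte probe gets replies and the 1500-byte probe does not, that would point toward a path MTU problem rather than random loss.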
On Tue, 21 Jan 2003, Mark J. Scheller wrote:

:: Here's the particulars:
::
:: Users that have Verizon DSL and a Linksys cable/DSL router have
:: difficulties accessing sites on my network -- whether they are trying
:: with http, https, smtp, pop3, ssh, ftp, etc., etc. Oh, but pings
:: seem to be fine. Low latency, no loss. This is true even for access
:: to a server brought up in the DMZ, to keep the firewalls out of the
:: equation.
::

Have the user update their linksys firmware. I see this problem all the time. Linksys soho gateways are notorious for their early firmware not sending fragments with proper headers. Any acl that does not allow *all frags* by default will deny their packets. There may be other issues as well, but the firmware update tends to fix all of the problems.

-jba

__ [jba@analogue.net] :: analogue.networks.nyc :: http://analogue.net
Definitely sounds like an MTU problem. I have seen IPSEC break across Verizon DSL with a Linksys router until the MTU on the PCs was dropped to just under 1500 bytes to allow for the IPSEC header.

DJ
-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Mark J. Scheller
Sent: Tuesday, January 21, 2003 5:26 PM
To: nanog@merit.edu
Subject: Stumper
The Linksys does have an MTU setting, and I've had my users try some lower settings to see if it made any difference. One user set the MTU on the Linksys as low as 1200 with no noticeable improvement.

Anything else I should look at?

mS (scheller@u1.net)
If the MTU is not helping, then go get the latest firmware. Also, you cannot use port forwarding on most Linksys routers with DHCP enabled. For those routers you have to give every machine a static address and turn off DHCP for port forwarding to work.
-- May God Bless you and everything you touch. My "foundation" verse: Isaiah 54:17 No weapon that is formed against thee shall prosper; and every tongue that shall rise against thee in judgment thou shalt condemn. This is the heritage of the servants of the LORD, and their righteousness is of me, saith the LORD.
This would depend on the direction of the packets that are dropped and where the broken device is. If the 1500-byte packets are coming in from the Internet and the Linksys needs to forward them onto a smaller-MTU medium but finds the DF bit set, it will return an ICMP "fragmentation needed" message. If this ICMP message is then dropped back at the client, you'll see what you describe.

If the Linksys, or the device in front of it, allows it, remove the DF bit from inbound packets.

Steve
On Tue, 21 Jan 2003, Mark J. Scheller wrote:
The Linksys does have an MTU setting, and I've had my users try some lower settings to see if it made any differences. One user set the MTU on the Linksys as low as 1200 with no noticeable improvement.
If you're using path MTU discovery (in other words, sending out packets with the DF bit set), it works like this:

The host on each end of the connection has an MTU configured in its TCP stack, so on initial connection (generally with very small syn/ack packets), the packet size gets negotiated and set to the lower of those two numbers. If all the router interfaces in between have an MTU equal or greater than the MTU that gets negotiated between the hosts, packets will continue to flow at that size without incident. In general, when you're dealing with two ethernet connected hosts with MTUs of 1500 bytes, and a bunch of routers in between with MTUs of greater than 1500, this is what happens.

However, if there's a network link in the path with an MTU smaller than the MTUs of the two end devices, the large packets sent by the end devices won't be able to pass through that link. Instead, the router with the small MTU link sends an ICMP response back to the sending host, requesting smaller packets. The sending host retries with progressively smaller packets until arriving at a size that works.

Therefore, I think the scenario that people are describing here is this: The user's computer is talking to the Linksys across a regular ethernet with an MTU of 1500. The host on your network probably also has an MTU of 1500. The Linksys is talking to the DSL provider via PPPoE, and thus has an MTU of 1492. The connection starts out with an initial MTU of 1500 in each direction, but 1500 byte packets can't pass through the 1492 byte MTU of the connection between the Linksys and the DSL provider. Therefore, the devices on the two ends of that link would be sending back ICMP messages requesting smaller packets. If all ICMP were being blocked somewhere, those ICMP messages wouldn't arrive, and the host that wasn't receiving them would keep obliviously sending out 1500 byte packets until the connection timed out.
But, if you were plugging the client computers directly into the DSL line and running PPPoE on them, you'd have the 1492 byte MTU negotiated from the start and everything would work.

In this scenario, decreasing the Linksys's MTU wouldn't help you, because the problem would already be that the MTU on the Linksys was smaller than the MTU on the end points. Decreasing the MTU on the end points would help. What would help even more would be fixing the ICMP filtering.

Now that I've said all that, this scenario doesn't really fit what you're seeing. You said your packet sniffer showed no packets coming across, but TCP connections don't generally start out with 1500 byte packets. In general, when you see a path MTU discovery issue, you see the connection being successfully opened, small packets (containing such small bits of data as "GET /") flowing freely, and then the connection freezing when a big burst of data gets sent for the first time.

Since that's not what you're seeing here, I'm more inclined to agree with those who have suggested upgrading the Linksys's firmware. I don't have any experience with that -- the Linksys NAT box on my home network works fine and I haven't had any reason to mess with it -- but it does seem like a far more plausible explanation for what you're seeing.

-Steve

--------------------------------------------------------------------------------
Steve Gibbard
scg@gibbard.org
+1 510 528-1263
http://www.gibbard.org/~scg
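The numbers in the scenario above can be worked through in a few lines of Python. This is only a toy model under stated assumptions, not a description of how any particular stack or the Linksys behaves: headers are taken as 20 bytes each for IPv4 and TCP (no options), PPPoE overhead as 8 bytes (6-byte PPPoE header plus 2-byte PPP protocol field), and the function names are made up for illustration.

```python
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20     # assuming no IP options
TCP_HEADER = 20      # assuming no TCP options


def pppoe_mtu(ethernet_mtu=ETHERNET_MTU):
    """MTU left for IP once PPPoE is carried inside an Ethernet frame."""
    return ethernet_mtu - PPPOE_OVERHEAD


def tcp_mss(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def send(packet_size, path_mtus, df=True, icmp_delivered=True):
    """Toy outcome for one packet crossing links with the given MTUs."""
    bottleneck = min(path_mtus)
    if packet_size <= bottleneck:
        return "delivered"
    if not df:
        return "fragmented and delivered"
    if icmp_delivered:
        return f"sender told to retry at {bottleneck} bytes"
    return "silently dropped (black hole)"


if __name__ == "__main__":
    path = [ETHERNET_MTU, pppoe_mtu(), ETHERNET_MTU]  # LAN, PPPoE link, server LAN
    print("PPPoE MTU:", pppoe_mtu(), "MSS at 1500:", tcp_mss(1500), "MSS at 1492:", tcp_mss(1492))
    print("small request:", send(60, path, icmp_delivered=False))
    print("full-size segment, ICMP filtered:", send(1500, path, icmp_delivered=False))
    print("full-size segment, ICMP allowed:", send(1500, path))
```

The model reproduces the point made above: with ICMP filtered, small packets still get through and only full-size ones vanish, which is why a classic path MTU problem usually shows up after the connection has opened rather than as missing requests.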
MTU on the user end shouldn't really be an issue here. If it were (and I am only assuming this), how could they access other sites like yahoo.com? I am sure your web site is no different from other common ones.

Linksys routers have various issues. The best bet is to go after the firmware and make sure it's up to date. And yet they have no problems accessing other sites? Hmm.

This is probably not the cause of the issue, but just in case: you may want to check that your server does not have ECN enabled. I've seen some firewalls and Internet-sharing devices misbehave when connecting to an ECN-enabled server. Again, this is probably not it, but it's one of the things to try if you run out of other clues.

-hc
On Tue, Jan 21, 2003 at 08:06:07PM -0500, hc wrote:
MTU on user-end shouldn't really be an issue here.. B/c if so, then (I am only assuming this) how could they access other sites like yahoo.com, etc? I am sure your web site is no different than other common ones.
Well, you're forgetting that odd things tend to happen if the MTU on one side of the connection doesn't agree with the other side of the link. (MTU is not a function of the transmitter or the receiver but, rather, a function of the link you're operating on.) We all know what kind of screwy things can happen when they disagree. (Try holding a BGP link up over a connection where they differ... t'ain't easy.)

One other possibility here: I once had to deal with a problem where a particular link would receive and send data, apparently, just fine. There were errors counting up on one end, yes, but slowly enough that it didn't seem to indicate a real source of a problem. Well, DNS queries would go out just fine but the responses never made it back over this link. Once connected via IP address instead of domain name, everything seemed to progress just fine. After much head scratching and a complete visual inspection of every device in the circuit, it turned out that two pieces of gear in the middle were misoptioned. Very weird, but it did happen.

Not that I suspect that to be the problem here (I'm firmly in the MTU court on this one until evidence shows otherwise), but it is a possibility.

-Wayne
We used to have that problem here. A big customer of ours runs many GRE tunnels. The problem seemed to be that they were blocking ICMP, so MTU variations along the path could not be learned, which made the far end unreachable (we actually saw the packets just before they entered the tunnel).

Try this: ping with different packet sizes and you will find this problem. The solution was to allow the ICMP "dunr" (destination unreachable) type packets.
-- Miguel Mata-Cardona Intercom El Salvador mmata@sv.intercomnet.net
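Pinging with different packet sizes can be turned into a binary search for the largest size that gets through. The sketch below is in Python and assumes a hypothetical `probe(size)` callable that returns True when a DF-set packet of that size is answered; here it is replaced by a stand-in that models a GRE tunnel over a 1500-byte link (1500 minus 24 bytes of IPv4 and basic GRE headers). It also assumes that if a size passes, every smaller size passes too.

```python
def largest_passing_size(probe, low=68, high=1500):
    """Largest packet size in [low, high] for which probe(size) is True.

    Assumes probe is monotonic: if one size passes, all smaller sizes pass.
    The default low of 68 is the minimum MTU every IPv4 link must support.
    """
    if not probe(low):
        raise RuntimeError("even the smallest probe failed; not an MTU problem")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the range always shrinks
        if probe(mid):
            low = mid        # mid passes: the answer is mid or larger
        else:
            high = mid - 1   # mid fails: the answer is smaller than mid
    return low


if __name__ == "__main__":
    GRE_PATH_MTU = 1500 - 24  # stand-in: 20-byte IPv4 + 4-byte basic GRE header

    def fake_probe(size):
        # Placeholder for a real DF-set ping; passes only up to the tunnel MTU.
        return size <= GRE_PATH_MTU

    print("largest size that passes:", largest_passing_size(fake_probe))
```

A result below 1500 with no ICMP error ever reaching the sender would be consistent with the filtered destination-unreachable messages described above.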
Sounds similar to my problem with the Linksys cable/DSL router. My problem was that it would work perfectly with NAT enabled, but the minute I turned NAT off, I couldn't get to a lot of sites. I tried a number of firmwares. I even tried to get support from Linksys. But, after a week without any returned phone calls, I returned the unit. I do know that there is a working firmware for this configuration, but there is no information that I could find on down-reving the unit. My solution was to get rid of the POS and use one of my Linux servers to do the pppoe. Thanks, Dennis
participants (12)
- Deepak Jain
- Dennis Boylan
- fkittred@gwi.net
- hc
- jeffrey.arnold
- Mark J. Scheller
- Marshall Eubanks
- Miguel Mata-Cardona
- Stephen J. Wilcox
- Steve Gibbard
- Wayne E. Bouchard
- William Warren