On Thursday, 2001/08/02 at 00:49 AST, Valdis.Kletnieks@vt.edu wrote:
> On Wed, 01 Aug 2001 16:26:55 PDT, Tony Rall <trall@almaden.ibm.com> said:
> > echo). This probably makes PMTUD work a lot better, but it sucks for ICMP
> Or totally horques it up entirely if the actual data path used has a different PMTU. No way this will work if 9 paths are clean and one requires a frag. ;)
> I won't discuss what to do if you get back 10 FRAG NEEDED packets, with differing frag sizes ;)
That's not the way it works. You've got a load-balancing system (LBS) front-ending (using a single IP address) a cluster of 10 web servers. A client on the other side of the LBS initiates an 80/tcp connection. The LBS directs it to (let's say) server 6. Once data starts flowing on the connection, nothing interesting happens until server 6 sends a large packet to the client with (as on all of its packets) the don't-fragment flag on. That packet reaches a link with a smaller MTU. The router on that link returns (to the server complex) an ICMP unreachable, fragmentation needed packet (type 3, code 4).

That ICMP reaches the LBS; it has to decide what to do with it. Some LBSs will just discard any ICMP packets addressed to the cluster. The one used by MS instead forwards it to all the back-end servers. The servers that don't have a session with the client may just discard the ICMP packet (or they may simply update info in their routing table; I know that AIX does that). The server that does have a session with the client will repackage its data packet (per the newly learned MTU) and send the smaller packet.

The path from the chosen server to the client is no more ambiguous than in any other PMTUD situation. Which is to say that, yes, the path could change from packet to packet, but that isn't brought on by the presence of the LBS; it's just a shortcoming of the PMTUD mechanism. In fact, outbound traffic from the clustered servers often doesn't even go through the LBS.

(Note that if the client is also using PMTUD and happens to send a large enough packet to trigger it, the only ICMP unreachable sent would be towards the client. Even if the link MTU causing the unreachable is on the server side of the LBS, there will be only one unreachable sent - no ambiguity at all.)

(Also note that it isn't necessary for an LBS to forward all ICMPs to make PMTUD work. It just has to forward the unreachable, fragmentation needed packets. And it doesn't have to forward those to all the back-end servers. There is enough info in the unreachable message to determine which connection this ICMP message relates to - the 80/tcp connection between the client and server 6. So the LBS could know that this ICMP only has to be forwarded to server 6. I don't know of any LBS that is smart enough to handle it this way.)

Tony Rall
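The last point above - that the unreachable message itself carries enough information to identify the connection - can be made concrete. The following is a minimal sketch, not any real LBS's behavior: the addresses, ports, table layout, and function names are hypothetical. A fragmentation-needed message embeds the IP header of the too-big packet plus the first 8 bytes of its TCP header, which is enough to recover both port numbers and look up the one back-end server that owns the connection.

import struct

# Hypothetical connection table as a load balancer might keep it, keyed
# (client_ip, client_port, cluster_vip, vip_port) -> back-end server.
# The addresses, ports, and server name below are made up for illustration.
conn_table = {
    ("198.51.100.7", 34512, "203.0.113.10", 80): "server 6",
}

def server_for_frag_needed(icmp_msg: bytes):
    """Return (server, next_hop_mtu) for an ICMP type 3, code 4 message,
    using only the original headers embedded in the ICMP payload."""
    icmp_type, icmp_code = icmp_msg[0], icmp_msg[1]
    if icmp_type != 3 or icmp_code != 4:
        return None, None  # not a fragmentation-needed message

    # Bytes 6-7 of the ICMP header carry the next-hop MTU (RFC 1191).
    next_hop_mtu = struct.unpack("!H", icmp_msg[6:8])[0]

    # The payload holds the IP header of the original outbound packet plus
    # at least the first 8 bytes of its TCP header (src port, dst port, seq).
    orig = icmp_msg[8:]
    ihl = (orig[0] & 0x0F) * 4                        # embedded IP header length
    src_ip = ".".join(str(b) for b in orig[12:16])    # cluster address (server side)
    dst_ip = ".".join(str(b) for b in orig[16:20])    # client address
    src_port, dst_port = struct.unpack("!HH", orig[ihl:ihl + 4])

    # The embedded packet went server -> client, so reverse it to get the
    # connection key as the LBS recorded it (client side first).
    key = (dst_ip, dst_port, src_ip, src_port)
    return conn_table.get(key), next_hop_mtu

With something like this, the LBS could forward the ICMP to "server 6" alone instead of to every back-end machine. The tuple is reversed because the embedded headers describe the server-to-client packet, while the connection was recorded from the client's side.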
The Windows Load Balancing Service doesn't use a front-end/back-end approach, since that has obvious scale limitations. Last I knew, each of those addresses actually pointed at ~ 32 machines, but that may have doubled a couple of times by now. It is supposed to be smart enough to prevent dups, so the fact that you are seeing them either indicates brokenness or, more likely, a cluster in state transition.

-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Tony Rall
Sent: Thursday, August 02, 2001 7:25 AM
To: Valdis.Kletnieks@vt.edu
Cc: nanog@merit.edu
Subject: Re: MicroSoft amplification?
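For context on how a cluster with no front end can still hand each flow to exactly one machine, here is a minimal sketch of the general idea of distributed filtering: every node sees every inbound packet, applies the same deterministic hash to the client address and port, and only the computed owner accepts it. This illustrates the concept only and is not WLBS's actual algorithm; the node count echoes the "~ 32 machines" figure above, and the hashing scheme is an assumption.

import hashlib

NODES = 32  # "~ 32 machines" per the message above; purely illustrative

def owning_node(client_ip: str, client_port: int, node_count: int = NODES) -> int:
    """Map a client flow to exactly one node index via a deterministic hash."""
    digest = hashlib.sha256(f"{client_ip}:{client_port}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % node_count

def should_accept(my_node_id: int, client_ip: str, client_port: int) -> bool:
    # Every node runs the same filter independently on every packet; only
    # the owner keeps it, the rest silently drop it, so no duplicate
    # replies are generated - provided all nodes agree on the membership.
    return owning_node(client_ip, client_port) == my_node_id

Duplicates appear exactly when nodes disagree on the membership or hash inputs, which is why a cluster in state transition is the likely explanation offered above.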
participants (2)
- Tony Hain
- Tony Rall