Hey folks, I'm sure to you it's peanuts, but I'm a bit puzzled (most likely because of a lack of knowledge, I bet). I'm buying an IP backbone from VNZ (presumably MPLS). I get an MLPPP hand-off at all sites, so I don't do the actual labeling and switching; for practical purposes, I have no physical control over the other side of my MLPPP links.

When I transfer a large file over FTP (or CIFS, or anything else), I'd expect it to max out either one or both T1s, but instead utilization on the T1s is hovering at 70% on both, and sometimes MLPPP link utilization even drops below 50%. What am I not getting here?

Tx, Andrey

Below is a snip of my config.

controller T1 0/0/0
 cablelength long 0db
 channel-group 1 timeslots 1-24
!
controller T1 0/0/1
 cablelength long 0db
 channel-group 1 timeslots 1-24
!
ip nbar custom rdesktop tcp 3389
ip cef
!
class-map match-any VoIP
 match dscp ef
class-map match-any interactive
 match protocol rdesktop
 match protocol telnet
 match protocol ssh
!
policy-map QWAS
 class VoIP
  priority 100
 class interactive
  bandwidth 500
 class class-default
  fair-queue 4096
!
interface Multilink1
 description Verizon Business MPLS Circuit
 ip address x.x.x.150 255.255.255.252
 ip flow ingress
 ip nat inside
 ip virtual-reassembly
 load-interval 30
 no peer neighbor-route
 ppp chap hostname R1
 ppp multilink
 ppp multilink links minimum 1
 ppp multilink group 1
 ppp multilink fragment disable
 service-policy output QWAS
!
interface Serial0/0/0:1
 no ip address
 ip flow ingress
 encapsulation ppp
 load-interval 30
 fair-queue 4096 256 0
 ppp chap hostname R1
 ppp multilink
 ppp multilink group 1
!
interface Serial0/0/1:1
 no ip address
 ip flow ingress
 encapsulation ppp
 load-interval 30
 fair-queue 4096 256 0
 ppp chap hostname R1
 ppp multilink
 ppp multilink group 1

-----
Andrey Gordon [andrey.gordon@gmail.com]
Andrey Gordon wrote:
Hey folks, I'm sure to you it's peanuts, but I'm a bit puzzled (most likely because of a lack of knowledge, I bet).
I'm buying an IP backbone from VNZ (presumably MPLS). I get an MLPPP hand-off at all sites, so I don't do the actual labeling and switching; for practical purposes, I have no physical control over the other side of my MLPPP links.
When I transfer a large file over FTP (or CIFS, or anything else), I'd expect it to max out either one or both T1s, but instead utilization on the T1s is hovering at 70% on both, and sometimes MLPPP link utilization even drops below 50%. What am I not getting here?
I seem to be in a similar situation to yours (but with AT&T). I have not noticed any unexpected missing bandwidth. I don't see any specific problems with your config, but I'll include mine in hopes it will be useful:

controller T1 0/0/0
 framing esf
 linecode b8zs
 channel-group 0 timeslots 1-24
!
controller T1 0/0/1
 framing esf
 linecode b8zs
 channel-group 0 timeslots 1-24
!
class-map match-any imaging
 match access-group 112
class-map match-any rdp
 match access-group 113
class-map match-any voice
 match ip dscp ef
 match access-group 110
!
policy-map private_wan
 class voice
  priority percent 60
  set ip dscp ef
 class rdp
  bandwidth percent 32
  set ip dscp af31
 class imaging
  bandwidth percent 4
  set ip dscp af21
 class class-default
  bandwidth percent 4
  set ip dscp default
!
interface Multilink1
 ip address x.x.x.38 255.255.255.252
 no keepalive
 no cdp enable
 ppp chap hostname xxxxxxx
 ppp multilink
 ppp multilink fragment disable
 ppp multilink group 1
 max-reserved-bandwidth 100
 service-policy output private_wan
!
interface Serial0/0/0:0
 no ip address
 encapsulation ppp
 no cdp enable
 ppp chap hostname xxxxxxxxxx
 ppp multilink
 ppp multilink group 1
 max-reserved-bandwidth 100
!
interface Serial0/0/1:0
 no ip address
 encapsulation ppp
 no cdp enable
 ppp chap hostname xxxxxxxxxx
 ppp multilink
 ppp multilink group 1
 max-reserved-bandwidth 100
!
access-list 110 permit ip any 10.0.0.0 0.0.255.255
access-list 110 permit ip 10.0.0.0 0.0.255.255 any
access-list 110 permit icmp any any
access-list 112 permit ip any host x.x.x.x
access-list 113 permit ip any host x.x.x.x
Gents, On Mon, May 11, 2009 at 10:54 AM, Dan White <dwhite@olp.net> wrote:
Andrey Gordon wrote:
[snip]
When I transfer a large file over FTP (or CIFS, or anything else), I'd expect it to max out either one or both T1s, but instead utilization on the T1s is hovering at 70% on both, and sometimes MLPPP link utilization even drops below 50%. What am I not getting here?
Sounds like the TCP window is either set 'small', or TCP window scaling isn't enabled or isn't scaling to your bandwidth/delay product (for the hosts in question). Since FTP is a 'stream'-based transport of file data (like HTTP), you should see it scale to most or nearly all of your link capacity (assuming TCP isn't your issue).

Additionally, when using CIFS, SMB, TFTP, NFS, and other command->acknowledgment style protocols over wide-area links (which aren't stream-based operations, but rather iterative operations on blocks or parts of a file), you will likely never observe a single transfer filling up the links.

-Tk
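A back-of-the-envelope check of that point; this is only a sketch, and the RTT used here is an assumption rather than a value measured on this circuit:

# Rough bandwidth-delay product check for a 2xT1 MLPPP bundle.
# The RTT below is an assumption -- substitute the RTT you actually
# measure across the provider cloud.

bundle_bps = 2 * 1_536_000     # two T1s at 1.536 Mbit/s each
rtt_s = 0.070                  # assumed 70 ms round-trip time

bdp_bytes = bundle_bps / 8 * rtt_s
print(f"Bandwidth-delay product: {bdp_bytes:.0f} bytes (~{bdp_bytes / 1024:.0f} KB)")

# A single TCP flow tops out near window/RTT, so a fixed receive
# window caps throughput no matter how fast the bundle is:
for window_kb in (8, 16, 32, 64):
    max_bps = window_kb * 1024 * 8 / rtt_s
    pct = min(100, 100 * max_bps / bundle_bps)
    print(f"{window_kb:>3} KB window -> {max_bps / 1e6:.2f} Mbit/s "
          f"({pct:.0f}% of the bundle)")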
On Mon, May 11, 2009 at 10:37:25AM -0400, Andrey Gordon wrote:
Hey folks, I'm sure to you it's peanuts, but I'm a bit puzzled (most likely because of a lack of knowledge, I bet).
I'm buying an IP backbone from VNZ (presumably MPLS). I get an MLPPP hand-off at all sites, so I don't do the actual labeling and switching; for practical purposes, I have no physical control over the other side of my MLPPP links.
When I transfer a large file over FTP (or CIFS, or anything else), I'd expect it to max out either one or both T1s,
Most MLPPP implementations don't hash the flows at the IP layer to an individual MLPPP member link. The bundle is a virtual L3 interface and the packets themselves are distributed over the member links. Some people reference it as a "load balancing" scenario vs. "load sharing", as the traffic is given to the link that isn't currently "busy".

but instead utilization on the T1s is hovering at 70% on both, and sometimes MLPPP link utilization even drops below 50%. What am I not getting here?
If you have multilink fragmentation disabled, it sends a packet down each path. It could be reordering delay causing just enough variance in the packet stream that the application throttles back. If you have a bunch of individual streams going, you would probably see higher throughput. Remember there is additional overhead for MLPPP.

Rodney
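To put rough numbers on that encapsulation overhead for full-size packets, here is a minimal sketch; the per-frame header sizes are assumed typical values, not figures taken from this circuit, and the cost matters far more for small packets than for 1500-byte ones:

# Rough effective-throughput estimate for an MLPPP bundle.
T1_BPS = 1_536_000
LINKS = 2
IP_PACKET = 1500               # bytes handed to the bundle
PPP_FRAMING = 6                # assumed: address/control/protocol fields + FCS
MLP_HEADER = 4                 # assumed: multilink header, long sequence format

wire_bytes = IP_PACKET + PPP_FRAMING + MLP_HEADER
efficiency = IP_PACKET / wire_bytes
print(f"Per-packet efficiency: {efficiency:.1%}")
print(f"Best-case IP throughput across the bundle: "
      f"{LINKS * T1_BPS * efficiency / 1e6:.2f} Mbit/s")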
Tx, Andrey
Below is a snip of my config.
[snip]
----- Andrey Gordon [andrey.gordon@gmail.com]
I would also think the problem is flow control not allowing the maximum bandwidth. Trying multiple FTP streams and seeing if that would max it out would help. I would think you would want to add WRED to the class-default entry to prevent global TCP synchronization:

class class-default
 fair-queue 4096
 random-detect dscp-based

----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
http://www.ox.com  | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139
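A quick way to gauge the multiple-streams test before running it; this is only a sketch, and both the per-host receive window and the RTT below are assumptions, so plug in real values for the hosts involved:

# If a single transfer is window-limited, parallel streams should
# still be able to fill the bundle. Estimate how many, assuming
# each flow is capped at roughly window/RTT.
import math

bundle_bps = 2 * 1_536_000
window_bytes = 17_520          # assumed per-host receive window
rtt_s = 0.070                  # assumed round-trip time

per_flow_bps = window_bytes * 8 / rtt_s
print(f"Per-flow ceiling: {per_flow_bps / 1e6:.2f} Mbit/s")
print(f"Parallel streams needed to fill the bundle: "
      f"{math.ceil(bundle_bps / per_flow_bps)}")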
To address the concerns about the overhead (FTP is still transferring that file):

core.bvzn#sh proc cpu hist
core.bvzn   12:44:07 PM Monday May 11 2009 EST
[CPU% per second / per minute / per hour history graphs snipped; per-second and per-minute CPU% are in the low single digits]

core.bvzn#sh inv
NAME: "2821 chassis", DESCR: "2821 chassis"
<snip>

Serial0/0/0:1 is up, line protocol is up
  Hardware is GT96K Serial
  Description:
  MTU 1500 bytes, BW 1536 Kbit/sec, DLY 20000 usec,
     reliability 255/255, txload 149/255, rxload 15/255
  Encapsulation PPP, LCP Open, multilink Open
  Link is a member of Multilink bundle Multilink1, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 14w0d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair [suspended, using FIFO]
  FIFO output queue 0/40, 0 drops
  30 second input rate 93000 bits/sec, 86 packets/sec
  30 second output rate 899000 bits/sec, 122 packets/sec
     105433994 packets input, 3520749026 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     155813204 packets output, 1174780375 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     1 carrier transitions
  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags

Serial0/0/1:1 is up, line protocol is up
  Hardware is GT96K Serial
  Description:
  MTU 1500 bytes, BW 1536 Kbit/sec, DLY 20000 usec,
     reliability 255/255, txload 149/255, rxload 15/255
  Encapsulation PPP, LCP Open, multilink Open
  Link is a member of Multilink bundle Multilink1, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 14w0d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair [suspended, using FIFO]
  FIFO output queue 0/40, 0 drops
  30 second input rate 94000 bits/sec, 86 packets/sec
  30 second output rate 898000 bits/sec, 122 packets/sec
     105441924 packets input, 3518841511 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     155734625 packets output, 1156759105 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     1 carrier transitions
  Timeslot(s) Used:1-24, SCC: 1, Transmitter delay is 0 flags

Multilink1 is up, line protocol is up
  Hardware is multilink group interface
  Description: Verizon Business MPLS Circuit
  Internet address is x.x.x.150/30
  MTU 1500 bytes, BW 3072 Kbit/sec, DLY 100000 usec,
     reliability 255/255, txload 148/255, rxload 14/255
  Encapsulation PPP, LCP Open, multilink Open
  Listen: CDPCP
  Open: IPCP, loopback not set
  Keepalive set (10 sec)
  DTR is pulsed for 2 seconds on reset
  Last input 00:00:00, output never, output hang never
  Last clearing of "show interface" counters 14w0d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 252140
  Queueing strategy: Class-based queueing
  Output queue: 3/1000/0 (size/max total/drops)
  30 second input rate 179000 bits/sec, 172 packets/sec
  30 second output rate 1795000 bits/sec, 243 packets/sec
     207501114 packets input, 1445648459 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     42 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     307484312 packets output, 2277871516 bytes, 0 underruns
     0 output errors, 0 collisions, 3 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions

I also ran 6 flows worth of iperf between a server at the site and my laptop while the transfer was running (iperf -i 2 -P 6 -t 120 -c 10.1.150.4), in the same direction.

core.bvzn#sh policy-map int mu1
 Multilink1

  Service-policy output: QWAS

    queue stats for all priority classes:
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0

    Class-map: VoIP (match-any)
      0 packets, 0 bytes
      30 second offered rate 0 bps, drop rate 0 bps
      Match: dscp ef (46)
        0 packets, 0 bytes
        30 second rate 0 bps
      Priority: 100 kbps, burst bytes 2500, b/w exceed drops: 0

    Class-map: interactive (match-any)
      31490239 packets, 14882494949 bytes
      30 second offered rate 4000 bps, drop rate 0 bps
      Match: protocol rdesktop
        10981329 packets, 1277510597 bytes
        30 second rate 3000 bps
      Match: protocol telnet
        1104192 packets, 183832229 bytes
        30 second rate 0 bps
      Match: protocol ssh
        9263601 packets, 11659456657 bytes
        30 second rate 0 bps
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/1103/0
      (pkts output/bytes output) 31489136/14887505365
      bandwidth 500 kbps

    Class-map: class-default (match-any)
      275000011 packets, 120951145536 bytes
      30 second offered rate 1494000 bps, drop rate 0 bps
      Match: any
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops/flowdrops) 0/251092/0/251092
      (pkts output/bytes output) 276085337/122442704318
      Fair-queue: per-flow queue limit 16
core.bvzn#

-----
Andrey Gordon [andrey.gordon@gmail.com]
It could very well be microbursts in the flow creating congestion, as seen in the default class:
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 252140
30 second output rate 1795000 bits/sec, 243 packets/sec
Class-map: class-default (match-any)
  275000011 packets, 120951145536 bytes
  30 second offered rate 1494000 bps, drop rate 0 bps
  Match: any
  Queueing
  queue limit 64 packets
  (queue depth/total drops/no-buffer drops/flowdrops) 0/251092/0/251092
  (pkts output/bytes output) 276085337/122442704318
  Fair-queue: per-flow queue limit 16
The drops mostly match the default class. I don't recall whether the per-flow queue limit kicks in without congestion or not. You could try a few things to see if one of them improves it:

a) remove WFQ in the default class
b) add a bandwidth statement to it to allocate a dedicated amount
c) implement WRED in the class

BTW, the overhead I was referring to was the additional MLPPP overhead on each packet, which reduces effective throughput.

Rodney
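For context, some rough arithmetic from the counters in the earlier sh policy-map output; the 1500-byte packet size used for the time estimate is an assumption:

# Numbers pulled from the class-default counters above.
class_default_pkts = 276_085_337
class_default_drops = 251_092
print(f"class-default drop rate: "
      f"{100 * class_default_drops / class_default_pkts:.3f}%")

# How much buffering a 16-packet per-flow queue represents at the
# bundle rate, assuming full-size packets:
bundle_bps = 3_072_000
queue_bits = 16 * 1500 * 8
print(f"16-packet flow queue drains in ~{1000 * queue_bits / bundle_bps:.0f} ms "
      f"at {bundle_bps / 1e6:.1f} Mbit/s")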
participants (5)
- Andrey Gordon
- Anton Kapela
- Dan White
- Matthew Huff
- Rodney Dunn