Jon, I saw something like this once, and it was a router bug. Your ping trace seems to be saying that after the first ping something gets stuck, and ten more pings accumulate in a queue somewhere until a ten-second timer expires, or maybe ten messages in a queue is a maximum that triggers some action. Then the queued-up pings are all released at once and after that things run normally. Very strange, but I'll bet it's some peculiarity deep in the guts of whatever box you're pinging. Maybe a process stops running and gets restarted by an inactivity timer. The ISP might be able to find something in a log file or system trace that would be a clue to someone familiar with the router's internals. Did they just have an equipment change or a software upgrade? It doesn't feel like a link outage or quality problem because it always starts after the first ping and lasts for exactly ten more. I also wonder if the ping anomaly is even the same problem as the one that's causing the poor performance you're observing on your benchmarks. Cheers, and good luck! John Renwick
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu]On Behalf Of Jon.Kibler@aset.com Sent: Tuesday, March 30, 2004 6:08 AM To: nanog@merit.edu Subject: DSL and/or Routing Problems
Greetings NANOGers,
Yesterday we starting noticing long delays on an ADSL connection. I spent most of the day trying to track down the problem and getting no where. Telco says they do not detect any problem on the line... so I am kind of lost. Anyone here have any ideas? Here are the specifics:
This connection uses a Cisco 827 ADSL router and has several static IPs. All IPs show identical delays. Using other circuits between the same two locations, we do not see any delays.
Normally on this DSL connection, local can ping remote with packet transit times around 60-70ms. Here is what we are seeing now:
# ping -s SOMEHOST 68 25; sleep 1; ping -s SOMEHOST 68 25 PING SOMEHOST: 68 data bytes 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=0. time=105. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=1. time=9132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=2. time=8132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=3. time=7132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=4. time=6132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=5. time=5133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=6. time=4133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=7. time=3133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=8. time=2133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=9. time=1133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=10. time=133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=11. time=104. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=12. time=110. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=13. time=109. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=14. time=112. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=15. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=16. time=114. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=17. time=107. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=18. time=109. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=19. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=20. time=112. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=21. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=22. time=108. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=23. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=24. time=110. ms
----SOMEHOST PING Statistics---- 25 packets transmitted, 25 packets received, 0% packet loss round-trip (ms) min/avg/max = 104/1918/9132 PING SOMEHOST: 68 data bytes 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=0. time=112. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=1. time=9131. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=2. time=8132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=3. time=7132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=4. time=6132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=5. time=5132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=6. time=4133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=7. time=3132. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=8. time=2133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=9. time=1133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=10. time=133. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=11. time=111. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=12. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=13. time=109. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=14. time=116. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=15. time=108. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=16. time=107. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=17. time=113. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=18. time=106. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=19. time=107. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=20. time=108. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=21. time=108. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=22. time=105. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=23. time=109. ms 76 bytes from SOMEHOST (w.x.y.z): icmp_seq=24. time=106. ms
----SOMEHOST PING Statistics---- 25 packets transmitted, 25 packets received, 0% packet loss round-trip (ms) min/avg/max = 105/1918/9131
What really has me bugged is the pattern shown by the first dozen packets... why the relatively quick first time, followed by a long but decreasing delay which repeats every time you restart the ping (that's why I provided 2 samples)?
Despite the fact that Telco says there are not any line problems, we are seeing a change in DSL performance compared to our benchmark. When we first started noticing the problem yesterday, both in and out connections were using the Fast path, but compared to the benchmark, the inbound speed had dropped to 576 and the Capacity had jumped to 99%, plus we had some RS and CRC errors on both in and out connections. Later in the day, the connection switched from using the Fast path to the Interleave path (we did nothing on our end to cause this to change) and the performance settled down to what is shown below under "DSL NOW."
DSL BENCHMARK: ============== ATU-R (DS) ATU-C (US) Capacity Used: 72% 21%
Interleave Fast Interleave Fast Speed (kbps): 0 960 0 256 Reed-Solomon EC: 0 0 0 0 CRC Errors: 0 0 0 0 Header Errors: 0 0 0 0 Bit Errors: 0 0 BER Valid sec: 0 0 BER Invalid sec: 0 0
DSL NOW: ======== ATU-R (DS) ATU-C (US) Capacity Used: 94% 63%
Interleave Fast Interleave Fast Speed (kbps): 736 0 256 0 Reed-Solomon EC: 99 0 4 0 CRC Errors: 4 0 1 0 Header Errors: 3 0 0 0 Bit Errors: 0 0 BER Valid sec: 0 0 BER Invalid sec: 0 0
This leaves me with a few questions... 1) What exactly is the difference between the Fast and Interleave paths? 2) At the same data rates, should both Fast and Interleave show similar throughput? 3) What would cause a switch from Fast to Interleave to occur? 4) If the Telco cannot find any line problems, what can account for the increased capacity use, the decreased data rates, and the data errors?
And the bottom line is: Why the bizarre ping times, and could the changes observed in DSL performance account for that... plus the near doubling of average ping transit times?
TIA for everyone's expert thoughts on this problem!
Jon R. Kibler A.S.E.T., Inc. Charleston, SC USA (843) 849-8214
================================================== Filtered by: TRUSTEM.COM's Email Filtering Service http://www.trustem.com/ No Spam. No Viruses. Just Good Clean Email.