Re: Network Connectivity... Dealing with Providers
Kuechel, Mark wrote:
> Sounds like you are troubleshooting a VoIP issue several networks removed from the actual user. First step is to get into their network via telnet and start from there. Is this a jitter issue on some or all calls? Has the customer done a traffic study on their own LAN to see whether there is some sort of congestion there? Pings from afar are not used to troubleshoot issues in depth; there are lots of postings on this. Has the client's bandwidth utilization to their provider been looked at? Give us more.

Pings and traceroutes weren't the only tests I've done. Here is my capacity when dealing with this client:

When something happens and I need to do some VoIP-related stuff (extension changes, etc.), I mainly log in via SSH from one of four points: a DSL connection (CTTel), Level3, GBLX, and Verio. When my lab's CTTel DSL connection fails I jump on a DS3 (GBLX); when that fails, I jump on to a machine in Texas, and most of the time one of them is going to let me in. Now, I have had failures from two points to all points at sporadic times. So I do the obvious traceroutes, pings, etc. Now a provider can be quick to tell me "check your line" but come on now... 4 different lines are failing to connect here. (This doesn't include the fact that if I can't get in... what makes you think voice data is getting in?)

So, for my testing, I'm doing a functional (it's fugly) test from all four locations to my client, and from my client to all four. My data is going to be a collection of ping tests, traceroute tests (tcptraceroute), bing tests, etc. I was hoping to get feedback on other tools... I have Radarping as well but don't feel like using it. I want to be able to leave something running 24x7 until Friday. I'd like for it to be open source so the provider doesn't cry "your network voodoo tools don't count!". I want to be able to go back and say "Listen, these tools are industry-standard tools from CAIDA (or elsewhere), and they're used by engineers all across the board. I've done a fair test and it's obviously coming from your network."

So to answer your bandwidth question, bandwidth (according to the provider) is under 50% capacity with "sporadic spikes", as their engineers have seen while I was on the phone with them. Sporadic means nothing to me. I have 63% packet loss, which means even if I was equipped with an OC768, the bandwidth means nothing if the packets aren't going through. "Here's your Lamborghini Murcielago, Sir. It does 200mph. Although from time to time you'll only do 126mph..." Internally, I've put QoS maps on the traffic, but with or without them the same errors occur. It's not an issue of echoes; it's more of calls to specific DIDs dropping, not going through, caller can hear - receiver can't. All the while some lines work, others don't. Couple this with my Nagios test going bonkers - I configured Nagios to monitor from my client to Google, Yahoo, MSN and I can see loss from here to the outside world, so it's twofold. Short of my client running me over with his FX45, I'm even running out of patience with my client's provider.

--
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
J. Oquendo
echo @infiltrated|sed 's/^/sil/g;s/$/.net/g'
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1383A743

"How a man plays the game shows something of his character - how he loses shows all" - Mr. Luckey
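For a test that has to run unattended until Friday, the loss figure needs to be pulled out of the ping output and logged rather than eyeballed. What follows is only a minimal sketch of that step, not the poster's actual script: the function names are made up for illustration, the summary-line formats in the comments are the usual Linux iputils and BSD layouts rather than captured data, and it assumes a system `ping` that accepts `-c`.

```python
import re
import subprocess

# Summary line formats this sketch expects (illustrative, not captured data):
#   "25 packets transmitted, 9 received, 64% packet loss, time 24034ms"  (Linux iputils)
#   "25 packets transmitted, 9 packets received, 64.0% packet loss"      (BSD/macOS)
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return (sent, received, loss_percent) from ping output, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def ping_once(host, count=25, timeout=60):
    """Run the system ping against host and return the parsed summary, or None."""
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), host],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_ping_summary(result.stdout)
```

Calling `ping_once()` from cron at each vantage point and appending the tuple with a timestamp to a file would give one comparable loss series per provider path.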
If you have Cisco routers on either end, use the built-in SLA capability. It will give you an ongoing ability to track latency, loss, and jitter. It won't tell you bandwidth, but it will give you a set of metrics for traffic quality. Do a full mesh between all your edge devices and it might help track where in the middle your issues reside. The SLA tools are pretty standard on Cisco devices and so should give you an edge in getting people to listen to you.
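To give a feel for what "full mesh" means in probe count, here is a rough sketch that enumerates every directed source/target pair between edge devices. The device names are hypothetical placeholders, and the code only lists the pairs; it does not configure anything on a router.

```python
from itertools import permutations


def full_mesh(devices):
    """Return every ordered (source, target) pair between distinct devices."""
    return list(permutations(devices, 2))


# Hypothetical edge device names, for illustration only.
edges = ["edge-a", "edge-b", "edge-c", "edge-d"]
pairs = full_mesh(edges)
for source, target in pairs:
    print(f"{source} -> {target}")
print(len(pairs), "directed probes for", len(edges), "devices")
```

Pairs are kept in both directions because loss and jitter can differ per direction, so n devices need n * (n - 1) probes.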
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of J. Oquendo Sent: Wednesday, November 15, 2006 16:20 To: Kuechel, Mark Cc: nanog@merit.edu Subject: Re: Network Connectivity... Dealing with Providers
-- Scanned for viruses and dangerous content at http://www.oneunified.net and is believed to be clean.
Ray,

Do you have an example of accessing the SLA data via SNMP? I've just gotten interested in those things. I've found the OIDs required, but it's all a bit of a maze... I could really use some jitter information in a couple of places right about now...

Neal
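One way to get a first look at that maze is to walk the whole CISCO-RTTMON-MIB subtree and read the rows into a dictionary. The sketch below assumes the net-snmp `snmpwalk` command is installed and that the router answers SNMP v2c; the helper names are invented for illustration, and the specific jitter columns are deliberately not hard-coded here because they should be looked up in the MIB for the IOS release in use.

```python
import subprocess

# Root of CISCO-RTTMON-MIB (ciscoRttMonMIB). The jitter columns sit somewhere
# underneath it; look the exact OIDs up in the MIB rather than trusting this sketch.
RTTMON_ROOT = "1.3.6.1.4.1.9.9.42"


def parse_snmpwalk(text):
    """Parse `snmpwalk -On` lines of the form '.1.2.3 = TYPE: value' into a dict."""
    rows = {}
    for line in text.splitlines():
        oid, sep, rest = line.partition(" = ")
        if not sep:
            continue  # skip continuation or error lines
        _, colon, value = rest.partition(": ")
        rows[oid.strip()] = value.strip() if colon else rest.strip()
    return rows


def walk_rttmon(host, community):
    """Walk the RTTMON subtree on host with net-snmp's snmpwalk (SNMP v2c)."""
    cmd = ["snmpwalk", "-v2c", "-c", community, "-On", host, RTTMON_ROOT]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_snmpwalk(result.stdout)
```

Values come back as strings; which OID suffix corresponds to which probe and metric still has to be matched against the MIB definitions.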
I've been using Cricket along with GenDevConfig_2_0 from http://www.acktomic.com/cricket/cricket-genRtrConfig.htm to collect and plot Cisco SLA status. I have had to make some changes to their scripts to accept some of Cisco's recent changes. I can get the changes posted in the next day or two.

> -----Original Message-----
> From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of nealr
> Sent: Wednesday, November 15, 2006 17:58
> To: nanog@merit.edu
> Subject: Cisco SLA data access via SNMP?
nealr wrote:
> Ray,
>
> Do you have an example of accessing the SLA data via SNMP? I've just got interested in those things, I've found the OIDs required, but it's all a bit of a maze ... I could really use some jitter information in a couple of places right about now ...
>
> Neal

I seem to remember the thread http://forums.cacti.net/about4136-0-asc-30.html as being useful if you use Cacti.
A number of people have asked how I did the Cricket/SLA thing. I have a description of the configuration at:
http://www.oneunified.net/blog/OpenSource/Debian/Monitoring/Cricket/installandconfig.article

On one of the systems I'm getting a Cricket error of:

  "illegal attempt to update using time 1163791808 when last update time is 1163791808 (minimum one second step)"

I'm not sure if it affects other systems. I have to check. Anyway, once I get this thing fixed, I think everything should be good to go. Let me know if you have similar problems.

Ray
http://www.oneunified.net/blog/
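The error text itself says the rejected update carried the same timestamp as the previous one, which is the RRD rule that each update must be at least one second newer than the last. Why two updates landed in the same second on that system is not known from this thread; the sketch below only illustrates the constraint with a hypothetical guard that a collector could apply before writing, using the timestamp from the error message.

```python
class UpdateGuard:
    """Reject samples whose timestamp does not advance by at least `step` seconds."""

    def __init__(self, step=1):
        self.step = step
        self.last = None  # timestamp of the last accepted update

    def accept(self, timestamp):
        if self.last is not None and timestamp - self.last < self.step:
            return False
        self.last = timestamp
        return True


guard = UpdateGuard()
for ts in (1163791807, 1163791808, 1163791808, 1163791809):
    action = "update" if guard.accept(ts) else "skip (not newer than last update)"
    print(ts, action)
```

Skipping the duplicate hides the symptom only; if the same target is being collected twice per run, that is the thing to track down.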
Thanks for all the responses. I wish I had Cisco on both ends; I would have configured auto-qos, but I'm stuck on Adtran (client) and, I believe, Juniper (provider). Anyhow, for those who enquired, this is what I am currently doing for my connectivity testing:

(M = my connections, C = client)

M(GBLX)  --> tcptraceroute && iperf && ping --> Client
M(LVLT)  --> same as above --> Client
M(DSL)   --> same as above --> Client
M(Verio) --> same as above --> Client

C --> bing (Google, MSN, *PROVIDER*) && tcptraceroute --> M(GBLX)
C --> tcptraceroute --> M(LVLT)
C --> tcptraceroute --> M(DSL)
C --> tcptraceroute --> M(Verio)

So far I have come across the following oddity I can't put my finger on:

# bing -P -D -c 25 -e 3 xxx.xxx.1.177 xxx.xxx.1.182
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36
bing: packet (136 bytes) from unexpected host xxx.xxx.24.36
bing: packet (136 bytes) from unexpected host xxx.xxx.24.36
bing: packet (136 bytes) from unexpected host xxx.xxx.24.36
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36
bing: packet (72 bytes) from unexpected host xxx.xxx.24.36

See a problem? xxx.xxx.24.36 is the provider's router two hops before the CPE. I'm thinking filtering, maybe? I have no idea why xxx.xxx.24.36 is getting in the mix of my packets. I have this scenario running every 15 minutes from all locations.

--
====================================================
J. Oquendo
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1383A743
sil . infiltrated @ net
http://www.infiltrated.net

The happiness of society is the end of government.
John Adams
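With the scenario repeating every 15 minutes, the "unexpected host" warnings are easier to present to a provider as counts per host and packet size than as raw logs. This is a small sketch of that tally, assuming the warning lines keep exactly the wording shown above; the function name is made up, and the sample input reuses the masked address from the output in this post.

```python
import re
from collections import Counter

# Matches warning lines worded exactly like the bing output quoted in this post.
UNEXPECTED_RE = re.compile(
    r"^bing: packet \((\d+) bytes\) from unexpected host (\S+)$"
)


def count_unexpected(output):
    """Count bing 'unexpected host' warnings, keyed by (host, packet size)."""
    counts = Counter()
    for line in output.splitlines():
        match = UNEXPECTED_RE.match(line.strip())
        if match:
            counts[(match.group(2), int(match.group(1)))] += 1
    return counts


sample = "\n".join(
    ["bing: packet (72 bytes) from unexpected host xxx.xxx.24.36"] * 3
    + ["bing: packet (136 bytes) from unexpected host xxx.xxx.24.36"] * 3
    + ["bing: packet (72 bytes) from unexpected host xxx.xxx.24.36"] * 3
)
for (host, size), seen in sorted(count_unexpected(sample).items()):
    print(host, size, "bytes:", seen)
```

Keeping the packet size in the key preserves the detail that both of bing's probe sizes are being answered by the intermediate router.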
participants (4)
- J. Oquendo
- nealr
- Ray Burkholder
- Vince