Issues with SIP packets between VZ Fios and NTT
Hi, Over the last 48 hours we have been getting a lot of alerts about customers' phones losing their registrations to us. All the complaints are coming from customers on VZ Fios in the NYC area. Has anyone else seen anything strange going on? TIA. Dovid
While you are diagnosing, might check to make sure that the SIP ALG is disabled on all of their routers too. -- Brielle Bruns The Summit Open Source Development Group http://www.sosdg.org / http://www.ahbl.org
Thought of that. Customers have their own CPEs. So far the only thing in common here is that it's NTT -> VZ. Here is what I found so far looking at two Polycom phones using non-standard ports (i.e. not 5060):
1) PhoneA tries to register multiple extensions, and for each request we send a 401. We expect to get back a REGISTER request with a nonce, but we don't. This happens for a while and then magically it starts working.
2) PhoneB tries to register at the same time as PhoneA and has no issues.
At first I thought it was possibly something with the SIP Call-ID, but I ruled that out since within the same SIP dialog it was not working and then it started. Also, this seems to be per phone: each phone is behind NAT and the traffic is coming from a different NAT'd port. It seems like there is some device in the middle that is randomly dropping traffic on specific sessions.
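The symptom described above (a 401 goes out, but no REGISTER carrying credentials comes back) can be checked mechanically against a registrar-side trace. The following is only a rough sketch: the SipEvent record, its field names, and the 4 second window are assumptions made for illustration, not anything taken from the traces discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class SipEvent:
    """One SIP message seen at the registrar (hypothetical trace record)."""
    time: float             # seconds since start of capture
    direction: str          # "in" = from the phone, "out" = to the phone
    call_id: str
    cseq: int
    kind: str               # "REGISTER" or a status code such as "401"
    has_auth: bool = False  # True if the REGISTER carries an Authorization header


def unanswered_challenges(events, timeout=4.0):
    """Return (call_id, cseq) for every 401 we sent that was not followed,
    within `timeout` seconds, by an authenticated REGISTER on the same Call-ID."""
    missing = []
    for i, ev in enumerate(events):
        if ev.direction != "out" or ev.kind != "401":
            continue
        answered = any(
            later.direction == "in"
            and later.kind == "REGISTER"
            and later.has_auth
            and later.call_id == ev.call_id
            and later.cseq > ev.cseq
            and later.time - ev.time <= timeout
            for later in events[i + 1:]
        )
        if not answered:
            missing.append((ev.call_id, ev.cseq))
    return missing


# Made-up trace: PhoneA keeps retransmitting its first REGISTER (it never
# seems to receive the 401), while PhoneB answers the challenge normally.
events = [
    SipEvent(0.0, "in", "a@phoneA", 1, "REGISTER"),
    SipEvent(0.1, "out", "a@phoneA", 1, "401"),
    SipEvent(0.5, "in", "a@phoneA", 1, "REGISTER"),
    SipEvent(0.6, "out", "a@phoneA", 1, "401"),
    SipEvent(1.0, "in", "b@phoneB", 1, "REGISTER"),
    SipEvent(1.1, "out", "b@phoneB", 1, "401"),
    SipEvent(1.2, "in", "b@phoneB", 2, "REGISTER", has_auth=True),
]
stuck = unanswered_challenges(events)
```

In this made-up trace only PhoneA's Call-ID ends up in `stuck`, which is the per-phone pattern reported above. A real check would parse the events out of a packet capture instead of a hand-written list.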
Are you using TLS-encrypted SIP or just plain ol' cleartext? If it's encrypted, I'd look at whether there is an MTU/MSS issue somewhere along the path. -- Brielle Bruns The Summit Open Source Development Group http://www.sosdg.org / http://www.ahbl.org
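For a sense of scale on the MTU/MSS question, here is a small arithmetic sketch. It assumes plain IPv4 with no IP options, TCP with no options, and a 1500-byte Ethernet MTU. The function names are made up for this example. As a related point, RFC 3261 (section 18.1.1) tells senders to switch to a congestion-controlled transport such as TCP when a request is larger than 1300 bytes and the path MTU is unknown.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_udp_payload(mtu: int = 1500) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def max_tcp_mss(mtu: int = 1500) -> int:
    """Largest TCP segment payload for the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits_unfragmented(sip_message: bytes, mtu: int = 1500) -> bool:
    """True if the SIP message can travel over UDP without IP fragmentation."""
    return len(sip_message) <= max_udp_payload(mtu)
```

A 401 response is normally well under these limits, so a size problem would more likely show up on large messages such as an INVITE with a big SDP body.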
Good ol UDP encrypted.
FYI: More than one person reached out to me off list. The issue is clearly with VZ. Others ran traces and NTT was not in the mix. The only common denominator was SIP 401 packets hitting VZ Fios IPs in the NY area.
This matches my experience with running SIP on networks. Slowly over the years it became more unreliable as “helper” ALGs were in the path. Eventually we moved some devices off 5060 to alleviate the problem. Sent from my iCar
In our case we are not using 5060. The issue seems exclusive to VZ.
Dovid Bender, I'm seeing the same sort of thing: Polycom phones, multiple customers reaching me from Verizon in the NYC area. I'm seeing phones register for a while, then drop off, then I see them trying to re-register, resulting in the 401 you describe. Call me. 212 497 8015. Let's look at this.
Pete
Pete Rohrman Stage2 Support 212 497 8000, Opt. 2
Can someone try to recreate the problem with TCP/5060? Or run an iperf test on equivalent ports with UDP and TCP, to determine whether the problem is specific to UDP. Most networks apply some form of limits even to transit traffic, and UDP is the L4 protocol most typically subject to policers.
-- ++ytti
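If iperf is not available at both ends, the same UDP versus TCP comparison can be roughly approximated with a few lines of Python. This is a sketch only. It assumes you control an echo service on the far side, listening on the port you want to test over both UDP and TCP. The function names and the example host in the comment are placeholders.

```python
import socket


def udp_echo_loss(host: str, port: int, count: int = 20, timeout: float = 0.5) -> float:
    """Send `count` datagrams to a UDP echo service; return the fraction
    that did not come back intact within `timeout` seconds."""
    lost = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = f"probe-{seq}".encode()
            try:
                s.sendto(payload, (host, port))
                data, _ = s.recvfrom(2048)
                if data != payload:
                    lost += 1
            except OSError:  # includes timeouts
                lost += 1
    return lost / count


def tcp_echo_loss(host: str, port: int, count: int = 20, timeout: float = 0.5) -> float:
    """Open `count` short TCP connections to an echo service; return the
    fraction that failed to connect or to echo the payload back."""
    lost = 0
    for seq in range(count):
        payload = f"probe-{seq}".encode()
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.sendall(payload)
                if s.recv(2048) != payload:
                    lost += 1
        except OSError:  # refused, reset, or timed out
            lost += 1
    return lost / count


# Example (placeholder host and port, run from a host behind the affected path):
#   udp_echo_loss("echo.example.net", 5062)
#   tcp_echo_loss("echo.example.net", 5062)
```

Unlike iperf, this measures only whether small probes get through, not throughput. A UDP loss rate clearly higher than the TCP one on the same port would point toward a UDP-specific policer.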
It's not strictly UDP. I spoke with someone yesterday who was reproducing it with curl.
participants (5)
- Brielle Bruns
- Dovid Bender
- Jared Mauch
- Pete Rohrman
- Saku Ytti