GBLX router upgrade breaks bgp sessions
Subject says it all. GBLX upgraded some edge routers to a new JunOS release (possibly 5.3 rev 24), and now our BGP sessions continually reset with:
Jul 10 06:58:24 MST: %BGP-3-NOTIFICATION: sent to neighbor X.X.X.X 3/3 (update missing required attributes) 0 bytes
Anyone clueful at GBLX listening? We've been down for about 4 hours, and the NOC (call center) people are less than helpful.
bill
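For readers decoding the log line above: the "3/3" is the NOTIFICATION error code and subcode. A minimal sketch follows, assuming the Cisco-style log format shown in this message. The tables use the code and subcode names from the BGP-4 specification; the function name `decode_notification` and the regular expression are illustrative, not part of any router software.

```python
import re
from typing import Optional, Tuple

# NOTIFICATION error codes as named in the BGP-4 specification.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Subcodes that apply when the error code is 3 (UPDATE Message Error).
UPDATE_SUBCODES = {
    1: "Malformed Attribute List",
    2: "Unrecognized Well-known Attribute",
    3: "Missing Well-known Attribute",
    4: "Attribute Flags Error",
    5: "Attribute Length Error",
    6: "Invalid ORIGIN Attribute",
    7: "AS Routing Loop (deprecated)",
    8: "Invalid NEXT_HOP Attribute",
    9: "Optional Attribute Error",
    10: "Invalid Network Field",
    11: "Malformed AS_PATH",
}

# Assumed log shape: "... sent to neighbor <address> <code>/<subcode> (...)".
LOG_PATTERN = re.compile(r"neighbor \S+ (\d+)/(\d+)")


def decode_notification(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (error name, subcode name) for a NOTIFICATION log line, or None."""
    match = LOG_PATTERN.search(log_line)
    if match is None:
        return None
    code, subcode = int(match.group(1)), int(match.group(2))
    error = ERROR_CODES.get(code, f"Unknown error code {code}")
    if code == 3:
        detail = UPDATE_SUBCODES.get(subcode, f"Unknown subcode {subcode}")
    else:
        detail = f"subcode {subcode}"
    return error, detail


line = ("Jul 10 06:58:24 MST: %BGP-3-NOTIFICATION: sent to neighbor X.X.X.X "
        "3/3 (update missing required attributes) 0 bytes")
print(decode_notification(line))
```

So the local router is the one rejecting the peer's UPDATE, on the grounds that a well-known attribute it considers mandatory is absent.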
On Wed, Jul 10, 2002 at 07:04:38AM -0700, nanog wrote:
Subject says it all. GBLX upgraded some edge routers to a new JunOS release (possibly 5.3 rev 24)- and now our bgp sessions continually reset with:
Jul 10 06:58:24 MST: %BGP-3-NOTIFICATION: sent to neighbor X.X.X.X 3/3 (update missing required attributes) 0 bytes
I don't know about GBLX, but I saw a problem like this at our border. After JunOS was upgraded to 5.3r2.4 (other side IOS), the session was continually being reset. The BGP session between these two peers was set up with family inet any (for multicast peering), and when that was removed, the problem went away. I also heard about a possibly related problem that I2 was having with their Juniper code, but I haven't investigated the details yet.
John
On Wed, Jul 10, 2002 at 09:17:56AM -0500, John Kristoff wrote:
I don't know about GBLX, but I saw a problem like this at our border. After JunOS was upgraded to 5.3r2.4 (other side IOS), the session was continually being reset. The BGP session between these two peers was set up with family inet any (for multicast peering), and when that was removed, the problem went away. I also heard about a possibly related problem that I2 was having with their Juniper code, but I haven't investigated the details yet.
John
That was it. A quick TAC case later (about 10 minutes' turnaround from problem submission to resolution: upgrade IOS or remove multicast from the BGP peer), the problem is fixed. I removed multicast since it was not required on this peer, and will schedule the IOS upgrade during a more friendly maintenance window.
GBLX, however, has not returned my call since I opened a high-priority, customer-down ticket about 1.5 hours ago. Like all other support calls to their NOC, this seems to have disappeared into never-never land. I love the GBLX network when it works, but god help you if you ever need to talk to a clueful NOC person to fix a problem (especially after hours).
bill
Can you provide any details as to why you had to "remove multicast"? Do you mean remove MBGP, or is there more?
nanog wrote:
That was it- A quick TAC case later (about 10 minutes turnaround from problem submission to resolution- upgrade IOS or remove multicast from bgp peer) and the problem is fixed. I removed multicast since it was not required on this peer, and will schedule the IOS upgrade during a more friendly maintenance window.
GBLX, however, has not returned my call since I opened a high priority, customer down ticket about 1.5 hours ago. Like all other support calls to their NOC, this seems to have disappeared into nevernever land. I love the GBLX network when it works, but god help you if you ever need to talk to a clueful NOC person to fix a problem (especially after hours.)
bill
-- Regards Marshall Eubanks
T.M. Eubanks Multicast Technologies, Inc 10301 Democracy Lane, Suite 410 Fairfax, Virginia 22030 Phone : 703-293-9624 Fax : 703-293-9609 e-mail : tme@multicasttech.com http://www.multicasttech.com
Test your network for multicast : http://www.multicasttech.com/mt/ Status of Multicast on the Web : http://www.multicasttech.com/status/index.html
Yes, removing MBGP from the neighbor statement. Sorry for the ambiguity.
bill
On Wed, Jul 10, 2002 at 12:58:30PM -0400, Marshall Eubanks wrote:
Can you provide any details as to why you had to "remove multicast" - do you mean, remove MBGP ? Or is there more?
On Wed, Jul 10, 2002 at 07:04:38AM -0700, nanog wrote:
Subject says it all. GBLX upgraded some edge routers to a new JunOS release (possibly 5.3 rev 24)- and now our bgp sessions continually reset with:
Jul 10 06:58:24 MST: %BGP-3-NOTIFICATION: sent to neighbor X.X.X.X 3/3 (update missing required attributes) 0 bytes
Anyone clueful at GBLX listening? We've been down for about 4 hours, and the NOC (call center) people are less than helpful.
This sounds an awful lot like a problem we saw a while back when upgrading from JUNOS 4.x to 5.x. At some point (I don't remember exactly when, but the details should be in the case notes of PR.19592) Juniper implemented a change that makes their box compliant with RFC 2858. However, when speaking BGP with a non-RFC-compliant box (such as a Cisco running something like 12.0(15)S), the session flaps continuously in the manner you describe, because the other box expects the NEXT_HOP attribute to be present in every update message.
Quoting from an email exchange I had with our Juniper rep:
"The result of the change is that JUNOS no longer sends the NEXT_HOP attribute in an UPDATE message if only the MP_REACH_NLRI attribute is present. A workaround is to only use family inet unicast instead of multicast or any on all BGP sessions to those cisco routers or upgrade all of the cisco routers."
You might try forcing 'nlri uni' on your side to see if that does anything.
Interested parties may wish to have a look at PR.22527, which was opened at our request and adds a knob to revert to the non-RFC-compliant behavior, which is useful during a transition period.
--Jeff
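The mismatch described above can be sketched in a few lines. This is an illustration only, not either vendor's implementation: it parses the Path Attributes field of a hand-built UPDATE that carries ORIGIN, AS_PATH, and MP_REACH_NLRI but no NEXT_HOP, then applies two receiver policies. The function names, the `rfc2858_aware` flag, and the sample addresses and AS number are all assumptions made for the example; the attribute type codes are the standard ones.

```python
from typing import Dict, Optional

# Standard path attribute type codes.
ORIGIN, AS_PATH, NEXT_HOP, MP_REACH_NLRI = 1, 2, 3, 14
EXTENDED_LENGTH = 0x10  # flag bit: the attribute length field is two octets


def encode_attribute(flags: int, type_code: int, value: bytes) -> bytes:
    """Encode one path attribute with a one-octet length (short values only)."""
    return bytes([flags, type_code, len(value)]) + value


def parse_path_attributes(data: bytes) -> Dict[int, bytes]:
    """Split a Path Attributes field into {type code: raw value}."""
    attrs: Dict[int, bytes] = {}
    i = 0
    while i < len(data):
        flags, type_code = data[i], data[i + 1]
        if flags & EXTENDED_LENGTH:
            length = int.from_bytes(data[i + 2:i + 4], "big")
            i += 4
        else:
            length = data[i + 2]
            i += 3
        attrs[type_code] = data[i:i + length]
        i += length
    return attrs


def missing_attribute(attrs: Dict[int, bytes], has_classic_nlri: bool,
                      rfc2858_aware: bool) -> Optional[int]:
    """Return the type code of the first attribute the receiver finds missing.

    Hypothetical policy model: an RFC 2858-aware receiver wants NEXT_HOP only
    when the UPDATE carries NLRI in the traditional NLRI field; a legacy
    receiver wants it in every UPDATE that advertises routes.
    """
    if not has_classic_nlri and MP_REACH_NLRI not in attrs:
        return None  # withdraw-only UPDATE: nothing is mandatory
    required = [ORIGIN, AS_PATH]
    if has_classic_nlri or not rfc2858_aware:
        required.append(NEXT_HOP)
    for code in required:
        if code not in attrs:
            return code
    return None


# Sample UPDATE attributes: multicast route only, so no NEXT_HOP attribute.
mp_reach = (
    b"\x00\x01"            # AFI 1 (IPv4)
    b"\x02"                # SAFI 2 (multicast)
    b"\x04\xc0\x00\x02\x01"  # next hop length 4, next hop 192.0.2.1
    b"\x00"                # number of SNPAs
    b"\x18\xc6\x33\x64"    # prefix 198.51.100.0/24
)
as_path = b"\x02\x01" + (65001).to_bytes(2, "big")  # AS_SEQUENCE of one AS
update_attrs = (
    encode_attribute(0x40, ORIGIN, b"\x00")       # ORIGIN = IGP
    + encode_attribute(0x40, AS_PATH, as_path)
    + encode_attribute(0x80, MP_REACH_NLRI, mp_reach)
)

attrs = parse_path_attributes(update_attrs)
print("legacy receiver misses:", missing_attribute(attrs, False, rfc2858_aware=False))
print("RFC 2858 receiver misses:", missing_attribute(attrs, False, rfc2858_aware=True))
```

In this model the legacy receiver reports attribute type 3 (NEXT_HOP) as missing, which corresponds to the 3/3 NOTIFICATION in the original report, while the RFC 2858-aware receiver accepts the same UPDATE. Restricting the session to unicast avoids the case because ordinary unicast UPDATEs carry both the traditional NLRI field and NEXT_HOP.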
<snip>
This sounds an awful lot like a problem we saw a while back when upgrading from JUNOS 4.x to 5.x. At some point (I don't remember exactly when, but the details should be in the case notes of PR.19592) Juniper implemented a change that makes their box compliant with RFC 2858. However, when speaking BGP with a non-RFC-compliant box (such as a Cisco running something like 12.0(15)S), the session flaps continuously in the manner you describe, because the other box expects the NEXT_HOP attribute to be present in every update message.
"The result of the change is that JUNOS no longer sends the NEXT_HOP attribute in an UPDATE message if only the MP_REACH_NLRI attribute is present. A workaround is to only use family inet unicast instead of multicast or any on all BGP sessions to those cisco routers or upgrade all of the cisco routers."
<snip>
Anyone have a Cisco BugID for this?
On Tue, Aug 06, 2002 at 03:11:21PM -0400, bdragon@gweep.net wrote:
<snip>
This sounds an awful lot like a problem we saw a while back when upgrading from JUNOS 4.x to 5.x. At some point (I don't remember exactly when, but the details should be in the case notes of PR.19592) Juniper implemented a change that makes their box compliant with RFC 2858. However, when speaking BGP with a <snip>
Anyone have a Cisco BugID for this?
CSCdv74675
bill
participants (5)
- bdragon@gweep.net
- Jeff Aitken
- John Kristoff
- Marshall Eubanks
- nanog