Microsoft Express Routes woes...
This is classic...

We have a direct BGP peering session to Microsoft over ExpressRoute, using public peering for services like email, OneDrive, etc. (this uses MS Public). We have also had MS Azure Public and MS Azure Private in place for a few years now. This has happened to us a few times already: one team at MS makes a routing change, and the other team is either not aware of the change or doesn't communicate it properly.

So late last night, while a change was being made to a completely different area of our network, I was asked to back out my change because our entire organization could not access SharePoint Online or OneDrive.

I had zero evidence it was our change. Further investigation on our border routers revealed that all four (4) of our ISPs were advertising the MS block as a /24 prefix:

A:MY_NAME_CHANGED# show router 1053 bgp routes 13.107.136.0/24
===============================================================================
 BGP Router ID:10.11.0.29       AS:122       Local AS:122
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     Label
      As-Path
-------------------------------------------------------------------------------
u*>?  13.107.136.0/24                                    250         10450
      4.49.118.153                                       None        -
      3356 8075 8068
-------------------------------------------------------------------------------
Routes : 1

HOWEVER, MS ExpressRoute was advertising this:

A:MY_NAME_CHANGED# show router 1053 bgp routes 13.107.136.0/22
===============================================================================
 BGP Router ID:10.11.0.29       AS:122       Local AS:122
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     Label
      As-Path
-------------------------------------------------------------------------------
i     13.107.136.0/22                                    300         0
      157.229.11.66                                      None        -
      12076

So it's another call to MS support, plus escalations, to either get this more specific prefix fixed or find out whether ExpressRoute can advertise the more specific prefix instead of only the /22 block.

*Questions:*

Has anyone else run into this with MS if you have a direct peering session to them?

Has anyone done an audit of the route tables received from MS over ExpressRoute versus what they receive from their ISPs, and noticed the differences?

MS needs to seriously think about:

- Careful coordination of routing changes
- Policies to prevent more specific routes being advertised while larger blocks are advertised over ExpressRoute

Anyway, I am tired, as I have not had much sleep. If you have any comments on this, I would like to hear from you.

-Craig
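The mismatch in the two outputs above matters because forwarding follows the longest matching prefix: the /24 and the /22 are separate routes, so the higher LocalPref on the /22 (300 vs 250) is never compared against the /24, and traffic to 13.107.136.0/24 follows the ISP path. As a rough illustration of the audit asked about in the questions, here is a minimal Python sketch that flags ISP-learned prefixes that are more specific than a covering ExpressRoute prefix. The function name `find_overridden` and the two input lists are hypothetical; a real audit would load the full received-route tables exported from the router.

```python
import ipaddress


def find_overridden(expressroute_prefixes, isp_prefixes):
    """Return (isp_prefix, covering_expressroute_prefix) pairs where the
    ISP-learned route is strictly more specific than an ExpressRoute route
    and would therefore win longest-prefix match."""
    er_nets = [ipaddress.ip_network(p) for p in expressroute_prefixes]
    findings = []
    for p in isp_prefixes:
        isp_net = ipaddress.ip_network(p)
        for er_net in er_nets:
            # Check the IP version first: subnet_of() raises TypeError
            # when asked to compare an IPv4 network with an IPv6 one.
            if (isp_net.version == er_net.version
                    and isp_net.prefixlen > er_net.prefixlen
                    and isp_net.subnet_of(er_net)):
                findings.append((str(isp_net), str(er_net)))
    return findings


# Only the two prefixes shown in the outputs above (illustrative input).
expressroute = ["13.107.136.0/22"]
isp = ["13.107.136.0/24"]

for specific, covering in find_overridden(expressroute, isp):
    print(f"{specific} via ISP overrides {covering} via ExpressRoute")
```

Run against the two prefixes from this incident, the sketch reports the /24 as overriding the /22. It only compares prefix containment and says nothing about which routes are actually valid or installed on the router.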
Hi Craig,

Microsoft apologizes for your routing issue. Our Wide Area Networking and Express Route teams were already aware of this issue from your ticket. I have communicated with their leadership, and they are working on a fix for you. Please feel free to contact me at ingrid.erkman@microsoft.com if you would like to follow up off list.

Thanks,
Ingrid Erkman
Director of Interconnection
Microsoft

On Wed, Jun 27, 2018 at 8:58 AM, Craig <cvuljanic@gmail.com> wrote: