Rogers 2022 network outage, Xona Report
After a July 2022 outage that took the whole Rogers network down for an extended period (bringing down the single-homed Interac payment system with it across Canada), political pressure led the CRTC to open a process to look into it. Part of that process was the commissioning of a report by experts. Following process, redacted portions of the XONA Partners report have been published: https://crtc.gc.ca/otf/eng/2022/8000/c12-202203868.htm (see the 2024-07-31 entry from Rogers Communications Canada; the ZIP file contains the PDF report as well as Rogers' letters justifying its redactions).
On 2024-08-02 21:39, Jean-Francois Mezei wrote:
Following process, redacted portions of the XONA Partners report have been published.
I have some questions on terminology (pardon my newbieness, just wanting to be pedantic on terminology).

"Rogers staff removed the Access Control List policy filter from the configuration of the distribution routers. This consequently resulted in a flood of IP routing information into the core network routers, which triggered the outage."

The report mentions Rogers uses IS-IS as its interior routing protocol, but I'll use the more generic OSPF below.

Questions:

1- I had always heard of routers facing the Internet (and thus doing BGP) as "edge" or "border". Is the term "distribution router" common in the industry?

2- When a border/edge router receives some 980,000 route entries from the transit provider, aren't those packets addressed to the IP address of that edge router with port 179 and going to the router's internal BGP process instead of being routed? (The report makes it look like those packets were left to run wild and propagate onto the intranet due to lack of an ACL.)

Would the rules that define which BGP routes are to be converted to OSPF and then propagated to OSPF peers on the intranet be called an "Access Control List"? If not, what would they be called? (routing policy filter?)

(I have always thought of an ACL as a packet routing rule, not a route building one.)

Would it be fair to state that a large ISP network would use BGP to OSPF route propagation to load balance an upload-heavy site? BGP1 advertises itself to OSPF-A, B and C as the router to talk to for packets destined to the upload-heavy site, while BGP2 does the same for internal routers OSPF-D, E and F?

(Just trying to understand the scope of route information that Rogers's BGP routers would want to send to the internal routing protocol.)

Thanks in advance for any clarifications on the above. Just want to make sure I puch right when I make requests for disclosure of the redacted portions.
Jean-Francois,

Comments inline.

On Fri, Aug 2, 2024, 22:59 Jean-Francois Mezei <jfmezei_nanog@vaxination.ca> wrote:
On 2024-08-02 21:39, Jean-Francois Mezei wrote:
Following process, redacted portions of the XONA Partners report have been published.
I have some questions on terminology (pardon my newbieness, just wanting to be pedantic on terminology).
"Rogers staff removed the Access Control List policy filter from the configuration of the distribution routers. This consequently resulted in a flood of IP routing information into the core network routers, which triggered the outage."
The report mentions Rogers uses IS-IS as its interior routing protocol, but I'll use the more generic OSPF below.
The assumption here is that only IS-IS or OSPF is used. In a dual-stack network, it's possible both are used, one for each protocol family.
Questions:
1- I had always heard of routers facing the Internet (and thus doing BGP) as "edge" or "border". Is the term "distribution router" common in the industry?
The term was common 20+ years ago. Core / distribution / access are common legacy terms.
2- When a border/edge router receives some 980,000 route entries from the transit provider, aren't those packets addressed to the IP address of that edge router with port 179 and going to the router's internal BGP process instead of being routed? (report makes it look those packets were left to run wild and propagate onto the intranet due to lack of ACL).
The Internet edge is not the only edge in complex operator networks. Often services are colocated in central areas of a network. These areas may not be eBGP but rather iBGP.
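To make the distinction raised in question 2 concrete, here is a minimal Python sketch of how a router might decide whether a packet is handed to its own BGP process or forwarded. It is purely illustrative and is not taken from the report: the `Packet` type, the `classify` function and the addresses are all hypothetical. The only protocol fact it relies on is that BGP runs over TCP port 179.

```python
from dataclasses import dataclass
from ipaddress import ip_address

BGP_PORT = 179  # well-known TCP port used by BGP sessions


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified packet header."""
    dst_ip: str
    protocol: str  # "tcp", "udp", ...
    src_port: int
    dst_port: int


def classify(packet, local_addresses):
    """Return where an imaginary router would send this packet.

    "forward"     : destination is not one of the router's own addresses
    "bgp-process" : TCP segment of a BGP session terminating on the router
    "other-local" : any other traffic addressed to the router itself
    """
    local = {ip_address(a) for a in local_addresses}
    if ip_address(packet.dst_ip) not in local:
        return "forward"
    # Either end of the TCP session may be the one using port 179.
    if packet.protocol == "tcp" and BGP_PORT in (packet.src_port, packet.dst_port):
        return "bgp-process"
    return "other-local"
```

Under this simplified model, BGP updates are consumed by the receiving router; what spreads further into the network is the routing information that the router's own policy then chooses to re-advertise, not the original packets.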
Would the rules that define which BGP routes are to be converted to OSPF and then propagated to OSPF peers on the intranet be called an "Access Control List"? If not, what would they be called? (routing policy filter?)
No. I doubt anyone actually exports BGP to an IGP anymore. ACLs are often used for route policy.
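As a rough illustration of an ACL-style list being used as a route policy (matching prefixes in routing announcements rather than packets in flight), consider the following Python sketch. It is an assumption-laden toy, not Rogers's configuration, which is redacted in the report: the function names, the sample prefixes and the idea of modelling a removed filter as `None` are all hypothetical.

```python
from ipaddress import ip_network


def permitted(prefix, allowed_blocks):
    """True if `prefix` falls inside any of the allowed address blocks."""
    net = ip_network(prefix)
    return any(
        net.version == block.version and net.subnet_of(block)
        for block in allowed_blocks
    )


def apply_policy(routes, allowed_blocks=None):
    """Return the routes that would be passed on to the next tier.

    allowed_blocks=None models the policy filter having been removed,
    in which case every received route is passed on unchanged.
    """
    if allowed_blocks is None:
        return list(routes)
    blocks = [ip_network(b) for b in allowed_blocks]
    return [r for r in routes if permitted(r, blocks)]


# Hypothetical received routes: two internal prefixes, two external ones.
received = ["10.0.0.0/24", "10.0.1.0/24", "198.51.100.0/24", "203.0.113.0/24"]

with_filter = apply_policy(received, allowed_blocks=["10.0.0.0/8"])
without_filter = apply_policy(received)
```

With the filter in place only the two internal prefixes are passed on; without it all four are. Scaled up to a full Internet table, that difference is the kind of "flood of IP routing information" the report describes, though the report does not say which protocols or routers were actually involved.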
(I have always thought of an ACL as a packet routing rule, not a route building one.)
Would it be fair to state that a large ISP network would use BGP to OSPF route propagation to load balance an upload-heavy site? BGP1 advertises itself to OSPF-A, B and C as the router to talk to for packets destined to the upload-heavy site, while BGP2 does the same for internal routers OSPF-D, E and F?
(Just trying to understand the scope of route information that Rogers's BGP routers would want to send to the internal routing protocol.)
In short: the way the report was written, you will not be able to decipher what actually happened.
Thanks in advance for any precisions on the above. Just want to make sure I puch right when I make requests for disclosure of the redacted portions.
Victor K
Participants (2): Jean-Francois Mezei, Victor Kuarsingh