Re: few big monolithic PEs vs many small PEs

27 Jun 2019

I've run into many providers that had routers in the top 10 or 15 markets... and that was it. If you wanted a connection in South Bend or Indianapolis or New Orleans or Ohio or... you were backhauled, potentially hundreds of miles, to a nearby big market.

More, smaller POPs reduce the tromboning.

More, smaller POPs also mean that one POP's outage isn't as disastrous for the traffic that has to reroute around it.
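
As a rough back-of-the-envelope illustration of the outage point, here is a minimal Python sketch. It assumes traffic is spread evenly across POPs and that a failed POP's traffic is redistributed evenly across the survivors; real networks are rarely that tidy, and the function name and the figures are made up for the example.

```python
def outage_impact(total_traffic_gbps: float, pop_count: int) -> tuple[float, float]:
    """Return (displaced traffic in Gbps, extra load per surviving POP as a fraction).

    Assumption: traffic is evenly spread over all POPs, and the traffic of a
    failed POP is shared evenly by the remaining ones.
    """
    if pop_count < 2:
        raise ValueError("need at least two POPs to reroute around an outage")
    # Traffic that has to find another path when one POP goes dark.
    displaced = total_traffic_gbps / pop_count
    # Each survivor normally carries total/pop_count and picks up
    # displaced/(pop_count - 1), i.e. 1/(pop_count - 1) of its own load.
    extra_fraction = 1 / (pop_count - 1)
    return displaced, extra_fraction


# Hypothetical network carrying 1000 Gbps in total.
for pops in (10, 50):
    displaced, extra = outage_impact(1000, pops)
    print(f"{pops} POPs: {displaced:.0f} Gbps displaced, "
          f"+{extra:.1%} load on each surviving POP")
```

Under those assumptions, going from 10 POPs to 50 cuts the traffic displaced by a single outage to a fifth, and the extra load each surviving POP has to absorb drops from roughly 11% to roughly 2%.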

----- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

----- Original Message -----

From: adamv0025@netconsultings.com 
To: nanog@nanog.org 
Sent: Wednesday, June 19, 2019 3:22:45 PM 
Subject: few big monolithic PEs vs many small PEs 

Hi folks, 

Recently I ran into a peculiar situation where we had to cap a couple of PEs
even though only half of the rather big chassis was populated with
cards, the reason being that the central RE/RP was not able to cope with the
combined number of routes/VRFs/BGP sessions/etc.

So this made me think about the best strategy in building out SP-Edge 
nowadays (yes I'm aware of the centralize/decentralize pendulum swinging 
every couple of years). 
The conclusion I came to was that *currently the best approach would be to
use several medium to small (fixed) PEs to replace a big monolithic
chassis-based system.
So what I was thinking is:
Yes, it will cost a bit more (a router is more expensive than an LC).
We will end up with more prefixes in the IGP, more BGP sessions, etc. - don't care.
But the benefits are fewer eggs in one basket, simplified and hence faster
testing in the case of specialized PEs, and obviously a better RP CPU/MEM to
port ratio.
Am I missing anything please? 
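
To make the RP-to-port ratio argument concrete, here is a small Python sketch. All names and numbers are hypothetical (no vendor data behind them): it simply assumes each customer port contributes a fixed amount of control-plane state and that an RE/RP has a fixed route capacity, then works out how many ports can be used before the RP becomes the limit.

```python
def usable_ports(physical_ports: int, routes_per_port: int, rp_route_capacity: int) -> int:
    """Ports that can be put in service before the RE/RP runs out of headroom.

    Assumption: control-plane load scales linearly with the number of ports,
    at routes_per_port routes of state per port.
    """
    rp_limited = rp_route_capacity // routes_per_port
    return min(physical_ports, rp_limited)


def blast_radius(node_count: int) -> float:
    """Fraction of all customer ports lost when one of node_count equal PEs fails."""
    return 1 / node_count


# Hypothetical big chassis: 400 ports, RP good for 2,000,000 routes.
big = usable_ports(physical_ports=400, routes_per_port=10_000, rp_route_capacity=2_000_000)
# Hypothetical small fixed PE: 48 ports, RP good for 1,000,000 routes.
small = usable_ports(physical_ports=48, routes_per_port=10_000, rp_route_capacity=1_000_000)

print(f"big chassis: {big} of 400 ports usable")
print(f"small PE:    {small} of 48 ports usable")
print(f"share of ports lost per node failure with 8 small PEs: {blast_radius(8):.1%}")
```

With these made-up figures the big chassis is capped at half its ports by the RP, while the small PE fills every port with control-plane headroom to spare, which is the same shape as the situation described above.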

*currently:
Yes, some old chassis systems or even multi-chassis systems used to support
additional RPs and offloading some of the processes (e.g. BGP) onto those.
The problem is that these are custom hacks and still a single OS, which needs
the LCs/ASICs rebooted when it is upgraded, so the problem of too many eggs in
one basket still exists (yes, Cisco NCS6k and recent ASR9k Lightspeed LCs are
an exception).
And yes, there is the "node-slicing" approach from Juniper, where one can
offload the CP onto multiple x86 servers and assign LCs to each server (virtual
node), which would solve my chassis-full problem. But honestly, how many of
you are running such a setup? Exactly. And that's why I'd be hesitant to
deploy this solution in production just yet. I don't know of any other
vendor solution like this one, but who knows, maybe in 5 years this is going
to be the new standard. Anyway, I need a solution/strategy for the next 3-5
years.

I would like to hear your thoughts on this conundrum.

adam 

netconsultings.com 
::carrier-class solutions for the telecommunications industry::