On 2018-07-15 19:00, Raymond Burkholder wrote:
On 07/15/2018 09:03 AM, Denys Fedoryshchenko wrote:
On 2018-07-14 22:05, Baldur Norddahl wrote:
I have considered OpenFlow and might do that. We have OpenFlow-capable switches and I may be able to offload the work to the switch hardware. But I also consider this solution harder to get right than the idea of using Linux with tap devices. Also, it appears that Open vSwitch implements a different flavour of OpenFlow than the hardware switch (the hardware is limited to some fixed tables that Broadcom made up), so I might not be able to start with the software and then move on to hardware.
AFAIK OpenFlow is suitable for datacenters, but it doesn't scale well for subscriber termination. You will run out of TCAM much sooner than you expect.
Denys, could you expand on this? In a Linux-based solution (say with OVS), the "TCAM" is memory/software based, and following their dev threads, they have been continuously optimizing the flow caches for various kinds of traffic: megaflows, tiny flows, flow quantity and variety, caching, ...
When you mate OVS with something like a Mellanox Spectrum switch (via SwitchDev) for hardware-based forwarding, I could see certain hardware limitations applying, but I don't have first-hand experience with that.
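For the pure-software case, the equivalent of TCAM pressure can at least be observed and tuned from userspace. A rough sketch, assuming a recent OVS (command names may differ by version):

  ovs-appctl upcall/show          # current/avg/max datapath flows and the flow limit
  ovs-dpctl dump-flows | wc -l    # megaflow entries currently installed in the kernel cache
  ovs-vsctl set Open_vSwitch . other_config:flow-limit=400000   # raise the cap if the box has the memory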
But I suppose you will see these TCAM issues only on hardware-only, specialized OpenFlow switches.
Yes, definitely only on hardware switches, and the biggest issue is that it is vendor- and hardware-dependent. This means that if I find the "right" switch and make my solution depend on it, and the vendor then decides to issue a new hardware revision, or even new firmware, there is no guarantee the "unusual" setup will keep working. That is what makes many people afraid to use it.
OpenFlow, IMO, is by nature built for complex matching; with the typical 12-tuple match, switches top out at roughly 750-4000 entries. If you drop down to L2-only matching (which, at the time I tested, was in my experience possible only on the PF5820), it can go to about 80k flows. But again, sticking to a specific vendor is not recommended. As for OVS, I didn't look at it much, as I thought it was not suitable for BNG purposes, such as terminating tens of thousands of users; I assumed it was more about high-speed switching for tens of VMs.
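Just to make the width difference concrete, here is roughly what the two extremes look like, using ovs-ofctl syntax purely as notation (illustrative values; a hardware pipeline obviously encodes this differently):

  # classic 12-tuple-style match: one wide TCAM entry per subscriber flow
  ovs-ofctl add-flow br0 "in_port=1,dl_vlan=200,dl_vlan_pcp=0,dl_src=00:11:22:33:44:55,dl_dst=66:77:88:99:aa:bb,dl_type=0x0800,nw_src=100.64.1.10,nw_dst=198.51.100.7,nw_proto=6,nw_tos=0,tp_src=40000,tp_dst=80,actions=output:2"

  # L2-only match: narrow entry, the kind a switch can hold ~80k of
  ovs-ofctl add-flow br0 "dl_dst=66:77:88:99:aa:bb,actions=output:2"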
For edge-based translations, is hardware-based forwarding actually necessary, given that so many software functions are being performed anyway?
IMO, at the moment, 20-40G on a single box is the boundary beyond which packet forwarding is preferably (but still not necessarily) done in hardware, as passing packets through the whole Linux stack is really not the best option. But it works. I'm trying to find an alternative solution that bypasses the full stack using XDP, so I can go beyond 40G.
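As a rough idea of what that bypass looks like (a minimal sketch only, not my actual code; the program name and the XDP_TX decision are placeholders), the per-packet hook is a small C program attached at the NIC driver level:

/* compile with: clang -O2 -target bpf -c xdp_fastpath.c -o xdp_fastpath.o */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int bng_fastpath(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* bounds checks are mandatory before touching packet bytes */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    /* anything that is not plain IPv4 (ARP, PPPoE, IPv6, ...) goes up
     * the normal Linux stack */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *iph = (void *)(eth + 1);
    if ((void *)(iph + 1) > data_end)
        return XDP_PASS;

    /* a real BNG fast path would look up the subscriber and rewrite MACs
     * here; XDP_TX just sends the frame back out the same NIC as a
     * placeholder for "forwarded without touching the stack" */
    return XDP_TX;
}

char _license[] SEC("license") = "GPL";

Attached with something like "ip link set dev eth0 xdp obj xdp_fastpath.o sec xdp", it runs before any sk_buff is allocated, which is where the headroom beyond 40G is supposed to come from.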
But then, it is conceivable that buying a number of servers and spreading the load across them will provide some resiliency and come in at a lower cost than putting in 'big iron' anyway.
Because then there are some additional benefits: you can run Network Function Virtualization at the edge and provide additional services to customers.
+1 For IPoE/PPPoE, servers scale very well, while with "hardware" you will eventually hit a limit on how many line cards you can put in a chassis, and then you need to buy a new chassis. And that is not counting the countless non-obvious limitations you might hit inside the chassis (in the fairly old Cisco 6500/7600, which is not EOL, it is a nightmare). If an ISP has a big enough chassis, it needs to remember that it also needs a second one at the same site, preferably with the same number of line cards, whereas with servers you are more resilient even with N+M redundancy (where M is, for example, N/4). Also, when premium customers ask me for unusual things, it is much easier to move them to separate nodes with extended termination options, where I can implement their demands via a custom vCPE.