On 20 December 2017 at 16:55, Denys Fedoryshchenko <denys@visp.net.lb> wrote:
And for me, it sounds like a faulty aggregation + shaping setup. For example, I once heard that if I do policing on some models of Cisco switch on an aggregated interface with 4 member interfaces, it will install a 25% policer on each interface, and if hashing is done by dst IP only, I will face such an issue. But that is an old and cheap model, as I recall.
One such old and cheap model is the ASR9k: Trident, Typhoon and Tomahawk. It's actually a pretty demanding problem, as technically two linecards, or even just ports sitting on two different NPUs, might as well be different routers; they don't have a good way to communicate with each other about BW use. So a policer of N being installed as N/member_count per link is very typical.

ECMP is a fact of life, and even though few, if any, providers document that they have per-flow limitations which are lower than the nominal rate of the connection you purchased, these exist almost universally. The people most likely to see these limits are people who tunnel everything, so that everything from their, say, 10Gbps is a single flow from the POV of the network. In the IPv6 world, at least, the tunnel encap end could write a hash to the IPv6 flow label, potentially allowing the core to balance tunneled traffic, unless the tunnel itself guarantees order.

I don't think it's fair for an operator to demand equal bandwidth per IP, but you will expose yourself to more problems if you do not have sufficient entropy. We are slowly getting solutions to this: Juniper Trio and BRCM Tomahawk3 can detect elephant flows and dynamically map hash results unequally to physical ports to alleviate the problem.

--
++ytti
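
To put rough numbers on the N/member_count behaviour described above, here is a minimal sketch in Python. It is not how any particular NPU works: the function names are made up for illustration, SHA-256 is only a stand-in for whatever vendor-specific hash the hardware uses, and the addresses are documentation prefixes. It assumes a 10G policer split evenly over 4 bundle members and compares one tunnel flow against 1000 small flows.

```python
import hashlib


def per_member_limit(policer_bps: int, member_count: int) -> int:
    """Rate programmed on each member when the policer is split evenly."""
    return policer_bps // member_count


def pick_member(flow_key: tuple, member_count: int) -> int:
    """Map a flow to a bundle member (SHA-256 is a stand-in for the HW hash)."""
    digest = hashlib.sha256(repr(flow_key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % member_count


def passed_bps(flows, policer_bps: int, member_count: int) -> int:
    """Total rate surviving per-member policing for (flow_key, offered_bps) pairs."""
    limit = per_member_limit(policer_bps, member_count)
    offered = [0] * member_count
    for flow_key, bps in flows:
        # Every packet of a flow hashes to the same member.
        offered[pick_member(flow_key, member_count)] += bps
    # Each member polices independently; they share no state on BW use.
    return sum(min(load, limit) for load in offered)


POLICER = 10_000_000_000  # assumed 10G nominal rate
MEMBERS = 4               # assumed 4-member bundle

# Everything tunneled: one (src, dst, proto, sport, dport) tuple, here GRE.
tunnel = [(("192.0.2.1", "198.51.100.1", 47, 0, 0), POLICER)]

# Same total offered load, spread over 1000 TCP flows with distinct source ports.
spread = [
    (("192.0.2.1", "198.51.100.1", 6, 1024 + i, 443), POLICER // 1000)
    for i in range(1000)
]
```

With these assumptions, passed_bps(tunnel, POLICER, MEMBERS) is capped at the single member's 2.5G share, because the whole tunnel lands on one link. passed_bps(spread, POLICER, MEMBERS) should come out much closer to the nominal rate, with the exact figure depending on how evenly the hash happens to distribute the flows.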