I forgot to point out that on Friday 26th, I'll share the results collected through a link or a series of screenshots.

Cheers,

Etienne

On Mon, Feb 22, 2021 at 2:15 PM Pawel Malachowski <pawmal-nanog@freebsd.lublin.pl> wrote:
On Mon, Feb 22, 2021 at 01:01:45PM +0100, Etienne-Victor Depasquale wrote:
It is, after all, Intel's response to the problem of general-purpose scheduling of its processors - which prevents the processor from being viable under high networking loads.
It totally makes sense to busy poll under high networking load. By high networking load I mean roughly > 7 Mpps RX+TX per x86 CPU core.
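To put the 7 Mpps figure in perspective, here is a back-of-the-envelope sketch of the per-packet budget on one core. The 3 GHz clock is an assumption chosen for illustration; it is not a figure from this thread. The function and constant names are likewise hypothetical.

```python
def cycles_per_packet(core_hz: float, packets_per_second: float) -> float:
    """CPU cycles available per packet if one core handles the whole rate."""
    return core_hz / packets_per_second


def nanoseconds_per_packet(packets_per_second: float) -> float:
    """Wall-clock time available per packet at the given rate."""
    return 1e9 / packets_per_second


ASSUMED_CORE_HZ = 3.0e9  # assumption: 3 GHz core clock, not stated in the thread
THRESHOLD_PPS = 7.0e6    # "roughly > 7 Mpps RX+TX" per core, from the message

budget_cycles = cycles_per_packet(ASSUMED_CORE_HZ, THRESHOLD_PPS)
budget_ns = nanoseconds_per_packet(THRESHOLD_PPS)

print(f"{budget_cycles:.0f} cycles per packet, {budget_ns:.0f} ns per packet")
```

Under that assumption the budget is a few hundred cycles per packet. That leaves little room for interrupt handling or context switches, which is the usual argument for polling at such rates.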
I partially agree it may be hard to mix DPDK and non-DPDK workloads on a single CPU, not only because the dataplane application needs advanced power management logic, but also due to LLC thrashing. It heavily depends on the use case and dataset sizes: for example, an optimised FIB may fit nicely into cache and use only a tiny, hot part of the dataset, but a CGNAT Mflow mapping likely won't fit. For such a use case I would recommend a dedicated CPU or cache partitioning (CAT), if available.
For low-volume traffic, such as 20-40G of IMIX, one can dedicate e.g. 2 cores and interleave busy polling with halt instructions to lower CPU usage significantly (~60-80% core underutilisation).
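The interleaving idea can be sketched as a small tick-based simulation. This is only an illustration of the control logic, not DPDK code: the function name, the `idle_threshold` and `halt_ticks` parameters and their defaults are all hypothetical, and a "halted" tick merely stands in for whatever halt or wait instruction a real dataplane would issue.

```python
def run_poll_loop(arrivals, idle_threshold=4, halt_ticks=8):
    """Simulate a poll loop that halts after a run of empty polls.

    arrivals[t] is the number of packets that arrive at tick t.
    Returns (packets_processed, busy_ticks, halted_ticks).
    """
    queue = 0          # packets waiting in the RX queue
    processed = 0      # packets handled so far
    busy = 0           # ticks spent polling
    halted = 0         # ticks spent halted (stand-in for halt instructions)
    empty_polls = 0    # consecutive polls that found nothing
    halt_left = 0      # remaining ticks of the current halt period

    for n in arrivals:
        queue += n
        if halt_left > 0:
            # Core is halted: packets accumulate until it wakes up.
            halt_left -= 1
            halted += 1
            continue
        busy += 1
        if queue:
            # Drain everything that accumulated, as one burst.
            processed += queue
            queue = 0
            empty_polls = 0
        else:
            empty_polls += 1
            if empty_polls >= idle_threshold:
                # Enough empty polls in a row: halt for a while.
                halt_left = halt_ticks
                empty_polls = 0
    return processed, busy, halted


# Hypothetical traffic: one packet, then silence for 11 ticks.
result = run_poll_loop([1] + [0] * 11)
print(result)
```

The trade-off in such a scheme is latency against power: a longer halt period saves more cycles, but packets arriving during the halt wait until the core wakes up.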
-- Pawel Malachowski @pawmal80
--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale