Hello folks,

I've just followed a thread regarding use of CGNAT and noted a suggestion (regarding DANOS) that includes use of DPDK.

As I'm interested in the breadth of adoption of DPDK, and as I'm a researcher into energy and power efficiency, I'd love to hear your feedback on your use of power consumption control by DPDK.

I've drawn up a bare-bones, 2-question survey at this link:
https://www.surveymonkey.com/r/J886DPY
Responses have been set to anonymous.

Cheers,
Etienne

--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale
I'm very happy to see interest in DPDK and power consumption.

But IMHO, the questions do not cover the actual reality of DPDK. That characteristic of "100% CPU" depends on several aspects, such as:
- How old the hardware running DPDK is.
- What type of DPDK workload is run (very dynamic, such as stateful CGNAT, or static ACLs?).
- Whether or not DPDK input/drop/forwarding measurements are used.
- Whether CPU affinity is set according to traffic demand.
- SR-IOV (sharing resources) on DPDK.

The way I see it, the questions lead the public to conclude that DPDK ALWAYS has 100% CPU usage, which is not true.

On Mon, Feb 22, 2021 at 04:30, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
--
Douglas Fernando Fischer
Engº de Controle e Automação
The way I see it, the questions lead the public to conclude that DPDK ALWAYS has 100% CPU usage, which is not true.
I don't concur.

Every research paper I've read indicates that, regardless of whether it has packets to process or not, DPDK PMDs (poll-mode drivers) prevent the CPU from falling into an LPI (low-power idle). When it has no packets to process, the PMD runs the processor in a polling loop that keeps utilization of the running core at 100%.

Cheers,
Etienne

On Mon, Feb 22, 2021 at 12:33 PM Douglas Fischer <fischerdouglas@gmail.com> wrote:
Here are a few references.

Strictly speaking, DPDK and SR-IOV are orthogonal. DPDK is intended to facilitate cloud-native operation through hardware independence. SR-IOV presumes SR-IOV-compliant hardware.

[1] Z. Xu, F. Liu, T. Wang, and H. Xu, "Demystifying the energy efficiency of Network Function Virtualization," in 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS), Jun. 2016, pp. 1–10. DOI: 10.1109/IWQoS.2016.7590429.
[2] S. Fu, J. Liu, and W. Zhu, "Multimedia Content Delivery with Network Function Virtualization: The Energy Perspective," IEEE MultiMedia, vol. 24, no. 3, pp. 38–47, 2017, ISSN: 1941-0166. DOI: 10.1109/MMUL.2017.3051514.
[3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, "Towards Power Efficient High Performance Packet I/O," IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp. 981–996, April 2020, ISSN: 1558-2183. DOI: 10.1109/TPDS.2019.2957746.
[4] G. Li, D. Zhang, Y. Li, and K. Li, "Toward energy efficiency optimization of pktgen-DPDK for green network testbeds," China Communications, vol. 15, no. 11, pp. 199–207, November 2018, ISSN: 1673-5447. DOI: 10.1109/CC.2018.8543100.

On Mon, Feb 22, 2021 at 12:45 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
On Mon, Feb 22, 2021 at 12:45:52 PM +0100, Etienne-Victor Depasquale wrote:
Every research paper I've read indicates that, regardless of whether it has packets to process or not, DPDK PMDs (poll-mode drivers) prevent the CPU from falling into an LPI (low-power idle).
When it has no packets to process, the PMD runs the processor in a polling loop that keeps utilization of the running core at 100%.
No, it is not the PMD that runs the processor in a polling loop. It is the application itself, which may or may not busy-loop, depending on the application programmer's choice.

--
Pawel Malachowski
@pawmal80
No, it is not the PMD that runs the processor in a polling loop. It is the application itself, which may or may not busy-loop, depending on the application programmer's choice.
From one of my earlier references [2]:
"we found that a poll mode driver (PMD) thread accounted for approximately 99.7 percent CPU occupancy (a full core utilization)."

And further on:

"we found that the thread kept spinning on the following code block:

    for ( ; ; ) {
        for (i = 0; i < poll_cnt; i++) {
            dp_netdev_process_rxq_port(pmd, poll_list[i].port, poll_list[i].rx);
        }
    }

This indicates that the thread was continuously monitoring and executing the receiving data path."

[2] S. Fu, J. Liu, and W. Zhu, "Multimedia Content Delivery with Network Function Virtualization: The Energy Perspective," IEEE MultiMedia, vol. 24, no. 3, pp. 38–47, 2017, ISSN: 1941-0166. DOI: 10.1109/MMUL.2017.3051514.

On Tue, Feb 23, 2021 at 12:59 PM Pawel Malachowski <pawmal-nanog@freebsd.lublin.pl> wrote:
Etienne-Victor Depasquale wrote on 23/02/2021 16:03:
"we found that a poll mode driver (PMD) thread accounted for approximately 99.7 percent CPU occupancy (a full core utilization)."
Interrupt-driven network drivers generally can't compete with polled-mode drivers at higher throughputs on generic CPU / PCI card systems. On this style of config, you optimise your driver parameters based on what works best under the specific conditions.

Polled-mode drivers have been around for a while, e.g.
https://svnweb.freebsd.org/base?view=revision&revision=87902
Nick
For use cases where DPDK matters, are you really concerned with power consumption?

On Tue, Feb 23, 2021 at 11:48 AM Nick Hilliard <nick@foobar.org> wrote:
Shane Ronan wrote on 23/02/2021 16:59:
For use cases where DPDK matters, are you really concerned with power consumption?
Probably yeah. Have you assessed the lifetime cost of running a multicore CPU at 100% vs at 10%, particularly as you're likely to have multiples of these devices in operation?

Nick
Probably yeah. Have you assessed the lifetime cost of running a multicore CPU at 100% vs at 10%, particularly as you're likely to have multiples of these devices in operation?
Spot on.

On Tue, Feb 23, 2021 at 6:07 PM Nick Hilliard <nick@foobar.org> wrote:
No, it is not the PMD that runs the processor in a polling loop. It is the application itself, which may or may not busy-loop, depending on the application programmer's choice.
From one of my earlier references [2]:
"we found that a poll mode driver (PMD) thread accounted for approximately 99.7 percent CPU occupancy (a full core utilization)."
And further on:
"we found that the thread kept spinning on the following code block:
    for ( ; ; ) {
        for (i = 0; i < poll_cnt; i++) {
            dp_netdev_process_rxq_port(pmd, poll_list[i].port, poll_list[i].rx);
        }
    }

This indicates that the thread was continuously monitoring and executing the receiving data path."
This comes from OVS code and shows an OVS thread spinning, not the DPDK PMD. Blame the OVS application for not using e.g. _mm_pause() and burning the CPU like crazy.

For comparison, take a look at top+i7z output from a DPDK-based 100G DDoS scrubber currently lifting some low traffic using cores 1-13 on a 16-core host. It uses naive DPDK rte_pause() throttling to enter C1.

Tasks: 342 total, 1 running, 195 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.6 us, 0.6 sy, 0.0 ni, 89.7 id, 3.1 wa, 0.0 hi, 0.0 si, 0.0 st

Core [core-id]   Actual Freq (Mult.)   C0%   Halt(C1)%   C3%   C6%   Temp   VCore
Core 1  [0]:     1467.73 (14.68x)      2.15  5.35        1     92.3  43     0.6724
Core 2  [1]:     1201.09 (12.01x)      11.7  93.9        0     0     39     0.6575
Core 3  [2]:     1200.06 (12.00x)      11.8  93.8        0     0     42     0.6543
Core 4  [3]:     1200.14 (12.00x)      11.8  93.8        0     0     41     0.6549
Core 5  [4]:     1200.10 (12.00x)      11.8  93.8        0     0     41     0.6526
Core 6  [5]:     1200.12 (12.00x)      11.8  93.8        0     0     40     0.6559
Core 7  [6]:     1201.01 (12.01x)      11.8  93.8        0     0     41     0.6559
Core 8  [7]:     1201.02 (12.01x)      11.8  93.8        0     0     43     0.6525
Core 9  [8]:     1201.00 (12.01x)      11.8  93.8        0     0     41     0.6857
Core 10 [9]:     1201.04 (12.01x)      11.8  93.8        0     0     40     0.6541
Core 11 [10]:    1201.95 (12.02x)      13.6  92.9        0     0     40     0.6558
Core 12 [11]:    1201.02 (12.01x)      11.8  93.8        0     0     42     0.6526
Core 13 [12]:    1204.97 (12.05x)      17.6  90.8        0     0     45     0.6814
Core 14 [13]:    1248.39 (12.48x)      28.2  84.7        0     0     41     0.6855
Core 15 [14]:    2790.74 (27.91x)      91.9  0           1     1     41     0.8885  <-- not PMD
Core 16 [15]:    1262.29 (12.62x)      13.1  34.9        1.7   56.2  43     0.6616

$ dataplanectl stats fcore | grep total
fcore total idle 393788223887 work 860443658 (0.2%) (forced-idle 7458486526622)
  recv 202201388561 drop 61259353721 (30.3%) limit 269909758 (0.1%) pass 140606076622 (69.6%)
  ingress 66048460 (0.0%/0.0%) sent 162580376914 (80.4%/100.0%) overflow 0 (0.0%)
  sampled 628488188/628488188

--
Pawel Malachowski
@pawmal80
This comes from OVS code and shows an OVS thread spinning, not the DPDK PMD. Blame the OVS application for not using e.g. _mm_pause() and burning the CPU like crazy.
OK, I'm citing a bit more from the same reference:

"By tracing back to the function's caller in the PMD thread main(void *f_), we found that the thread kept spinning on the following code block:

    for ( ; ; ) {
        for (i = 0; i < poll_cnt; i++) {
            dp_netdev_process_rxq_port(pmd, poll_list[i].port, poll_list[i].rx);
        }
    }

This indicates that the [PMD] thread was continuously monitoring and executing the receiving data path."

Cheers,
Etienne

On Tue, Feb 23, 2021 at 10:33 PM Pawel Malachowski <pawmal-nanog@freebsd.lublin.pl> wrote:
On Mon, Feb 22, 2021 at 08:33:35 AM -0300, Douglas Fischer wrote:
But IMHO, the questions do not cover the actual reality of DPDK. That characteristic of "100% CPU" depends on several aspects, such as:
- How old the hardware running DPDK is.
- What type of DPDK workload is run (very dynamic, such as stateful CGNAT, or static ACLs?).
- Whether or not DPDK input/drop/forwarding measurements are used.
- Whether CPU affinity is set according to traffic demand.
- SR-IOV (sharing resources) on DPDK.
It consumes 100% only if you busy-poll (which is the default approach). One can switch between polling and interrupts (or monitor, if supported), or introduce halt instructions, in case of low/medium traffic volume.

--
Pawel Malachowski
@pawmal80
It consumes 100% only if you busy poll (which is the default approach).
Precisely. It is, after all, Intel's response to the problem of general-purpose scheduling of its processors - which prevents the processor from being viable under high networking loads.

Cheers,
Etienne

On Mon, Feb 22, 2021 at 12:58 PM Pawel Malachowski <pawmal-nanog@freebsd.lublin.pl> wrote:
On Mon, Feb 22, 2021 at 01:01:45 PM +0100, Etienne-Victor Depasquale wrote:
It is, after all, Intel's response to the problem of general-purpose scheduling of its processors - which prevents the processor from being viable under high networking loads.
It totally makes sense to busy poll under high networking load. By high networking load I mean roughly > 7 Mpps RX+TX per one x86 CPU core.

I partially agree it may be hard to mix DPDK and non-DPDK workloads on a single CPU, not only because the dataplane application then requires advanced power management logic, but also due to LLC thrashing. It heavily depends on the use case and dataset sizes; for example, an optimised FIB may fit nicely into cache and use only a tiny, hot part of the dataset, but a CGNAT Mflow mapping likely won't fit. For such a use case I would recommend a dedicated CPU or cache partitioning (CAT), if available.

In case of low-volume traffic like 20-40G of IMIX, one can dedicate e.g. 2 cores and interleave busy polling with halt instructions to lower the usage significantly (~60-80% core underutilisation).

--
Pawel Malachowski
@pawmal80
I forgot to point out that on Friday 26th, I'll share the results collected through a link or a series of screenshots.

Cheers,
Etienne

On Mon, Feb 22, 2021 at 2:15 PM Pawel Malachowski <pawmal-nanog@freebsd.lublin.pl> wrote:
DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16-core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.

I use that technique, plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.

I have an 8-core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.

~Jared

On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn't, per the survey, "keep utilization at 100% regardless of packet activity." Adaptive polling changes in DPDK optimize for tradeoffs between power consumption, latency/jitter, and drops during throughput ramp-up periods. Ideally your DPDK implementation has an algorithm that tries to automatically optimize based on current traffic patterns.

In DANOS, refer to the "system default dataplane power-profile" config command tree for adaptive polling settings. Interface RX/TX affinity is configured on a per-interface basis under the "interfaces dataplane" config command tree.

-robert
Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn't, per the survey, "keep utilization at 100% regardless of packet activity."
Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour. Let me know what you think; otherwise, I'm pretty confident that DPDK does:
"keep utilization at 100% regardless of packet activity."
Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) - not a detailed breakdown of the modes of operation of DPDK or DANOS. DPDK has been designed for fast I/O that's unencumbered by the trappings of general-purpose OSes, and that's the impression that needs to be forefront. Power control, as well as any other dimensions of modulation, are detailed modes of operation that are well beyond the scope of a bare-bones 2-question survey intended to get an impression of how widespread DPDK's core operating inefficiency is.

Cheers,
Etienne

On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert@gdk.org> wrote:
Sorry, last line should have been:

"intended to get an impression of how widespread ***knowledge of*** DPDK's core operating inefficiency is",

not:

"intended to get an impression of how widespread DPDK's core operating inefficiency is"

On Tue, Feb 23, 2021 at 8:22 AM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption
by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.”
Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour.
Let me know what you think; otherwise, I'm pretty confident that DPDK does:
"keep utilization at 100% regardless of packet activity.”
Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) - not a detailed breakdown of modes of operation of DPDK or DANOS. DPDK has been designed for fast I/O that's unencumbered by the trappings of general-purpose OSes, and that's the impression that needs to be forefront. Power control, as well as any other dimensions of modulation, are detailed modes of operation that are well beyond the scope of a bare-bones 2-question survey intended to get an impression of how widespread DPDK's core operating inefficiency is.
Cheers,
Etienne
On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert@gdk.org> wrote:
Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn't, per the survey, "keep utilization at 100% regardless of packet activity." Adaptive polling changes in DPDK optimize for tradeoffs between power consumption, latency/jitter and drops during throughput ramp-up periods. Ideally your DPDK implementation has an algorithm that tries to automatically optimize based on current traffic patterns.
In DANOS refer to the “system default dataplane power-profile” config command tree for adaptive polling settings. Interface RX/TX affinity is configured on a per interface basis under the “interfaces dataplane” config command tree.
-robert
On Feb 22, 2021, at 11:46 AM, Jared Geiger <jared@compuwizz.net> wrote:
DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put in a 16-core host to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.
I use that technique, plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.
I have an 8-core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.
~Jared
Hi Etienne,

Your statement that DPDK "keeps utilization at 100% regardless of packet activity" is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency" without any data to back up the operating inefficiency assertion. Your statements, taken at face value, lead people to believe that if a project uses DPDK it's going to increase their power costs. And that's just not the case. Please don't mislead the community into believing that DPDK == power bad.

Everything following is informational. Stop here if so inclined.

DPDK does not dictate CPU utilization or power consumption; the application leveraging DPDK does. It's the application that decides how to poll packets. If an application implements DPDK using only a tight polling loop, then it will keep the CPU cores that are running DPDK threads at 100%. But only the most simple and/or bespoke (think trading) applications are implemented this way. You don't need tight polling all the time to get the performance gains provided by DPDK or similar environments. The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don't consume 100% of the CPU all the time. An interesting, helpful research effort you could lead would be to survey the ecosystem, catalog those applications that do fall into the power-hungry category, and help them to change their code.

Intel's DPDK application development guidelines don't pre-suppose tight polling all the time and offer at least two methods for optimizing power against throughput. The older method is adaptive polling: increasing the polling frequency as traffic load increases. This keeps CPU utilization low when packet load is light and increases it as traffic levels warrant. The second method is to use P-states and/or C-states to put the processor into lower-power modes when traffic loads are lighter. We have found that adaptive polling works better across a larger pool of hardware types, and therefore that is what DANOS uses, amongst other things.

Further, performance and power consumption are dictated by a multivariate set of application decisions, including: design patterns such as single-thread run-to-completion models vs. passing mbufs between multiple threads, buffer sizes and cache management algorithms, combining and/or separating tx/rx threads, binding threads to specific lcores, reserving cores for DPDK threads, hyperthreading, kernel schedulers, hypervisor schedulers, interface drivers, etc. All of these are application specific, not DPDK generic. Well-written applications that leverage DPDK provide knobs for the user to tune these settings for their specific environment and use case. None of this is unique to DPDK; the solution designs were cribbed from previous technologies.

The takeaway is that DPDK (and similar) doesn't guarantee runaway power bills. Power consumption is dictated by the application. Look for well-behaved applications and everything will be alright.

If you have questions, I'd be happy to discuss off line.

Thanks,
Robert.
Hello Robert,

Your statement that DPDK "keeps utilization at 100% regardless of packet activity" is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency" without any data to back up the operating inefficiency assertion.

This statement is incorrect. I have provided references (please see earlier e-mails) that investigate the operation of DPDK. These references are items of peer-reviewed research that investigate a perceived problem with deployment of DPDK. If the power consumption incurred while running DPDK were a corner case, then there would be little to no research value in investigating such behavior.

Please don't mislead the community into believing that DPDK == power bad.

I have to object to this statement. It does seem to imply malice or, at best, amateurish behaviour, whether you intended it or not.

Everything following is informational. Stop here if so inclined.

Please stop delving into the detail of DPDK's facilities without regard for your logical omission: that whether the facilities are available or not, DPDK's deployment profile (meaning: how it's being used in general), as indicated by the references I've provided, is leading to high power inefficiency on cores partitioned to the data plane.

The takeaway is that DPDK (and similar) doesn't guarantee runaway power bills.

Of course it doesn't. Even the second question of that bare-bones survey tried to communicate this much.

If you have questions, I'd be happy to discuss off line.

I would be happy to answer your objections in detail off line too. Just let me know.

Cheers,
Etienne
The statement used on the survey, "Are you aware that use of DPDK on a processor core keeps utilization at 100% regardless of packet activity?", can be easily distorted and badly used. I sincerely do not agree with the approach of presuming and declaring "DPDK spends too much power", mainly because I lived through some migrations where dedicated hardware was replaced by the consolidation of servers using NICs with DPDK, and what justified the CAPEX of these replacements was the energy savings (in addition to the capacity expansion).

I agree with Pawell when he says that it is the application that defines power consumption, and that it can be improved or not by the skills of the developer. But, for example, you won't see a stateful NAT scenario that does not do conntrack lookups, and those lookups spend CPU cycles and consequently energy.

My suggestion to clear things up is to do an honest comparison between DPDK implementations and fully hardware-based implementations, considering CAPEX and OPEX (energy and others). For that, it would be necessary to have information on both possibilities.

A) Simulate (with a synthetic traffic generator) several scenarios using DPDK, and measure all the relevant data (traffic, latency, CPU time spent, power consumption, etc.). Among those scenarios (not an exhaustive list):
A.1) Soft-based router scenario (stateless)
A.2) Filtering packets (stateless)
A.3) Filtering packets (stateful)
A.4) CGNAT (obviously stateful)

In each of those scenarios, test (when applicable) the following possibilities.

Statistics (considering that obtaining accurate statistics implies more talking between northbridge and southbridge):
i) Capturing packet statistics (input, drop, forwarding)
j) Without capturing packet statistics (input, drop, forwarding)

Poll-mode drivers vs. low-power idle (LPI) against traffic demand:
y) Stateless instructions to DPDK, without collecting statistics, demand near-zero interactions between CPU and NIC, so LPI is applicable.
z) Even stateful scenarios, with accurate measuring, demand fewer interactions between CPU and NIC when traffic is low (automatic redefinition of the period of polling cycles).

B) Obtain the same kind of information for other solutions available in the market:
B.1) Hardware-based routers (Cisco, Juniper, Huawei, Nokia)
B.2) Stateless filtering network devices (e.g., L3 switches and ACLs)
B.3) Stateful filtering network devices (e.g., firewalls)
B.4) CGNAT solutions (A10, F5, Hillstone, Juniper, Cisco, etc.)
-- Douglas Fernando Fischer, Control and Automation Engineer
To the nanog community, I’m sorry to have dragged this conversation out further. I'm only responding to this because there are a significant number of open source projects and commercial products that use DPDK, or similar userspace network environment in their implementations. The statements in this thread incorrectly cast them, because they use DPDK, as inefficient. But the reality is they have all been designed from day one not to unnecessarily consume power. Please ask your open source dev and/or vendor of choice to verify. But please don’t rely on the information in this thread to make decisions about what you deploy in your network.
On Feb 23, 2021, at 11:44 PM, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Hello Robert,
Your statement that DPDK "keeps utilization at 100% regardless of packet activity" is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency" without any data to back up the operating inefficiency assertion.
This statement is incorrect. I have provided references (please see earlier e-mails) that investigate the operation of DPDK. These references are items of peer-reviewed research that investigate a perceived problem with deployment of DPDK. If the power consumption incurred while running DPDK were a corner case, then there would be little to no research value in investigating such behavior.
Your references don’t take into account the code that this community would actually deploy; open source implementations like DANOS, FD.io, or OVS. They don’t audit any commercial products that implement userspace stacks. None of your references say that DPDK is inherently inefficient. The closest they come is to say that tight polling is inefficient. But tight polling, even in the earliest days of DPDK, was never meant to be a design pattern that was actually deployed into production. I was there for those early conversations.
Please don’t mislead the community into believing that DPDK == power bad

I have to object to this statement. It does seem to imply malice, or, at best, amateurish behaviour, whether you intended it or not.
Object all you want. You are misleading people with your comments. And in the process you are denigrating a large swath of OSS projects and commercial products that use DPDK. Your survey questions are leading and provide a false dichotomy. And when you post the results here, they will be archived forever to continue to spread misinformation, unfortunately.
Everything following is informational. Stop here if so inclined.

Please stop delving into the detail of DPDK's facilities without regard for your logical omission: that whether the facilities are available or not, DPDK's deployment profile (meaning: how it's being used in general), as indicated by the references I've provided, is leading to high power inefficiency on cores partitioned to the data plane.
I’ve been writing network appliance code for over 20 years. I designed network architectures for years before that. I have tens of thousands of DPDK-based appliances in production at this moment across multiple different use cases. I work with companies that have hundreds of thousands of units in production that leverage userspace runtimes. I do think I understand DPDK’s deployment profile better than you. That’s what I have been trying to tell you. People don’t write inefficient DPDK code to put into production. We’re not dumb. We’ve been thinking about power consumption from day one. DPDK was never supposed to be just a tight poll loop. You were always supposed to put in the very minimal extra work to modulate power consumption.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power bills.

Of course it doesn't. Even the second question of that bare-bones survey tried to communicate this much.
If you have questions, I’d be happy to discuss off line.

I would be happy to answer your objections in detail off line too. Just let me know.
Unfortunately, you don’t seem to be receptive to the numerous people contradicting your assertions. So I’m out. I’ll let my comments stand here.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 12:12 AM Robert Bays <robert@gdk.org> wrote:

Hi Etienne,
Your statement that DPDK “keeps utilization at 100% regardless of packet activity” is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency” without any data to back up the operating inefficiency assertion. Your statements, taken at face value, lead people to believe that if a project uses DPDK it’s going to increase their power costs. And that’s just not the case. Please don’t mislead the community into believing that DPDK == power bad.
Everything following is informational. Stop here if so inclined.
DPDK does not dictate CPU utilization or power consumption, the application leveraging DPDK does. It’s the application that decides how to poll packets. If an application implements DPDK using only a tight polling loop, then it will keep CPU cores that are running DPDK threads at 100%. But only the most simple and/or bespoke (think trading) applications are implemented this way. You don’t need tight polling all the time to get the performance gains provided by DPDK or similar environments. The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time. An interesting, helpful research effort you could lead would be to survey the ecosystem to catalog those applications that do fall into the power hungry category and help them to change their code.
Intel DPDK application development guidelines don’t pre-suppose tight polling all the time, and they offer at least two methods for optimizing power against throughput. The older method is adaptive polling: increasing the polling frequency as traffic load increases. This keeps CPU utilization low when packet load is light and increases it as traffic levels warrant. The second method is to use P-states and/or C-states to put the processor into lower power modes when traffic loads are lighter. We have found that adaptive polling works better across a larger pool of hardware types, and therefore that is what DANOS uses, amongst other things.
Further, performance and power consumption are dictated by a multivariate set of application decisions, including: design patterns such as single-threaded run-to-completion models vs. passing mbufs between multiple threads, buffer sizes and cache management algorithms, combining and/or separating tx/rx threads, binding threads to specific lcores, reserving cores for DPDK threads, hyperthreading, kernel schedulers, hypervisor schedulers, interface drivers, etc. All of these are application specific, not DPDK generic. Well-written applications that leverage DPDK provide knobs for the user to tune these settings for their specific environment and use case. None of this is unique to DPDK. Solution designs were cribbed from previous technologies.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power bills. Power consumption is dictated by the application. Look for well behaved applications and everything will be alright.
If you have questions, I’d be happy to discuss off line.
Thanks, Robert.
On Feb 22, 2021, at 11:27 PM, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Sorry, last line should have been: "intended to get an impression of how widespread ***knowledge of*** DPDK's core operating inefficiency is", not: "intended to get an impression of how widespread DPDK's core operating inefficiency is"
On Tue, Feb 23, 2021 at 8:22 AM Etienne-Victor Depasquale <edepa@ieee.org> wrote:

Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.”

Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour.
Let me know what you think; otherwise, I'm pretty confident that DPDK does: "keep utilization at 100% regardless of packet activity.”
Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) - not a detailed breakdown of modes of operation of DPDK or DANOS. DPDK has been designed for fast I/O that's unencumbered by the trappings of general-purpose OSes, and that's the impression that needs to be at the forefront. Power control, as well as any other dimensions of modulation, are detailed modes of operation that are well beyond the scope of a bare-bones 2-question survey intended to get an impression of how widespread DPDK's core operating inefficiency is.
Cheers,
Etienne
On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert@gdk.org> wrote:

Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.” Adaptive polling changes in DPDK optimize for tradeoffs between power consumption, latency/jitter and drops during throughput ramp-up periods. Ideally your DPDK implementation has an algorithm that tries to automatically optimize based on current traffic patterns.
In DANOS refer to the “system default dataplane power-profile” config command tree for adaptive polling settings. Interface RX/TX affinity is configured on a per interface basis under the “interfaces dataplane” config command tree.
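[Editor's note] For orientation, the two command trees named above sit roughly as follows. Only the paths "system default dataplane power-profile" and "interfaces dataplane" come from this message; the interface name and the angle-bracketed leaves below are placeholders, so consult your DANOS release's CLI completion and help output for the actual leaf names and values:

```
# Illustrative sketch only; leaf names and values are placeholders.
configure
set system default dataplane power-profile <adaptive-polling-leaf> <value>
set interfaces dataplane dp0p1s1 <rx-tx-affinity-leaf> <cpu-list>
commit
```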
-robert
On Feb 22, 2021, at 11:46 AM, Jared Geiger <jared@compuwizz.net> wrote:
DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16 core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.
I use that technique, plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.
I have an 8-core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.
~Jared
On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:

Hello folks,
I've just followed a thread regarding use of CGNAT and noted a suggestion (regarding DANOS) that includes use of DPDK.
As I'm interested in the breadth of adoption of DPDK, and as I'm a researcher into energy and power efficiency, I'd love to hear your feedback on your use of power consumption control by DPDK.
I've drawn up a bare-bones, 2-question survey at this link:
https://www.surveymonkey.com/r/J886DPY.
Responses have been set to anonymous.
Cheers,
Etienne
-- Ing. Etienne-Victor Depasquale Assistant Lecturer Department of Communications & Computer Engineering Faculty of Information & Communication Technology University of Malta Web. https://www.um.edu.mt/profile/etiennedepasquale
I think I need to calm this thread down. I'm a researcher, and my interest is in the truth, not in my opinion. I've read some facts in this thread that are necessary as a prerequisite to the publication of the results on Friday. I do want to ensure that no future reader is misinformed, and I will do my best, with the help of contributions from my peers in this good community, to summarize all objections to this survey's questions in the same message as that which publishes the results.

All peace and good wishes,

Etienne

On Wed, Feb 24, 2021 at 4:35 PM Robert Bays <robert@gdk.org> wrote:
Just a quick note to say that I've closed the survey. I haven't published the results yet: as I said, I would first write the notes necessary as a preamble to correctly inform potential readers, and these notes are taking longer to write than I have time available.

Cheers,

Etienne

On Wed, Feb 24, 2021 at 7:07 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
I think I need to calm this thread down.
I'm a researcher, and my interest is in the truth, not in my opinion.
I've read some facts in this thread that are necessary as a prerequisite to the publication of the results on Friday.
I do want to ensure that no future reader is misinformed and will do my best, with the help of contribution from my peers in this good community, to summarize all objections to this survey's questions, in the same message as that which publishes the result.
All peace and good wishes,
Etienne
On Wed, Feb 24, 2021 at 4:35 PM Robert Bays <robert@gdk.org> wrote:
To the nanog community, I’m sorry to have dragged this conversation out further. I'm only responding to this because there are a significant number of open source projects and commercial products that use DPDK, or similar userspace network environment in their implementations. The statements in this thread incorrectly cast them, because they use DPDK, as inefficient. But the reality is they have all been designed from day one not to unnecessarily consume power. Please ask your open source dev and/or vendor of choice to verify. But please don’t rely on the information in this thread to make decisions about what you deploy in your network.
On Feb 23, 2021, at 11:44 PM, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Hello Robert,
Your statement that DPDK “keeps utilization at 100% regardless of packet
activity” is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency” without any data to backup the operating inefficacy assertion.
This statement is incorrect. I have provided references (please see earlier e-mails) that investigate the operation of DPDK. These references are items of peer-reviewed research that investigate a perceived problem with deployment of DPDK. If the power consumption incurred while running DPDK were a corner case, then there would be little to no research value in investigating such behavior.
Your references don’t take into account the code that this community would actually deploy; open source implementations like DANOS, FD.io, or OVS. They don’t audit any commercial products that implement userspace stacks. None of your references say that DPDK is inherently inefficient. The closest they come is to say that tight polling is inefficient. But tight polling, even in the earliest days of DPDK, was never meant to be a design pattern that was actually deployed into production. I was there for those early conversations.
Please don’t mislead the community into believing that DPDK == power bad
I have to object to this statement. It does seem to imply malice, or, at best, amateurish behaviour, whether you intended it or not.
Object all you want. You are misleading people with your comments. And in the process you are denigrating a large swath of OSS projects and commercial products that use DPDK. Your survey questions are leading and provide a false dichotomy. And when you post the results here, they will be archived forever to continue to spread misinformation, unfortunately.
Everything following is informational. Stop here if so inclined.
Please stop delving into the detail of DPDK's facilities without regard for your logical omission: that whether the facilities are available or not, DPDK's deployment profile (meaning: how it's being used in general), as indicated by the references I've provided, are leading to high power inefficiency on cores partitioned to the data plane.
I’ve been writing network appliance code for over 20 years. I designed network architectures for years before that. I have 10s of thousands of DPDK based appliances in production at this moment across multiple different use cases. I work with companies that have 100s of thousands of units in production that leverage userspace runtimes. I do think I understand DPDK’s deployment profile better than you. That’s what I have been trying to tell you. People don’t write inefficient DPDK code to put into production. We’re not dumb. We’ve been thinking about power consumption from day one. DPDK was never supposed to be just a tight loop poll. You were always supposed to put in the very minimal extra work to modulate power consumption.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power
bills.
Of course it doesn't. Even the second question of that bare-bones survey tried to communicate this much.
If you have questions, I’d be happy to discuss off line
I would be happy to answer your objections in detail off line too. Just let me know.
Unfortunately, you don’t seem to be receptive to the numerous people contradicting your assertions. So I’m out. I’ll let my comments stand here.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 12:12 AM Robert Bays <robert@gdk.org> wrote:
Hi Etienne,
Your statement that DPDK “keeps utilization at 100% regardless of packet activity” is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency” without any data to backup the operating inefficacy assertion. Your statements, taken at face value, lead people to believe that if a project uses DPDK it’s going to increase their power costs. And that’s just not the case. Please don’t mislead the community into believing that DPDK == power bad.
Everything following is informational. Stop here if so inclined.
DPDK does not dictate CPU utilization or power consumption, the application leveraging DPDK does. It’s the application that decides how to poll packets. If an application implements DPDK using only a tight polling loop, then it will keep CPU cores that are running DPDK threads at 100%. But only the most simple and/or bespoke (think trading) applications are implemented this way. You don’t need tight polling all the time to get the performance gains provided by DPDK or similar environments. The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time. An interesting, helpful research effort you could lead would be to survey the ecosystem to catalog those applications that do fall into the power hungry category and help them to change their code.
Intel DPDK application development guidelines don’t pre-suppose tight polling all the time and offer at least two methods for optimizing power against throughput. The older method is to use adaptive polling; increasing the polling frequency as traffic load increases. This keeps cpu utilization low when packet load is light and increases it as traffic levels warrant. The second method is to use P-states and/or C-states to put the processor into lower power modes when traffic loads are lighter. We have found that adaptive polling works better across a larger pool of hardware types, and therefore that is what DANOS uses, amongst other things.
Further, performance and power consumption are dictated by a multivariate set of application decisions including: design patterns such as single thread run to completion models vs. passing mbufs between multiple threads, buffer sizes and cache management algorithms, combining and/or separating tx/rx threads, binding threads to specific lcores, reserved cores for DPDK threads, hyperthreading, kernel schedulers, hypervisor schedulers, interface drivers, etc. All of these are application specific, not DPDK generic. Well written applications that leverage DPDK provide knobs for the user to tune these settings for their specific environment and use case. None of this unique to DPDK. Solution designs were cribbed from previous technologies.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power bills. Power consumption is dictated by the application. Look for well behaved applications and everything will be alright.
If you have questions, I’d be happy to discuss off line.
Thanks, Robert.
On Feb 22, 2021, at 11:27 PM, Etienne-Victor Depasquale < edepa@ieee.org> wrote:
Sorry, last line should have been: "intended to get an impression of how widespread ***knowledge of*** DPDK's core operating inefficiency is", not: "intended to get an impression of how widespread DPDK's core operating inefficiency is"
On Tue, Feb 23, 2021 at 8:22 AM Etienne-Victor Depasquale < edepa@ieee.org> wrote: Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.” Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour.
Let me know what you think; otherwise, I'm pretty confident that DPDK does: "keep utilization at 100% regardless of packet activity.”
Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) - not a detailed breakdown of the modes of operation of DPDK or DANOS. DPDK has been designed for fast I/O that's unencumbered by the trappings of general-purpose OSes, and that's the impression that needs to be at the forefront. Power control, like any other dimension of modulation, is a detailed mode of operation that is well beyond the scope of a bare-bones 2-question survey intended to get an impression of how widespread DPDK's core operating inefficiency is.
Cheers,
Etienne
On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert@gdk.org> wrote: Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.” Adaptive polling changes in DPDK optimize for tradeoffs between power consumption, latency/jitter and drops during throughput ramp up periods. Ideally your DPDK implementation has an algorithm that tries to automatically optimize based on current traffic patterns.
In DANOS refer to the “system default dataplane power-profile” config command tree for adaptive polling settings. Interface RX/TX affinity is configured on a per interface basis under the “interfaces dataplane” config command tree.
-robert
On Feb 22, 2021, at 11:46 AM, Jared Geiger <jared@compuwizz.net> wrote:
DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16-core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.
I use that technique, plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.
I have an 8-core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.
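For concreteness, a dataplane/control-plane split like the one described above typically reaches a DPDK application as a coremask or core list. The sketch below shows only the mask arithmetic; the specific 4/4 core assignment is my own illustrative guess, not taken from the message.

```python
def coremask(cores):
    """Hex bitmask for a set of core IDs (bit N set = core N in use)."""
    mask = 0
    for c in cores:
        mask |= 1 << c
    return hex(mask)

# Hypothetical layout for an 8-core VM: cores 0-3 for the control plane,
# cores 4-7 reserved for dataplane worker threads.
print(coremask(range(4, 8)))   # -> 0xf0
print(coremask([1, 2, 3, 4]))  # -> 0x1e
```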
~Jared
On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale < edepa@ieee.org> wrote: Hello folks,
I've just followed a thread regarding use of CGNAT and noted a suggestion (regarding DANOS) that includes use of DPDK.
As I'm interested in the breadth of adoption of DPDK, and as I'm a researcher into energy and power efficiency, I'd love to hear your feedback on your use of power consumption control by DPDK.
I've drawn up a bare-bones, 2-question survey at this link:
https://www.surveymonkey.com/r/J886DPY.
Responses have been set to anonymous.
Cheers,
Etienne
-- Ing. Etienne-Victor Depasquale Assistant Lecturer Department of Communications & Computer Engineering Faculty of Information & Communication Technology University of Malta Web. https://www.um.edu.mt/profile/etiennedepasquale
*TL;DR - DPDK applications embody the phrase caveat emptor.*

As Robert Bays put it: "Please ask your open source dev and/or vendor of choice to verify." On the other hand, I do not recommend taking the following (citing Robert Bays again) for granted: "But the reality is [open source projects and commercial products] have all been designed from day one not to unnecessarily consume power."

This note is presented in two sections. Section 1 presents the preamble necessary to avoid misinformation. Section 2 presents the survey results. If so inclined, please read on.

*SECTION 1*

There are three issues at stake:

1. the ground truth about the power/energy efficiency of (current) deployments that use DPDK,
2. my choice of words for the first question, as this constitutes the claimed source of misinformation, and
3. apportionment of responsibility for the attained level of power/energy efficiency of a deployment that uses DPDK.

*Issue #1: ground truth on current deployments*

I base this on (a) research papers and (b) Pawel Malachowski's data. Numbered references are listed at the end of this e-mail.

[1] investigates software data planes, including OvS-DPDK. Citing directly: "DPDK-OVS always works with high power consumption even when [there is] no traffic to handle. Considering the inefficiency [][in] power, DPDK provides power management APIs to compromise between power consumption and performance." "For DPDK-OVS, due to the feature of DPDK’s Polling Mode Driver (PMD), once the first DPDK port is added to vswitchd process, it creates a polling thread and polls DPDK device in continuous loop. Therefore CPU utilization for that thread is always 100%, and the power consumption r[]ises to about 138 Watt"

[2] investigates multimedia content delivery and benchmarks *DPDK-OvS* in the process. Citing directly: "Even when no traffic was in transit, OvS-DPDK consumed approximately three times more energy than the other two data planes, adding 250 percent energy overhead (15.57 W) on top of the host OS."

[3] proposes the use of ACPI P-states and the halt instruction to control power consumption, in the context of *a bespoke application*. Citing directly: "For example, a Xeon(R) E5-2620 v3 dual socket CPU consumes about 22W of power when it is idle; but if a DPDK-based software router runs on it, the CPU power soars to 83W even when no packets arrive. That is a power gap of more than 60W."

[4] investigates the energy-efficient use of *Pktgen-DPDK*. Citing directly: "We find that high performance comes at the cost of high energy consumption."

Pawel Malachowski shows a list of cores (13 out of 16) in use by a DPDK application ("DPDK-based 100G DDoS scrubber currently lifting some low traffic using cores 1-13 on 16 core host. It uses naive DPDK::rte_pause() throttling to enter C1"). The list shows the cores spending most of their time in C1. This means that the cores are in a low-power idle state and therefore not in the active (C0) state. This is an example of a power-aware DPDK application.

*Issue #2: my choice of words, as a source of misinformation*

Issue has been taken with the text of question 1. I addressed this to the NANOG community, who are busy and knowledgeable. I chose, *with hindsight wrongly*, to paraphrase, with the expectation that a reader would interpret correctly. A better expression, that would still have been terse, would be: "Are you aware that *naïve* use of DPDK on a processor core keeps utilization at 100% regardless of packet activity?"

*Issue #3: apportionment of responsibility for the attained level of power/energy efficiency of a deployment that uses DPDK*

Pawel Malachowski states that "It consumes 100% only if you busy poll (which is the default approach)." Since it is the application that exploits the DPDK API, and since the DPDK API promotes run-to-completion (https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html), *it is the application that determines power consumption*, but it is DPDK's poll-mode driver *that poses a real threat to power efficiency if used in "the default approach"*.

Robert Bays states: "The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time." *Would this audience (an audience of network operators) truly not be interested in using OvS-DPDK?* *Caveat emptor.*

*SECTION 2: Survey results*

*Q1* [image: image.png]

*Q2* [image: image.png]

[1] Z. Xu, F. Liu, T. Wang, and H. Xu, “Demystifying the energy efficiency of Network Function Virtualization,” in 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS), Jun. 2016, pp. 1–10. DOI: 10.1109/IWQoS.2016.7590429.

[2] S. Fu, J. Liu, and W. Zhu, “Multimedia Content Delivery with Network Function Virtualization: The Energy Perspective,” IEEE MultiMedia, vol. 24, no. 3, pp. 38–47, 2017, ISSN: 1941-0166. DOI: 10.1109/MMUL.2017.3051514.

[3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, “Towards Power Efficient High Performance Packet I/O,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp. 981–996, April 2020, ISSN: 1558-2183. DOI: 10.1109/TPDS.2019.2957746.

[4] G. Li, D. Zhang, Y. Li, and K. Li, “Toward energy efficiency optimization of pktgen-DPDK for green network testbeds,” China Communications, vol. 15, no. 11, pp. 199–207, November 2018, ISSN: 1673-5447. DOI: 10.1109/CC.2018.8543100.

On Sat, Feb 27, 2021 at 5:11 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
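As an aside on reading C-state residency data like Pawel's: on Linux, the cumulative time each core spends in each idle state is exposed under /sys/devices/system/cpu/cpu<N>/cpuidle/state<X>/time, and residency shares fall out of two snapshots taken a known interval apart. A small sketch with made-up numbers (not measurements from the thread):

```python
def residency_share(before, after, wall_us):
    """Fraction of a sampling interval spent in each idle state.
    before/after: {state: cumulative microseconds asleep}, e.g. as read
    twice from /sys/devices/system/cpu/cpu<N>/cpuidle/state<X>/time.
    Whatever is left over was spent busy (C0)."""
    shares = {s: (after[s] - before[s]) / wall_us for s in after}
    shares["C0"] = round(1.0 - sum(shares.values()), 6)
    return shares

# Illustrative numbers only: over a 1-second window this core accumulated
# 950 ms of C1 residency - the signature of a pause-throttled, mostly idle
# core rather than a busy-polling one.
snap0 = {"C1": 1_000_000}
snap1 = {"C1": 1_950_000}
print(residency_share(snap0, snap1, 1_000_000))  # -> {'C1': 0.95, 'C0': 0.05}
```

A core pinned by a naive busy-poll loop would instead show near-zero idle residency, i.e. ~100% C0.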
Just a quick note to say that I've closed the survey.
I haven't published the results yet: I said that I would write the notes necessary as a preamble to correctly inform potential readers, and those notes are taking longer to write than the time I have available.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 7:07 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
I think I need to calm this thread down.
I'm a researcher, and my interest is in the truth, not in my opinion.
I've read some facts in this thread that are necessary as a prerequisite to the publication of the results on Friday.
I do want to ensure that no future reader is misinformed and will do my best, with the help of contribution from my peers in this good community, to summarize all objections to this survey's questions, in the same message as that which publishes the result.
All peace and good wishes,
Etienne
On Wed, Feb 24, 2021 at 4:35 PM Robert Bays <robert@gdk.org> wrote:
To the nanog community, I’m sorry to have dragged this conversation out further. I'm only responding to this because there are a significant number of open source projects and commercial products that use DPDK, or a similar userspace networking environment, in their implementations. The statements in this thread incorrectly cast them, because they use DPDK, as inefficient. But the reality is they have all been designed from day one not to unnecessarily consume power. Please ask your open source dev and/or vendor of choice to verify. But please don’t rely on the information in this thread to make decisions about what you deploy in your network.
On Feb 23, 2021, at 11:44 PM, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Hello Robert,
Your statement that DPDK “keeps utilization at 100% regardless of packet activity” is just not correct. You further presuppose "widespread DPDK's core operating inefficiency” without any data to back up the operating inefficiency assertion.
This statement is incorrect. I have provided references (please see earlier e-mails) that investigate the operation of DPDK. These references are items of peer-reviewed research that investigate a perceived problem with deployment of DPDK. If the power consumption incurred while running DPDK were a corner case, then there would be little to no research value in investigating such behavior.
Your references don’t take into account the code that this community would actually deploy; open source implementations like DANOS, FD.io, or OVS. They don’t audit any commercial products that implement userspace stacks. None of your references say that DPDK is inherently inefficient. The closest they come is to say that tight polling is inefficient. But tight polling, even in the earliest days of DPDK, was never meant to be a design pattern that was actually deployed into production. I was there for those early conversations.
Please don’t mislead the community into believing that DPDK == power bad
I have to object to this statement. It does seem to imply malice, or, at best, amateurish behaviour, whether you intended it or not.
Object all you want. You are misleading people with your comments. And in the process you are denigrating a large swath of OSS projects and commercial products that use DPDK. Your survey questions are leading and provide a false dichotomy. And when you post the results here, they will be archived forever to continue to spread misinformation, unfortunately.
Everything following is informational. Stop here if so inclined.
Please stop delving into the detail of DPDK's facilities without regard for your logical omission: that whether the facilities are available or not, DPDK's deployment profile (meaning: how it's being used in general), as indicated by the references I've provided, is leading to high power inefficiency on cores partitioned to the data plane.
I’ve been writing network appliance code for over 20 years. I designed network architectures for years before that. I have tens of thousands of DPDK-based appliances in production at this moment across multiple different use cases. I work with companies that have hundreds of thousands of units in production that leverage userspace runtimes. I do think I understand DPDK’s deployment profile better than you. That’s what I have been trying to tell you. People don’t write inefficient DPDK code to put into production. We’re not dumb. We’ve been thinking about power consumption from day one. DPDK was never supposed to be just a tight loop poll. You were always supposed to put in the very minimal extra work to modulate power consumption.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power bills.
Of course it doesn't. Even the second question of that bare-bones survey tried to communicate this much.
If you have questions, I’d be happy to discuss off line
I would be happy to answer your objections in detail off line too. Just let me know.
Unfortunately, you don’t seem to be receptive to the numerous people contradicting your assertions. So I’m out. I’ll let my comments stand here.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 12:12 AM Robert Bays <robert@gdk.org> wrote:
Hi Etienne,
Your statement that DPDK “keeps utilization at 100% regardless of packet activity” is just not correct. You further presuppose "widespread DPDK's core operating inefficiency” without any data to back up the operating inefficiency assertion. Your statements, taken at face value, lead people to believe that if a project uses DPDK it’s going to increase their power costs. And that’s just not the case. Please don’t mislead the community into believing that DPDK == power bad.
Everything following is informational. Stop here if so inclined.
DPDK does not dictate CPU utilization or power consumption, the application leveraging DPDK does. It’s the application that decides how to poll packets. If an application implements DPDK using only a tight polling loop, then it will keep CPU cores that are running DPDK threads at 100%. But only the most simple and/or bespoke (think trading) applications are implemented this way. You don’t need tight polling all the time to get the performance gains provided by DPDK or similar environments. The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time. An interesting, helpful research effort you could lead would be to survey the ecosystem to catalog those applications that do fall into the power hungry category and help them to change their code.
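The distinction drawn above between tight polling and throttled polling is easy to demonstrate in miniature. The toy below is not DPDK; it polls a stub "always empty" queue two ways for the same wall-clock window and counts iterations. The tight loop's enormous iteration count is the "100% utilization" effect in microcosm, while the throttled loop yields the core between empty polls.

```python
import time

def poll_empty():
    # Stand-in for an RX burst call that finds no packets.
    return 0

def run(window_s, sleep_s):
    """Poll an empty queue for window_s seconds; sleep_s=0 is a tight loop."""
    iters = 0
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        poll_empty()
        if sleep_s:
            time.sleep(sleep_s)  # yield the core between empty polls
        iters += 1
    return iters

tight = run(0.05, 0)          # busy-poll: the core never rests
throttled = run(0.05, 0.001)  # ~1 ms nap after each empty poll
print(tight, throttled)       # tight is orders of magnitude larger
```

With no traffic at all, both loops do the same useful work (none); only the tight one burns a full core doing it.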
A great deal of this discussion could be resolved by the use of a $20 in-line 120VAC watt meter [1] plugged into something as simple as a $500 1U server with some of the DPDK-enabled network cards connected to its PCI-E bus, running DANOS. Characterizing the idle load, average usage load, and absolute maximum wattage load of an x86-64 platform is not excessively difficult or complicated.
[1] https://www.homedepot.com/p/Kill-A-Watt-Electricity-Monitor-P4400/202196386
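The back-of-envelope arithmetic such a meter enables is short. Taking the ~60 W no-traffic power gap quoted earlier in the thread from [3], and an assumed electricity price of $0.12/kWh (my number, purely illustrative):

```python
def annual_cost(delta_watts, usd_per_kwh, fleet=1):
    """Yearly electricity cost of an always-on power delta."""
    kwh_per_year = delta_watts * 24 * 365 / 1000   # watts -> kWh per year
    return kwh_per_year * usd_per_kwh * fleet

# ~60 W gap, assumed $0.12/kWh:
print(round(annual_cost(60, 0.12), 2))             # -> 63.07 per box per year
print(round(annual_cost(60, 0.12, fleet=10_000)))  # -> 630720 across 10k boxes
```

Per box it is pocket change; across a large fleet (or a CGNAT farm) it is real money, which is why whether an application busy-polls or throttles is worth asking about.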
*TL;DR - DPDK applications embody the phrase caveat emptor.*
As Robert Bays put it: "Please ask your open source dev and/or vendor of choice to verify." On the other hand, I do not recommend taking the following (citing Robert Bays again) for granted: "But the reality is [open source projects and commercial products] have all been designed from day one not to unnecessarily consume power."
This note is presented in two sections. Section 1 presents the preamble necessary to avoid misinformation. Section 2 presents the survey.
If so inclined, please read on.
*SECTION 1* There are three issues at stake:
1. the ground truth about the power/energy efficiency of (current) deployments that use DPDK, 2. my choice of words for the first question, as this constitutes the claimed source of misinformation, and 3. apportionment of responsibility for the attained level of power/energy efficiency of a deployment that uses DPDK,
*Issue #1: ground truth on current deployments* I base on (a) research papers and (b) Pawel Malachowski's data. Numbered references are listed at the end of this e-mail.
[1] investigates software data planes, including OvS-DPDK. Citing directly: "DPDK-OVS always works with high power consumption even when [there is] no traffic to handle. Considering the inefficiency [][in] power, DPDK provides power management APIs to compromise between power consumption and performance." "For DPDK-OVS, due to the feature of DPDK’s Polling Mode Driver (PMD), once the first DPDK port is added to vswitchd process, it creates a polling thread and polls DPDK device in continuous loop. Therefore CPU utilization for that thread is always 100%, and the power consumption r[]ises to about 138 Watt"
[2] investigates multimedia content delivery and benchmarks *DPDK-OvS* in the process. Citing directly: "Even when no traffic was in transit, OvS-DPDK consumed approximately three times more energy than the other two data planes, adding 250 percent energy overhead (15.57 W) on top of the host OS."
[3] proposes the use of ACPI P-states and the halt instruction to control power consumption, in the context of *a bespoke application*. Citing directly: "For example, a Xeon(R) E5-2620 v3 dual socket CPU consumes about 22W of power when it is idle; but if a DPDK-based software router runs on it, the CPU power soars to 83W even when no packets arrive. That is a power gap of more than 60W."
[4] investigates the energy-efficient use of *Pktgen-DPDK*. Citing directly: "We find that high performance comes at the cost of high energy consumption."
Pawel Malachowski shows a list of cores (13 out of 16) in use by a DPDK application ("DPDK-based 100G DDoS scrubber currently lifting some low traffic using cores 1-13 on 16 core host. It uses naive DPDK::rte_pause() throttling to enter C1"). The list shows the cores spending most of their time in C1. This means that cores are in a low-power-idle state and therefore not in an active (C0) state. This shows a power-aware DPDK application.
*Issue #2: my choice of words, as a source of misinformation* Issue has been taken with the text of question 1. I addressed this to the NANOG community, who are busy and knowledgeable. I chose, *with hindsight wrongly*, to paraphrase, with the expectation that a reader would interpret correctly. A better expression, that would still have been terse, would be: "Are you aware that *naïve* use of DPDK on a processor core keeps utilization at 100% regardless of packet activity?"
*Issue #3: apportionment of responsibility for the attained level of power/energy efficiency of a deployment that uses DPDK* Pawel Malachowski states that "It consumes 100% only if you busy poll (which is the default approach)."
Since it is the application that exploits the DPDK API, and since the DPDK API promotes run-to-completion ( https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html), then *it is the application that determines power consumption* but it is DPDK's poll-mode driver *that poses a real threat to power efficiency, if used in "the default approach".*
Robert Bays states: "The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time." *Would this audience (an audience of network operators) **truly not be interested in using OvS-DPDK ?* *Caveat emptor.*
*SECTION 2: Survey results* *Q1* [image: image.png] *Q2* [image: image.png]
[1] Z. Xu, F. Liu, T. Wang, and H. Xu, “Demystifying the energy efficiency of Network Function Virtualization,” in 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS), Jun. 2016, pp. 1–10. DOI: 10.1109/IWQoS.2016.7590429.
[2] S. Fu, J. Liu, and W. Zhu, “Multimedia Content Delivery with Network Function Virtualization: The Energy Perspective,” IEEE MultiMedia, vol. 24, no. 3, pp. 38–47, 2017, ISSN: 1941-0166. DOI: 10.1109/MMUL.2017.3051514.
[3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, “Towards Power Efficient High Performance Packet I/O,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp. 981–996, April 2020, ISSN:1558-2183. DOI: 10.1109/TPDS.2019.2957746.
[4] G. Li, D. Zhang, Y. Li, and K. Li, “Toward energy efficiency optimization of pktgen-DPDK for green network testbeds,” China Communications, vol. 15, no. 11, pp. 199–207, November 2018, ISSN: 1673-5447. DOI: 10.1109/CC.2018.8543100.
On Sat, Feb 27, 2021 at 5:11 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Just a quick note to say that I've closed the survey.
I haven't published the results yet as I said that I would write notes necessary as a preamble to correctly inform potential readers, and these notes are taking longer to write than I have time available.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 7:07 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
I think I need to calm this thread down.
I'm a researcher, and my interest is in the truth, not in my opinion.
I've read some facts in this thread that are necessary as a prerequisite to the publication of the results on Friday.
I do want to ensure that no future reader is misinformed and will do my best, with the help of contribution from my peers in this good community, to summarize all objections to this survey's questions, in the same message as that which publishes the result.
All peace and good wishes,
Etienne
On Wed, Feb 24, 2021 at 4:35 PM Robert Bays <robert@gdk.org> wrote:
To the nanog community, I’m sorry to have dragged this conversation out further. I'm only responding to this because there are a significant number of open source projects and commercial products that use DPDK, or similar userspace network environment in their implementations. The statements in this thread incorrectly cast them, because they use DPDK, as inefficient. But the reality is they have all been designed from day one not to unnecessarily consume power. Please ask your open source dev and/or vendor of choice to verify. But please don’t rely on the information in this thread to make decisions about what you deploy in your network.
On Feb 23, 2021, at 11:44 PM, Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Hello Robert,
Your statement that DPDK “keeps utilization at 100% regardless of
packet activity” is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency” without any data to backup the operating inefficacy assertion.
This statement is incorrect. I have provided references (please see earlier e-mails) that investigate the operation of DPDK. These references are items of peer-reviewed research that investigate a perceived problem with deployment of DPDK. If the power consumption incurred while running DPDK were a corner case, then there would be little to no research value in investigating such behavior.
Your references don’t take into account the code that this community would actually deploy; open source implementations like DANOS, FD.io, or OVS. They don’t audit any commercial products that implement userspace stacks. None of your references say that DPDK is inherently inefficient. The closest they come is to say that tight polling is inefficient. But tight polling, even in the earliest days of DPDK, was never meant to be a design pattern that was actually deployed into production. I was there for those early conversations.
Please don’t mislead the community into believing that DPDK == power bad
I have to object to this statement. It does seem to imply malice, or, at best, amateurish behaviour, whether you intended it or not.
Object all you want. You are misleading people with your comments. And in the process you are denigrating a large swath of OSS projects and commercial products that use DPDK. Your survey questions are leading and provide a false dichotomy. And when you post the results here, they will be archived forever to continue to spread misinformation, unfortunately.
Everything following is informational. Stop here if so inclined.
Please stop delving into the detail of DPDK's facilities without regard for your logical omission: that whether the facilities are available or not, DPDK's deployment profile (meaning: how it's being used in general), as indicated by the references I've provided, is leading to high power inefficiency on cores partitioned to the data plane.
I’ve been writing network appliance code for over 20 years. I designed network architectures for years before that. I have 10s of thousands of DPDK based appliances in production at this moment across multiple different use cases. I work with companies that have 100s of thousands of units in production that leverage userspace runtimes. I do think I understand DPDK’s deployment profile better than you. That’s what I have been trying to tell you. People don’t write inefficient DPDK code to put into production. We’re not dumb. We’ve been thinking about power consumption from day one. DPDK was never supposed to be just a tight loop poll. You were always supposed to put in the very minimal extra work to modulate power consumption.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power
bills.
Of course it doesn't. Even the second question of that bare-bones survey tried to communicate this much.
If you have questions, I’d be happy to discuss off line
I would be happy to answer your objections in detail off line too. Just let me know.
Unfortunately, you don’t seem to be receptive to the numerous people contradicting your assertions. So I’m out. I’ll let my comments stand here.
Cheers,
Etienne
On Wed, Feb 24, 2021 at 12:12 AM Robert Bays <robert@gdk.org> wrote:
Hi Etienne,
Your statement that DPDK “keeps utilization at 100% regardless of packet activity” is just not correct. You further pre-suppose "widespread DPDK's core operating inefficiency” without any data to back up the operating inefficiency assertion. Your statements, taken at face value, lead people to believe that if a project uses DPDK it’s going to increase their power costs. And that’s just not the case. Please don’t mislead the community into believing that DPDK == power bad.
Everything following is informational. Stop here if so inclined.
DPDK does not dictate CPU utilization or power consumption, the application leveraging DPDK does. It’s the application that decides how to poll packets. If an application implements DPDK using only a tight polling loop, then it will keep CPU cores that are running DPDK threads at 100%. But only the most simple and/or bespoke (think trading) applications are implemented this way. You don’t need tight polling all the time to get the performance gains provided by DPDK or similar environments. The vast majority of applications that this audience would actually install in their networks do not do tight polling all the time and therefore don’t consume 100% of the CPU all the time. An interesting, helpful research effort you could lead would be to survey the ecosystem to catalog those applications that do fall into the power hungry category and help them to change their code.
Intel DPDK application development guidelines don’t pre-suppose tight polling all the time and offer at least two methods for optimizing power against throughput. The older method is to use adaptive polling; increasing the polling frequency as traffic load increases. This keeps CPU utilization low when packet load is light and increases it as traffic levels warrant. The second method is to use P-states and/or C-states to put the processor into lower power modes when traffic loads are lighter. We have found that adaptive polling works better across a larger pool of hardware types, and therefore that is what DANOS uses, amongst other things.
Further, performance and power consumption are dictated by a multivariate set of application decisions including: design patterns such as single thread run to completion models vs. passing mbufs between multiple threads, buffer sizes and cache management algorithms, combining and/or separating tx/rx threads, binding threads to specific lcores, reserved cores for DPDK threads, hyperthreading, kernel schedulers, hypervisor schedulers, interface drivers, etc. All of these are application specific, not DPDK generic. Well written applications that leverage DPDK provide knobs for the user to tune these settings for their specific environment and use case. None of this is unique to DPDK. Solution designs were cribbed from previous technologies.
The takeaway is that DPDK (and similar) doesn’t guarantee runaway power bills. Power consumption is dictated by the application. Look for well behaved applications and everything will be alright.
If you have questions, I’d be happy to discuss off line.
Thanks, Robert.
On Feb 22, 2021, at 11:27 PM, Etienne-Victor Depasquale < edepa@ieee.org> wrote:
Sorry, last line should have been: "intended to get an impression of how widespread ***knowledge of*** DPDK's core operating inefficiency is", not: "intended to get an impression of how widespread DPDK's core operating inefficiency is"
On Tue, Feb 23, 2021 at 8:22 AM Etienne-Victor Depasquale <edepa@ieee.org> wrote:

> Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.”

Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour.
Let me know what you think; otherwise, I'm pretty confident that DPDK does: "keep utilization at 100% regardless of packet activity.”
Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) - not a detailed breakdown of modes of operation of DPDK or DANOS. DPDK has been designed for fast I/O that's unencumbered by the trappings of general-purpose OSes, and that's the impression that needs to be forefront. Power control, as well as any other dimensions of modulation, are detailed modes of operation that are well beyond the scope of a bare-bones 2-question survey intended to get an impression of how widespread DPDK's core operating inefficiency is.
Cheers,
Etienne
On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert@gdk.org> wrote:

Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.” Adaptive polling changes in DPDK optimize for tradeoffs between power consumption, latency/jitter and drops during throughput ramp-up periods. Ideally your DPDK implementation has an algorithm that tries to automatically optimize based on current traffic patterns.
In DANOS refer to the “system default dataplane power-profile” config command tree for adaptive polling settings. Interface RX/TX affinity is configured on a per interface basis under the “interfaces dataplane” config command tree.
-robert
On Feb 22, 2021, at 11:46 AM, Jared Geiger <jared@compuwizz.net> wrote:

> DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16 core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.
>
> I use that technique, plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.
>
> I have an 8 core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.
>
> ~Jared
-- Ing. Etienne-Victor Depasquale Assistant Lecturer Department of Communications & Computer Engineering Faculty of Information & Communication Technology University of Malta Web. https://www.um.edu.mt/profile/etiennedepasquale
On 05/03/2021 00:26, Eric Kuhnke wrote:
A great deal of this discussion could be resolved by the use of a $20 in-line 120VAC watt meter [1] plugged into something as simple as a $500 1U server with some of the DPDK-enabled network cards connected to its PCI-E bus, running DANOS.
I'm fairly sure Etienne-Victor's email made specific reference to wattage measurements in both [2] and [3]. It would be fair to assume that the authors of those (IEEE) papers understood that you could measure wattage at the wall socket, before embarking on a paper regarding power efficiency.
Characterizing the idle load, average usage load, and absolute maximum wattage load of an x86-64 platform is excessively difficult or complicated.
It really isn't, particularly when the high figure is 400% of the low figure. You don't need milliwatt precision to see that your CPU is wasting power while not actually forwarding any packets. -- Tom
That was an unfortunate typo on my part, I meant to write "isn't excessively difficult..." Some real world examples of specific models of CPU + motherboard + PCI-E NIC combinations with wattage figures at idle load, average load and maximal load would be useful for comparison purposes. On Fri, Mar 5, 2021 at 8:09 AM Tom Hill <tom@ninjabadger.net> wrote:
Sure, here goes: https://www.surveymonkey.com/results/SM-BJ9FCT6K9/ Cheers, Etienne On Fri, Mar 5, 2021 at 5:06 PM Tom Hill <tom@ninjabadger.net> wrote:
On 04/03/2021 18:20, Etienne-Victor Depasquale wrote:
*SECTION 2: Survey results*
I don't see the embedded images, and there's no way to show them inline. For the sake of simplicity/sharing, are these results presented anywhere on a web page? :)
Regards,
-- Tom
-- Ing. Etienne-Victor Depasquale Assistant Lecturer Department of Communications & Computer Engineering Faculty of Information & Communication Technology University of Malta Web. https://www.um.edu.mt/profile/etiennedepasquale
On 2021-03-05 12:22, Etienne-Victor Depasquale wrote:
Sure, here goes:
Thanks for sharing these results. We run DPDK workloads (Cisco nee Viptela vEdge Cloud) on ESXI. Fwiw, a quick survey of a few of our Dell R640s running mostly vEdge workloads shows the PS output wattage is about 60% higher than a non-vEdge workload: 420W vs 260W. PS input amperage is 2.0A@208V vs 1.4A, a 42% difference. Processor type is Xeon 6152. Stats obtained from the iDRAC lights-out management module. vEdge does not do any limiting of polling by default, and afaik the software has no support for any kind of limiting. It will poll the network driver on every core assigned to the VM for max performance, except for one core which is assigned to the control plane. I'm usually more concerned about the lack of available CPU cores. The CPU usage forces us not to oversubscribe the VM hosts, which means we must provision vEdges less densely and buy more gear sooner. Plus, the increased power demand means we can fit about 12 vEdge servers per cabinet instead of 17. (Power service is 30A 208V, maximum of 80% usage.) OTOH, I face far fewer questions about vEdge Cloud performance problems than I do on other virtual platforms.
Thanks again, -Brian
For comparison purposes, I'm curious about the difference in wattage results between:

a) Your R640 at 420W running DPDK

b) The same R640 hardware temporarily booted from an Ubuntu server live USB, in which some common CPU stress and memory/disk IO benchmarks are being run to intentionally load the system to 100% to characterize its absolute maximum AC load wattage.

https://packages.debian.org/search?keywords=stress
https://packages.debian.org/search?keywords=stress-ng

What's the delta between the 420W and the absolute maximum load the server is capable of pulling on the 208VAC side?

https://manpages.ubuntu.com/manpages/artful/man1/stress-ng.1.html

One possible factor is whether ESXI is configured to pass the PCI-E devices directly through to the guest VM, or if there is any abstraction in between. For non-ESXI stuff, in the world of Xen or KVM there are many different ways that a guest domU can access a dom0's network devices, some of which can have an impact on overall steady-state wattage consumed by the system.

If the greatest possible efficiency is desired for a number of 1U things, one thing to look at would be something similar to the Open Compute Platform's single centralized AC-to-DC power units, and servers that don't each have their own discrete 110-240VAC single or dual power supplies. In terms of cubic meters of air moved per hour vs wattage, the fans found in 1U servers are really quite inefficient. As a randomly chosen example of a 12VDC 40mm (1U server height) fan:

https://www.shoppui.com/documents/9HV0412P3K001.pdf

If you have a single 12.0VDC fan with a maximum load of 1.52A, that's a possible load of up to 18.24W for just *one* 40mm height fan. And your typical high speed dual socket 1U server may have up to eight or ten of those, in the typical front-to-back wind tunnel configuration. Normally fans won't be running at full speed, so each one won't be an 18W load; more like 10-12W per fan is totally normal. Plus at least two more fans in the two hot-swap power supplies. Under heavy load I would not be surprised at all to say that 80W to 90W of your R640's total 420W load is ventilation.

In a situation where you're running out of power before you run out of rack space, look at some 1.5U and 2U high chassis that use 60mm height fans, which are much more efficient in ratio of air moved per time period vs watts.

On Fri, Mar 5, 2021 at 12:44 PM Brian Knight via NANOG <nanog@nanog.org> wrote:
On 2021-03-05 15:40, Eric Kuhnke wrote:
For comparison purposes, I'm curious about the difference in wattage results between:
a) Your R640 at 420W running DPDK
b) The same R640 hardware temporarily booted from a Ubuntu server live USB, in which some common CPU stress and memory disk/IO benchmarks are being run to intentionally load the system to 100% to characterize its absolute maximum AC load wattage.
We've got a few more hosts waiting to be deployed that are configured almost identically. I'll see what we can do. I'm guessing those tests would pull slightly more power than the vEdge hosts, just because there's not much disk IO that happens on a networking VM. These hosts have four SSDs for local storage.
What's the delta between the 420W and absolute maximum load the server is capable of pulling on the 208VAC side?
https://manpages.ubuntu.com/manpages/artful/man1/stress-ng.1.html
Server PS maximum input wattage is 900W. Present draw of 2.0A @ 208V is ~420W, so 420/900 = 46.67%
One possible factor is whether ESXI is configured to pass the pci-e devices directly through to the guest VM, or if there is any abstraction in between. For non-ESXI stuff, in the world of Xen or KVM there's many different ways that a guest domU can access a dom0's network devices, some of which can have impact on overall steady-state wattage consumed by the system.
The 420W server has its interfaces routed through the ESXI kernel. We're moving quickly to SR-IOV on new servers.
If the greatest possible efficiency is desired for a number of 1U things, one thing to look at would be something similar to the open compute platform single centralized AC to DC power units, and servers that don't each have their own discrete 110-240VAC single or dual power supplies. In terms of cubic meters of air moved per hour vs wattage, the fans found in 1U servers are really quite inefficient. As a randomly chosen example of 12VDC 40mm (1U server height) fan:
https://www.shoppui.com/documents/9HV0412P3K001.pdf
If you have a single 12.0VDC fan with a maximum load of 1.52A, that's a possible load of up to 18.24W for just *one* 40mm height fan. And your typical high speed dual socket 1U server may have up to eight or ten of those, in the typical front-to-back wind tunnel configuration. Normally fans won't be running at full speed, so each one won't be an 18W load; more like 10-12W per fan is totally normal. Plus at least two more fans in the two hot-swap power supplies. Under heavy load I would not be surprised at all to say that 80W to 90W of your R640's total 420W load is ventilation.
Which is of course dependent on the environmentals. Fan speeds on our two servers are 25% for the 260W vs. 29% for 420W, so not much difference. Inlet temp on both is ~17C. I checked out another R640 heavily loaded with vEdge VMs, and it's pulling similar power, 415W, but the fan speed is at 45%, because inlet temp is 22C. The TDP for the Xeon 6152 is 140W, which seems middle-of-the-road. From the quick survey I did of Dell's configurator, the R640 can take CPUs up to 205W. So we have headroom in terms of cooling.
In a situation where you're running out of power before you run out of rack space, look at some 1.5U and 2U high chassis that use 60mm height fans, which are much more efficient in ratio of air moved per time period vs watts.
Or ask the colo to turn the A/C lower ;) (that moves the power problem elsewhere, I know) Thanks, -Brian
Server PS maximum input wattage is 900W. Present draw of 2.0A @ 208V is ~420W, so 420/900 = 46.67%
But in the real world an R640 would *never* draw 900W. Even if you were to load it up with the maximal CPU configuration (two 125W TDP CPUs, one per socket), a full load of 2.5" 15K spinning drives, maximum RAM, and three high-wattage low-profile PCI-E cards, while simultaneously running CPU, RAM and disk stress tests, you might get in the neighborhood of 550-600W under load.

Much the same way that a desktop PC equipped with a nominally "850W" rated active PFC 80+ Gold power supply might be powering a motherboard and CPU combo with a high CPU TDP, but total power consumption under stress tests/benchmarks would be nowhere near 850W. That rating exists to ensure that the power supply isn't running anywhere near its max capacity...

On Fri, Mar 5, 2021 at 3:33 PM Brian Knight <ml@knight-networks.com> wrote:
On Mon, Feb 22, 2021 at 11:24 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Beyond RX/TX CPU affinity, in DANOS you can further tune power consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.”
Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour. Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) -
Hi,

Since you understand that, I'm not really clear what you're asking in the survey.

DPDK doesn't inherently do much in the way of power management. The polling loops are in the application side of the software, not the DPDK libraries or NIC driver. It's up to the application author to decide to detect idleness in the polling loop and take action to reduce CPU load. If they go for a simple busy-wait, the dataplane cores run at 100% all the time regardless of packet load. This has the expected impact on the server's power consumption.

Note that DPDK applications are usually intended to run in very-high data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.

Regards,
Bill Herrin

-- William Herrin bill@herrin.us https://bill.herrin.us/
DPDK doesn't inherently do much in the way of power management.
I agree - it doesn't. That's not what it was made for.

> Note that DPDK applications are usually intended to run in very-high data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.

That's not what research shows. Use of LPI states is proposed for power management under high data rate conditions in [5], and in [6], use of the low-power instruction *halt* is investigated and found to save power under such conditions.

Cheers,
Etienne

[3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, “Towards Power Efficient High Performance Packet I/O,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp. 981–996, April 2020, ISSN: 1558-2183. DOI: 10.1109/TPDS.2019.2957746.

[5] R. Bolla, R. Bruschi, F. Davoli, and J. F. Pajo, “A Model-Based Approach Towards Real-Time Analytics in NFV Infrastructures,” IEEE Transactions on Green Communications and Networking, vol. 4, no. 2, pp. 529–541, Jun. 2020, ISSN: 2473-2400. DOI: 10.1109/TGCN.2019.2961192.

On Tue, Feb 23, 2021 at 11:04 PM William Herrin <bill@herrin.us> wrote:
Oh dear ... instead of "and in [6]", I should have written "and in [3]". On Tue, Feb 23, 2021 at 11:21 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
DPDK doesn't inherently do much in the way of power management.
I agree - it doesn't. That's not what it was made for.
Note that DPDK applications are usually intended to run in very-high
data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.
That's not what research shows.
Use of LPI states is proposed for power management under high data rate conditions in [5] and in [6], use of the low-power instruction *halt * is investigated and found to save power under such conditions.
Cheers,
Etienne
[3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, “Towards Power Efficient High Performance Packet I/O,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp. 981–996, April 2020, ISSN:1558-2183. DOI: 10.1109/TPDS.2019.2957746
[5] R. Bolla, R. Bruschi, F. Davoli, and J. F. Pajo, “A Model-Based Approach Towards Real-Time Analytics in NFV Infrastructures,” IEEE Transactions on Green Communications and Networking, vol. 4, no. 2, pp. 529–541, Jun. 2020, ISSN: 2473-2400. DOI: 10.1109/TGCN.2019.2961192.
On Tue, Feb 23, 2021 at 11:04 PM William Herrin <bill@herrin.us> wrote:
On Mon, Feb 22, 2021 at 11:24 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Beyond RX/TX CPU affinity, in DANOS you can further tune power
consumption by changing the adaptive polling rate. It doesn’t, per the survey, "keep utilization at 100% regardless of packet activity.”
Robert, you seem to be conflating DPDK with DANOS' power control algorithms that modulate DPDK's default behaviour. Keep in mind that this is a bare-bones survey intended for busy, knowledgeable people (the ones you'd find on NANOG) -
Hi,
Since you understand that, I'm not really clear what you're asking in the survey.
DPDK doesn't inherently do much in the way of power management. The polling loops are in the application side of the software, not the DPDK libraries or NIC driver. It's up to the application author to decide to detect idleness in the polling loop and take action to reduce CPU load. If they go for a simple busy-wait, the dataplane cores run at 100% all the time regardless of packet load. This has the expected impact on the server's power consumption.
Note that DPDK applications are usually intended to run in very-high data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.
Regards, Bill Herrin
-- William Herrin bill@herrin.us https://bill.herrin.us/
--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale
On Tue, Feb 23, 2021 at 2:22 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
DPDK doesn't inherently do much in the way of power management.
I agree - it doesn't. That's not what it was made for.
Note that DPDK applications are usually intended to run in very-high
data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.
That's not what research shows.
Use of LPI states is proposed for power management under high data rate conditions in [5] and in [6], use of the low-power instruction halt is investigated and found to save power under such conditions.
Howdy,

This is way too deep in the weeds of developing with the DPDK libraries for your audience here to have much in the way of useful comment. This is an operators group.

If anyone is interested, the techniques DPDK offers application authors to manage power on the dataplane cores are described here:

https://doc.dpdk.org/guides/prog_guide/power_man.html

The main thing devs do, since it's easy, is add a call to rte_pause() in any empty polling loop. IIRC, that just calls the CPU PAUSE instruction which doesn't actually pause anything but saves a little power by de-pipelining and, if hyperthreading is enabled, releasing the core to run the alternate thread.

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
This is way too deep in the weeds of developing with the DPDK libraries for your audience here to have much in the way of useful comment. This is an operators group.
Fair enough, and thank you for stepping on the brakes :)

Honestly, I didn't intend to get embroiled in this. The questions were bare-bones and relate to common use of DPDK.

Over and out. I'll post the results on Friday evening CET.

Cheers,
Etienne

On Tue, Feb 23, 2021 at 11:38 PM William Herrin <bill@herrin.us> wrote:
On Tue, Feb 23, 2021 at 2:22 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
DPDK doesn't inherently do much in the way of power management.
I agree - it doesn't. That's not what it was made for.
Note that DPDK applications are usually intended to run in very-high
data rate environments where no gains are likely to be realized by avoiding a busy-wait loop.
That's not what research shows.
Use of LPI states is proposed for power management under high data rate conditions in [5] and in [6], use of the low-power instruction halt is investigated and found to save power under such conditions.
Howdy,
This is way too deep in the weeds of developing with the DPDK libraries for your audience here to have much in the way of useful comment. This is an operators group.
If anyone is interested, the techniques DPDK offers application authors to manage power on the dataplane cores are described here:
https://doc.dpdk.org/guides/prog_guide/power_man.html
The main thing devs do, since it's easy, is add a call to rte_pause() in any empty polling loop. IIRC, that just calls the CPU PAUSE instruction which doesn't actually pause anything but saves a little power by de-pipelining and, if hyperthreading is enabled, releasing the core to run the alternate thread.
Regards, Bill Herrin
-- William Herrin bill@herrin.us https://bill.herrin.us/
--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale
Thanks Jared; that's very interesting.

Earlier today, I had a private exchange of emails regarding the progressive development of architectures specific to the domain of high-speed networking functions. Your note reinforces the notion that this “hard” partitioning of cores is a key part of the DSA (domain-specific architecture) here.

Sent from my Windows 10 device

From: Jared Geiger
Sent: Monday, 22 February 2021 20:53
To: NANOG
Subject: Re: DPDK and energy efficiency

DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16 core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.

I use that technique plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.

I have an 8 core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.

~Jared

On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:

Hello folks,

I've just followed a thread regarding use of CGNAT and noted a suggestion (regarding DANOS) that includes use of DPDK.

As I'm interested in the breadth of adoption of DPDK, and as I'm a researcher into energy and power efficiency, I'd love to hear your feedback on your use of power consumption control by DPDK.

I've drawn up a bare-bones, 2-question survey at this link:
https://www.surveymonkey.com/r/J886DPY.

Responses have been set to anonymous.

Cheers,
Etienne

--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale
"set system default dataplane cpu-affinity 3-7" is what I have set for my use case. Technically it's 5 cores out of 8 total, but 4 are polling cores and 1 manages those 4. Then the control plane is 3 cores plus the leftover cycles of the 1 manager core.

On Mon, Feb 22, 2021 at 2:04 PM Etienne Depasquale <edepa@ieee.org> wrote:
Thanks Jared; that's very interesting.
Earlier today, I had a private exchange of emails regarding the progressive development of architectures specific to the domain of high-speed networking functions. Your note reinforces the notion that this “hard” partitioning of cores is a key part of the DSA (domain-specific architecture) here.
Sent from my Windows 10 device
From: Jared Geiger <jared@compuwizz.net>
Sent: Monday, 22 February 2021 20:53
To: NANOG <nanog@nanog.org>
Subject: Re: DPDK and energy efficiency
DANOS lets you specify how many dataplane cores you use versus control plane cores. So if you put a 16 core host in to handle 2GB of traffic, you can adjust the dataplane worker cores as needed. Control plane cores don't stay at 100% utilization.
I use that technique plus DANOS runs on VMware (not oversubscribed), which allows me to use the hardware for other VMs. NICs are attached to the VM via PCI passthrough, which helps eliminate the overhead of the VMware hypervisor itself.
I have an 8 core VM with 4 cores set to dataplane and 4 to control plane. The 4 control plane cores are typically idle, only processing BGP route updates, SNMP, logs, etc.
~Jared
On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale <edepa@ieee.org> wrote:
Hello folks,
I've just followed a thread regarding use of CGNAT and noted a suggestion (regarding DANOS) that includes use of DPDK.
As I'm interested in the breadth of adoption of DPDK, and as I'm a researcher into energy and power efficiency, I'd love to hear your feedback on your use of power consumption control by DPDK.
I've drawn up a bare-bones, 2-question survey at this link:
https://www.surveymonkey.com/r/J886DPY.
Responses have been set to anonymous.
Cheers,
Etienne
--
Ing. Etienne-Victor Depasquale Assistant Lecturer Department of Communications & Computer Engineering Faculty of Information & Communication Technology University of Malta
participants (12)

- Brian Knight
- Douglas Fischer
- Eric Kuhnke
- Etienne Depasquale
- Etienne-Victor Depasquale
- Jared Geiger
- Nick Hilliard
- Pawel Malachowski
- Robert Bays
- Shane Ronan
- Tom Hill
- William Herrin