Switches, routers, firewalls and other networking devices typically need to process large volumes of packets in real time. Traditionally, efficient packet processing required specialized and expensive hardware. The Data Plane Development Kit (DPDK) enables us to do this on low-cost commodity hardware. By using commodity hardware, we can also move networking functions to the cloud and run them within virtualized environments. DPDK enables innovation around multi-core CPUs, edge computing, real-time security, NFV/SDN, low-latency applications and more.
How did vendors achieve efficient packet processing before DPDK?
Before DPDK, specialized hardware performed efficient packet processing. Such hardware might use custom ASICs, programmable FPGAs or Network Processing Units (NPUs). Sometimes low-level hardware-specific microcode or custom firmware was involved. Packet classification, flow control, TCP/IP processing, encryption/decryption, VLAN tagging, and checksum calculation are example tasks that such hardware performed in an optimized manner.
However, such hardware was expensive to buy and maintain. Upgrades and security patches were time-consuming to apply and needed a full-time network administrator. One solution was to move from specialized hardware to Commercial Off-the-Shelf (COTS) hardware. While this was more cost effective and easier to maintain, performance suffered. Packets moved from Network Interface Cards (NICs) to the operating system (OS), where they were processed via the OS kernel stack.
How does DPDK improve packet processing?
DPDK bypasses the kernel and enables fast packet processing in userspace. It's essentially a set of network drivers and libraries. The Environment Abstraction Layer (EAL) abstracts hardware-specific operations from applications. The figure shows how traditional processing with POSIX calls goes through kernel space before packets reach the application. DPDK short-circuits this path and moves packets directly between the NIC and userspace applications.
Traditional processing is interrupt driven where the NIC interrupts the kernel when a packet arrives. DPDK uses polling instead and avoids the overhead associated with interrupts. This is performed by a Poll Mode Driver (PMD).
What's the packet processing model adopted by DPDK?
- Run-to-Completion: A CPU core handles receive, processing and transmit of a packet. Multiple cores can be used with each core associated with a dedicated port. However, with Receive Side Scaling (RSS), traffic arriving at a single port can be distributed to multiple cores.
- Pipeline: Each core is dedicated to a specific workload. For example, one core might handle receive/transmit of packets while other cores handle application processing. Packets are passed between cores via memory rings.
For single-core multi-CPU deployments, one CPU is assigned to the OS and the other to the DPDK-based application. A less performant variant is when packets must cross the QPI interconnect to reach the other CPU. For multi-core deployments, we can assign more than one core to each port, with or without hyperthreading.
Deciding which model to use is not trivial. Factors to consider include the cycles needed to process each packet, the extent of data exchange across software modules, specific optimizations at some cores, code maintainability, etc. Intel VTune Profiler can be used to analyze the efficiency of the pipeline model.
What other techniques does DPDK use to improve performance?
- Processor Affinity: Ties specific processing to specific cores.
- Huge Pages: Reduces TLB cache misses.
- Lockless Sync: Queues are managed with the ring library. Enqueue and dequeue operations are lockless.
- I/O Batch: Process a batch of packets rather than one at a time. This amortizes overheads in accessing the NIC.
- NUMA Aware: Utilizes NUMA memory for better performance.
- Cache Alignment: Align structures to 64-byte cache lines.
What are some metrics used to evaluate DPDK performance?
Throughput is the most common metric. Often this is quoted in Mbps, Gbps or Mega Packets Per Second (MPPS). When MPPS is used, packet size must be mentioned. At a low packet size of 64 bytes, throughput will be limited by MPPS rather than Mbps. The figure above shows throughput at L2.
In 2015, per-core L3 performance on Linux was 1.1 MPPS; with Intel DPDK, this rose to 28.5 MPPS. Back in 2010, DPDK could achieve L3 performance of 55 MPPS with 64-byte packets on an Intel Xeon-based system. This improved to 255 MPPS (2014) and 347 MPPS (2016).
Latency is important for low-latency applications. The figure above shows that DPDK reduces latency by a factor of ten on 64-byte packets. Likewise, for real-time applications, jitter is important. One study showed that DPDK reduces jitter from 10μs to 2μs.
Does DPDK need a TCP/IP stack to work?
DPDK doesn't include a TCP/IP stack. If an application requires a userspace networking stack, it can use F-Stack, mTCP, TLDK, Seastar or Accelerated Network Stack (ANS). These typically provide both blocking and non-blocking socket APIs. Some of these are based on the FreeBSD implementation.
By omitting a networking stack, DPDK doesn't have the inefficiency of a generic implementation. Applications can include networking modules optimized for their use cases. There might also be some use cases where no higher layer (above L2) processing is needed.
Who in industry is using DPDK?
Load balancing, flow classification, routing, access control (firewall), and traffic policing are typical uses of DPDK. There's a misconception that DPDK is only for the telecom industry. However, DPDK has been used in cloud environments and enterprises alike. Traffic generators (TRex) and storage applications (SPDK) use DPDK. The figure above lists open-source projects powered by DPDK.
In 5G, the User Plane Function (UPF) processes user data packets. Delay, jitter and bandwidth are key performance metrics that need to be met. Some researchers have proposed DPDK for 5G UPF implementation. For deploying UPF at edge networks, DPDK APIs can be used to interface the UPF application (UPF-C) and SmartNICs (UPF-U).
What are the challenges with DPDK?
Running DPDK inside containers poses challenges. For example, PID namespaces can cause problems with managing fbarray, and processes using mmap without specifying addresses can also cause problems. Threads must be assigned correctly to CPU cores (called thread or core affinity) for consistent performance. DPDK libraries offer developers many implementation choices, and getting these choices wrong can impact performance.
Since the kernel is bypassed, we lose all the protection, utilities (e.g. tcpdump) and protocols (ARP, IPSec) that the Linux kernel provides. Debugging and identifying root causes of networking problems are challenges. However, DPDK's tracing library and LTTng may help. A poor implementation can cause other processes/programs to fail.
What are some alternatives to DPDK?
Faster packet processing via kernel bypass is also possible using Snabbswitch, Netmap or StackMap. Like DPDK, these process packets in userspace; packets completely bypass the kernel stack. Snabbswitch is written in Lua while DPDK is in C. PacketShader accelerates packet processing by offloading work to GPUs.
An alternative approach is to modify the Linux kernel. Examples include eXpress Data Path (XDP) and network stacks based on Remote Direct Memory Access (RDMA). Other efficient tools include packet_mmap (but doesn't bypass the kernel) and PF_RING (with ZC drivers).
Cloud providers may not wish to dedicate an entire NIC to a single offloaded application. Solarflare's OpenOnload uses the proprietary EF_VI library. It creates a "hidden queue" at the NIC that userspace processes can access. This approach is sometimes called a bifurcated driver or queue splitting. A similar approach exists in the virtualization world to avoid the overhead of passing packets from host to VM: virtual interfaces exposed by the NIC allow packets to reach the VM directly. In general, modern NICs can bifurcate traffic to use or skip the kernel stack. SR-IOV and VFIO are technologies that enable this.
Intel publishes a technology guide explaining its Data Streaming Accelerator (DSA). While DPDK aims to avoid data copying, sometimes this is unavoidable. Such copying operations can be offloaded to Direct Memory Access (DMA) accelerators such as DSA. The DPDK library that enables this is dmadev, available since DPDK v21.11 (November 2021).
Belkhiri et al. propose instrumenting DPDK libraries and collecting trace information. These traces can be used to debug performance issues and identify root causes. They claim that their approach is better than what existing tools VTune Amplifier (closed source), FlowWatcher-DPDK and DPDKStat are capable of.
- ACL Digital. 2021. "Refuting the Top Misconceptions of DPDK." Blog, ACL Digital, August 6. Accessed 2023-09-09.
- ANS. 2019. "Releases." ANS, on GitHub, February 14. Accessed 2023-09-08.
- Barbette, T., C. Soldani, and L. Mathy. 2015. "Fast userspace packet processing." ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), Oakland, CA, USA, pp. 5-16. doi: 10.1109/ANCS.2015.7110116. Accessed 2023-09-12.
- Belkhiri, A., M. Pepin, M. Bly, and M. Dagenais. 2023. "Performance analysis of DPDK-based applications through tracing." J. of Parallel and Dist Computing, vol. 173, pp. 1-19, March. Accessed 2023-09-09.
- Bo, C. 2023. "Coroutine Made DPDK Development Easy." Blog, Alibaba Cloud, May 12. Accessed 2023-09-08.
- Cascardo. 2015. "Getting the Best of Both Worlds with Queue Splitting (Bifurcated Driver)." Blog, RedHat, October 2. Accessed 2023-09-12.
- Chen, W.-E. and C. H. Liu. 2020. "High-performance user plane function (UPF) for the next generation core networks." Special Issue: Intelligent Computing: a Promising Network Computing Paradigm, November 3. Accessed 2023-09-08.
- Chen, R. and G. Sun. 2018. "A Survey of Kernel-Bypass Techniques in Network Stack." CSAI '18: Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence, pp. 474–477, December. doi: 10.1145/3297156.3297242. Accessed 2023-09-08.
- Cochinwala, Naveen. 2021. "Difficulties of a DPDK Implementation." Blog, on LinkedIn, January 16. Accessed 2023-09-12.
- DPDK. 2014. "DPDK Summit, San Francisco." Events, DPDK, September 8. Accessed 2023-09-08.
- DPDK. 2015. "Horizontal log with tag." DPDK, October 1. Accessed 2023-09-08.
- DPDK. 2019a. "About DPDK." DPDK, November 15. Accessed 2023-09-08.
- DPDK. 2019b. "In Loving Memory: Venky Venkatesan, The Father of DPDK." DPDK, August 16. Accessed 2023-09-08.
- DPDK. 2019c. "DPDK Release 19.05." Documentation, DPDK, May. Accessed 2023-09-08.
- DPDK. 2020. "DPDK Issues 20.11, Most Robust DPDK Release Ever!" Blog, DPDK, November 30. Accessed 2023-09-08.
- DPDK. 2023a. "DPDK Stable: refs." DPDK Git. Accessed 2023-09-08.
- DPDK. 2023b. "Release Notes." v23.07.0, Documentation, DPDK, July. Accessed 2023-09-08.
- DPDK. 2023c. "Past Events." DPDK. Accessed 2023-09-08.
- DPDK. 2023d. "Getting Started Guide for Windows: Introduction." v23.07.0, Documentation, DPDK, July. Accessed 2023-09-08.
- DPDK. 2023e. "Ecosystem." DPDK, July 27. Accessed 2023-09-09.
- DPDK. 2023f. "Tracing Library." Sec. 6, v23.07.0, Documentation, DPDK, July. Accessed 2023-09-12.
- DPDK. 2023g. "Ring Library." Sec. 8, v23.07.0, Documentation, DPDK, July. Accessed 2023-09-12.
- DPDK. 2023h. "Poll Mode Driver." Sec. 8, v23.07.0, Documentation, DPDK, July. Accessed 2023-09-14.
- ETSI. 2012. "Network Functions Virtualisation." White paper, SDN and OpenFlow World Congress, Darmstadt, Germany, October 22-24. Accessed 2023-09-08.
- F-Stack. 2022. "Releases." F-Stack, on GitHub, September 2. Accessed 2023-09-08.
- Gaio, G. and G. Scalamera. 2019. "Development of Ethernet based real-time applications in Linux using DPDK." 17th Int. Conf. on Acc. and Large Exp. Physics Control Systems, NY, USA. doi: 10.18429/JACoW-ICALEPCS2019-MOPHA044. Accessed 2023-09-12.
- Haryachyy, D. 2015. "Understanding DPDK." SlideShare, February 12. Accessed 2023-09-12.
- Intel. 2015. "Introduction to DPDK." Slides, September. Accessed 2023-09-08.
- Intel. 2017. "Introduction to the DPDK Packet Framework." Technical paper, Intel. Accessed 2023-09-12.
- Intel. 2023. "DPDK Event Device Profiling." Documentation, Intel® VTune™ Profiler Performance Analysis Cookbook, March 10. Accessed 2023-09-12.
- Klaff, B. and B. Perlman. 2019. "Using DPDK APIs as the I/F between UPF-C and UPF-U." DPDK, on YouTube, November 22. Accessed 2023-09-08.
- Lai, L., G. Ara, T. Cucinotta, K. Kondepu, and L. Valcarenghi. 2021. "Ultra-low Latency NFV Services Using DPDK." IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Heraklion, Greece, pp. 8-14. doi: 10.1109/NFV-SDN53031.2021.9665131. Accessed 2023-09-12.
- Majkowski, M. 2015. "Kernel bypass." Blog, Cloudflare, July 9. Accessed 2023-09-12.
- NXP. 2017. "DPDK Overview." QorIQ SDK v2.0-1703 Documentation, August 7. Accessed 2023-09-08.
- O'Driscoll, T. 2015. "[dpdk-dev] DPDK Logo Release." Email archives, DPDK, October 1. Accessed 2023-09-08.
- Ramia, K. B. and D. K. Jain. 2017. "DPDK Architecture and Roadmap Discussion." Slides, DPDK Summit India, April 25-26. Accessed 2023-09-12.
- Richardson, B. 2022. "Intel® Data Streaming Accelerator (DSA) - Packet Copy Offload in DPDK with Intel® DSA." Technology guide, v001, Intel, December. Accessed 2023-09-08.
- Shukla, M. 2018. "DPDK in 3 Minutes or Less…" Blog, Calsoft Inc., November 20. Accessed 2023-09-08.
- Srinivas, R. K. 2020. "Design of high performance Automotive Network simulators — DPDK approach." Medium, February 9. Accessed 2023-09-12.
- The Linux Foundation. 2017. "Networking Industry Leaders Join Forces to Expand New Open Source Community to Drive Development of the DPDK Project." Press release, The Linux Foundation, April 3. Accessed 2023-09-08.
- Yong, W. 2019. "DPDK & Containers : Challenges + Solutions." Slides, DPDK Summit North America, November 12. Accessed 2023-09-12.
- Zhang, H., Z. Chen, and Y. Yuan. 2021. "High-Performance UPF Design Based on DPDK." IEEE 21st International Conference on Communication Technology (ICCT), Tianjin, China, pp. 349-354. doi: 10.1109/ICCT52962.2021.9657903. Accessed 2023-09-08.
- mTCP. 2018. "Releases." mTCP, on GitHub, October 8. Accessed 2023-09-08.