.. _pru_icssg_xdp:

PRU_ICSSG XDP
#############

.. contents:: :local:
    :depth: 3

Introduction
************

XDP (eXpress Data Path) provides a framework for BPF that enables high-performance programmable packet processing in the Linux kernel. It runs the BPF program at the earliest possible point in software, namely at the moment the network driver receives the packet.

XDP allows running a BPF program just before the skbs are allocated in the driver. The BPF program can inspect the packet and return one of the following actions (a minimal example program follows the list):

- **XDP_DROP**: The packet is dropped right away, without wasting any further resources. Useful for firewalls etc.
- **XDP_ABORTED**: Similar to XDP_DROP, but an exception is also generated.
- **XDP_PASS**: Pass the packet to the kernel stack, i.e. the skb is allocated and processing continues normally.
- **XDP_TX**: Send the packet back out of the same NIC, with modifications (if made by the program).
- **XDP_REDIRECT**: Send the packet to another NIC or to user space through an AF_XDP socket (discussed below).

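As a minimal illustration of these return codes, below is a hypothetical XDP program (not part of the ICSSG driver) that drops all UDP traffic and passes everything else; it assumes a clang BPF toolchain and libbpf's helper headers:

.. code-block:: c

    /* Hypothetical sketch: drop UDP, pass everything else.
     * Build (illustrative): clang -O2 -g -target bpf -c xdp_drop_udp.c -o xdp_drop_udp.o */
    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/ip.h>
    #include <linux/in.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    SEC("xdp")
    int xdp_drop_udp(struct xdp_md *ctx)
    {
        void *data     = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;

        /* Bounds checks are mandatory: the verifier rejects unchecked access */
        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end)
            return XDP_PASS;
        if (eth->h_proto != bpf_htons(ETH_P_IP))
            return XDP_PASS;

        struct iphdr *iph = (void *)(eth + 1);
        if ((void *)(iph + 1) > data_end)
            return XDP_PASS;

        if (iph->protocol == IPPROTO_UDP)
            return XDP_DROP;    /* firewall-style drop, no skb ever allocated */

        return XDP_PASS;        /* hand everything else to the kernel stack */
    }

    char _license[] SEC("license") = "GPL";
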
.. image:: /images/XDP-packet-processing.png

As explained before, XDP_REDIRECT can send packets directly to user space.
This works by using the AF_XDP socket type, which was introduced specifically for this use case.

In this process, the packet is sent directly to user space without going through the kernel network stack.

.. image:: /images/xdp-packet.png

Use cases for XDP
=================

XDP is particularly useful for these common networking scenarios:

1. **DDoS Mitigation**: High-speed filtering and dropping of malicious traffic
2. **Load Balancing**: Efficient traffic distribution across multiple servers
3. **Packet Capture**: High-performance network monitoring without performance penalties
4. **Firewalls**: Wire-speed packet filtering based on flexible rule sets
5. **Network Analytics**: Real-time traffic analysis and monitoring
6. **Custom Network Functions**: Specialized packet handling for unique requirements

How to run XDP with PRU_ICSSG
=============================

The kernel configuration requires the following changes to use XDP with PRU_ICSSG:

.. code-block:: console

    CONFIG_DEBUG_INFO_BTF=y
    CONFIG_BPF_PRELOAD=y
    CONFIG_BPF_PRELOAD_UMD=y
    CONFIG_BPF_EVENTS=y
    CONFIG_BPF_LSM=y
    CONFIG_DEBUG_INFO_REDUCED=n
    CONFIG_FTRACE=y
    CONFIG_XDP_SOCKETS=y

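One way to confirm the options took effect on the running target (assuming ``CONFIG_IKCONFIG_PROC=y`` so that ``/proc/config.gz`` exists) is:

.. code-block:: console

    $ zcat /proc/config.gz | grep -E "XDP_SOCKETS|DEBUG_INFO_BTF"
    CONFIG_DEBUG_INFO_BTF=y
    CONFIG_XDP_SOCKETS=y
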
Tools for debugging XDP Applications
====================================

Debugging tools for XDP development:

- bpftool - For loading and managing BPF programs
- xdpdump - For capturing XDP packet data
- perf - For performance monitoring and analysis
- bpftrace - For tracing BPF program execution

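For example, bpftool can list loaded programs and show where XDP programs are attached, and xdpdump can capture traffic at the XDP hook; the interface name below is a placeholder:

.. code-block:: console

    $ bpftool prog show                   # list loaded BPF programs
    $ bpftool net show dev eth0           # show XDP/TC attachments on an interface
    $ xdpdump -i eth0 -w capture.pcap     # capture packets at the XDP hook
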
AF_XDP Sockets
**************

AF_XDP is a socket address family specifically designed to work with the XDP framework.
These sockets provide a high-performance interface for user space applications to receive
and transmit network packets directly from the XDP layer, bypassing the traditional kernel networking stack.

Key characteristics of AF_XDP sockets include:

- Direct path from the network driver to user space applications
- Shared memory rings for efficient packet transfer
- Minimal overhead compared to traditional socket interfaces
- Optimized for high-throughput, low-latency applications

How AF_XDP Works
================

AF_XDP sockets operate through a shared memory mechanism:

1. The XDP program intercepts packets at the driver level
2. The XDP_REDIRECT action sends packets to the socket
3. Shared memory rings (RX/TX/FILL/COMPLETION) manage the packet data
4. The user space application directly accesses the packet data
5. Zero or minimal copying occurs, depending on the mode used

The AF_XDP architecture uses several ring buffers:

- **RX Ring**: Received packets ready for consumption
- **TX Ring**: Packets to be transmitted
- **FILL Ring**: Pre-allocated buffers for incoming packets
- **COMPLETION Ring**: Tracks completed TX operations

A minimal socket-setup sketch is shown below. For more details on AF_XDP please refer to the official documentation: `AF_XDP Sockets <https://www.kernel.org/doc/html/latest/networking/af_xdp.html>`_.

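The sketch below uses libxdp's xsk helper API (an assumed toolchain; the same can be done with raw ``socket(AF_XDP, ...)`` calls) to show how the UMEM and the four rings come together when a socket is created. The interface name and queue ID are placeholders and error paths are collapsed:

.. code-block:: c

    /* Sketch: AF_XDP socket setup with libxdp's xsk API (link with -lxdp -lbpf).
     * Error handling omitted; "eth0" and queue 0 are placeholders. */
    #include <stdlib.h>
    #include <unistd.h>
    #include <xdp/xsk.h>

    #define NUM_FRAMES 4096

    int main(void)
    {
        struct xsk_ring_prod fill, tx;
        struct xsk_ring_cons comp, rx;
        struct xsk_umem *umem;
        struct xsk_socket *xsk;
        size_t size = NUM_FRAMES * XSK_UMEM__DEFAULT_FRAME_SIZE;
        void *bufs;

        /* UMEM: page-aligned packet buffer area shared with the kernel */
        if (posix_memalign(&bufs, getpagesize(), size))
            return 1;

        /* Registers the UMEM and creates the FILL and COMPLETION rings */
        if (xsk_umem__create(&umem, bufs, size, &fill, &comp, NULL))
            return 1;

        /* Binds to interface/queue and creates the RX and TX rings */
        if (xsk_socket__create(&xsk, "eth0", 0, umem, &rx, &tx, NULL))
            return 1;

        /* ... populate the FILL ring, then consume RX / produce TX ... */

        xsk_socket__delete(xsk);
        xsk_umem__delete(umem);
        free(bufs);
        return 0;
    }
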
Current Support Status in PRU_ICSSG
===================================

The PRU_ICSSG Ethernet driver currently supports the following modes (attach examples are shown after the list):

- Native XDP mode
- Generic XDP mode (SKB-based)
- Zero-copy mode

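For reference, a compiled XDP object can be attached in either mode with iproute2; the object and section names below are the hypothetical ones from the earlier sketch:

.. code-block:: console

    $ ip link set dev eth0 xdpdrv obj xdp_drop_udp.o sec xdp      # native (driver) mode
    $ ip link set dev eth0 xdpgeneric obj xdp_drop_udp.o sec xdp  # generic (SKB-based) mode
    $ ip link set dev eth0 xdp off                                # detach the program
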
XDP Zero-Copy in PRU_ICSSG
**************************

Introduction to Zero-Copy Mode
==============================

Zero-copy mode is an optimization in AF_XDP that eliminates packet data copying between the kernel and user space. This results in significantly improved performance for high-throughput network applications.

How Zero-Copy Works
===================

In standard XDP operation (copy mode), packet data is copied from kernel memory to user space memory when processed. Zero-copy mode eliminates this copy operation by:

1. Using memory-mapped regions shared between the kernel and user space
2. Allowing direct DMA from the network hardware into memory accessible by user space applications
3. Managing memory ownership through descriptor rings rather than data movement

This approach provides several benefits:

- Reduced CPU utilization
- Lower memory bandwidth consumption
- Decreased latency for packet processing
- Improved overall throughput

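Whether zero-copy is actually used is decided when the socket is bound. With the same assumed libxdp xsk API, an application can request it explicitly so that setup fails instead of silently falling back to copy mode:

.. code-block:: c

    /* Sketch: request zero-copy explicitly at socket creation (libxdp xsk API).
     * If the driver cannot honor XDP_ZEROCOPY, xsk_socket__create() fails
     * rather than silently falling back to copy mode. */
    #include <linux/if_xdp.h>
    #include <xdp/xsk.h>

    static const struct xsk_socket_config cfg = {
        .rx_size    = XSK_RING_CONS__DEFAULT_NUM_DESCS,
        .tx_size    = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        .bind_flags = XDP_ZEROCOPY,   /* XDP_COPY would force copy mode instead */
    };

    /* Pass &cfg as the last argument of xsk_socket__create(). */
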
Requirements for Zero-Copy
==========================

For zero-copy to function properly with PRU_ICSSG, ensure:

1. **Driver Support**: Verify the PRU_ICSSG driver is loaded with zero-copy support enabled
2. **Memory Alignment**: Buffer addresses must be properly aligned to page boundaries
3. **UMEM Configuration**: The UMEM area must be correctly configured:

   - Properly aligned memory allocation
   - Sufficient number of packet buffers
   - Appropriate buffer sizes

4. **Hugepages**: Using hugepages for UMEM allocation is recommended for optimal performance (a sketch of a hugepage-backed allocation follows this list)

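A hugepage-backed UMEM allocation (requirement 4) could be sketched as below; it assumes hugepages have been reserved beforehand (for example via ``/proc/sys/vm/nr_hugepages``) and falls back to regular pages otherwise:

.. code-block:: c

    /* Sketch: hugepage-backed UMEM area with a regular-page fallback.
     * Assumes hugepages were reserved beforehand, e.g. via
     * /proc/sys/vm/nr_hugepages. */
    #include <stddef.h>
    #include <sys/mman.h>

    static void *alloc_umem_area(size_t size)
    {
        void *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (area == MAP_FAILED)   /* no hugepages available: fall back */
            area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return area == MAP_FAILED ? NULL : area;
    }
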
Performance Comparison
======================

Performance testing shows that zero-copy mode can provide substantial throughput improvements compared to copy mode.

The `xdpsock <https://github.com/xdp-project/bpf-examples/tree/main/AF_XDP-example>`_ open-source tool was used for testing XDP zero-copy.

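The invocations below illustrate how the three modes in the table map onto xdpsock's command-line flags (flag names as in the upstream sample; interface and queue are placeholders):

.. code-block:: console

    $ xdpsock -i eth0 -q 0 -r -S       # rxdrop, generic (SKB) mode
    $ xdpsock -i eth0 -q 0 -r -N -c    # rxdrop, native mode, copy
    $ xdpsock -i eth0 -q 0 -r -N -z    # rxdrop, native mode, zero-copy
    $ xdpsock -i eth0 -q 0 -t -N -z    # txonly, native mode, zero-copy
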
AF_XDP performance with 64-byte packets, in Kpps:

.. list-table::
   :header-rows: 1

   * - Benchmark
     - XDP-SKB
     - XDP-Native
     - XDP-Native (Zero-Copy)
   * - rxdrop
     - 253
     - 473
     - 656
   * - txonly
     - 350
     - 354
     - 855

Performance Considerations
==========================

When implementing XDP applications, consider these performance factors:

1. **Memory Alignment**: Buffers should be aligned to page boundaries for optimal performance
2. **Batch Processing**: Process multiple packets in batches when possible
3. **Poll Mode**: Use poll() or similar mechanisms to wait for socket events efficiently instead of busy-polling (a sketch combining batching and poll() follows this list)
4. **Core Affinity**: Bind application threads to specific CPU cores to reduce cache contention
5. **NUMA Awareness**: Consider NUMA topology when allocating memory for packet buffers

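A sketch combining factors 2 and 3 is shown below, assuming a socket and RX ring set up as in the earlier AF_XDP example (the names ``xsk`` and ``rx`` are carried over from that sketch):

.. code-block:: c

    /* Sketch: poll()-driven, batched RX on an AF_XDP socket (factors 2 and 3).
     * Assumes xsk/rx were set up as in the earlier socket-creation sketch. */
    #include <poll.h>
    #include <xdp/xsk.h>

    #define BATCH_SIZE 64

    static void rx_loop(struct xsk_socket *xsk, struct xsk_ring_cons *rx)
    {
        struct pollfd pfd = {
            .fd     = xsk_socket__fd(xsk),
            .events = POLLIN,
        };

        for (;;) {
            poll(&pfd, 1, -1);    /* sleep until packets are available */

            __u32 idx;
            __u32 n = xsk_ring_cons__peek(rx, BATCH_SIZE, &idx);
            for (__u32 i = 0; i < n; i++) {
                /* Each descriptor gives an offset + length into the UMEM area */
                const struct xdp_desc *desc = xsk_ring_cons__rx_desc(rx, idx + i);
                (void)desc;    /* process desc->addr / desc->len here */
            }
            xsk_ring_cons__release(rx, n);    /* hand descriptors back in one go */
        }
    }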