Commit 93cb537

ameryhung authored and Martin KaFai Lau committed
Merge branch 'bpf-qdisc'
Amery Hung says:

====================
bpf qdisc

Hi all,

This patchset aims to support implementing qdiscs using bpf struct_ops.
This version takes a step back and only implements the minimum support
for a bpf qdisc; 1) support for adding an skb to a bpf_list or bpf_rbtree
directly and 2) classful qdiscs are deferred to future patchsets. In
addition, we only allow attaching a bpf qdisc to root or mq for now. This
is to prevent accidentally breaking existing classful qdiscs that rely on
data in a child qdisc. This limit may be lifted in the future after
careful inspection.

* Overview *

This series supports implementing qdiscs using bpf struct_ops. bpf qdisc
aims to be a flexible and easy-to-use infrastructure that allows users to
quickly experiment with different scheduling algorithms/policies. It only
requires users to implement the core qdisc logic in bpf and takes care of
the mundane parts for them. In addition, the ability to easily communicate
between the qdisc and other components will also bring new opportunities
for applications and optimizations.

* Performance of bpf qdisc *

This patchset includes two qdisc examples, bpf_fifo and bpf_fq, for
__testing__ purposes. For the performance test, we compare the selftests
with their kernel counterparts to give a sense of the performance of a
qdisc implemented in bpf.

The implementation of bpf_fq is fairly complex and slightly different from
fq, so below we only compare the two fifo qdiscs. bpf_fq implements a
scheduling algorithm similar to fq before commit 29f834a ("net_sched:
sch_fq: add 3 bands and WRR scheduling") was introduced. bpf_fifo uses a
single bpf_list as a queue instead of the three per-priority queues of
pfifo_fast. The time complexity of the two fifos should nevertheless be
similar, since the queue selection time is negligible.

Test setup:

    client -> qdisc -------------> server
    ~~~~~~~~~~~~~~~                ~~~~~~
    nested VM1 @ DC1               VM2 @ DC2

Throughput: iperf3 -t 600, 5 times

    Qdisc        Average (GBits/sec)
    ----------   -------------------
    pfifo_fast       12.52 ± 0.26
    bpf_fifo         11.72 ± 0.32
    fq               10.24 ± 0.13
    bpf_fq           11.92 ± 0.64

Latency: sockperf pp --tcp -t 600, 5 times

    Qdisc        Average (usec)
    ----------   --------------
    pfifo_fast       244.58 ± 7.93
    bpf_fifo         244.92 ± 15.22
    fq               234.30 ± 19.25
    bpf_fq           221.34 ± 10.76

Looking at the two fifo qdiscs, the 6.4% drop in throughput in the bpf
implementation is consistent with the previous observation (the v8
throughput test on a loopback device). This should be mitigated in the
future by supporting adding an skb to a bpf_list or bpf_rbtree directly.

* Clean up skb in bpf qdisc during reset *

The current implementation relies on bpf qdisc implementors to correctly
release the skbs held in queues (bpf graphs or maps) in .reset, which
might not be a safe thing to do. The solution, as Martin has suggested,
would be to support private data in struct_ops. This can also help
simplify the implementation of qdiscs that work with mq. For example, the
qdiscs in the selftests mostly use global data, so even if users add
multiple qdisc instances under mq, they would still share the same queue.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Martin KaFai Lau <[email protected]>
1 parent 69dfd77 commit 93cb537
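
[Editor's note] The cover letter's remark that bpf_fifo "uses a single bpf_list as a queue" refers to the bpf_list kfuncs exposed through the selftests' bpf_experimental.h. The sketch below illustrates only that pattern; it is not the bpf_qdisc_fifo selftest itself, the struct and function names are hypothetical, and it queues plain cookies rather than skbs (per the cover letter, putting skbs directly into a bpf_list/bpf_rbtree is deferred to a future patchset).

/* Illustrative only: one bpf_list used as a FIFO, built on the
 * bpf_experimental.h kfuncs used by the BPF selftests.
 */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include "bpf_experimental.h"

char _license[] SEC("license") = "GPL";

/* Selftests define private() this way so the lock and the list head land in
 * the same mapped .data section, which the verifier requires.
 */
#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))

struct fifo_node {
        struct bpf_list_node node;      /* must match __contains() below */
        __u64 cookie;                   /* a qdisc would carry an skb reference instead */
};

private(FIFO) struct bpf_spin_lock fifo_lock;
private(FIFO) struct bpf_list_head fifo __contains(fifo_node, node);

/* Enqueue: allocate a node and push it onto the tail under the lock. */
static int fifo_push(__u64 cookie)
{
        struct fifo_node *n;

        n = bpf_obj_new(typeof(*n));
        if (!n)
                return -1;
        n->cookie = cookie;

        bpf_spin_lock(&fifo_lock);
        bpf_list_push_back(&fifo, &n->node);
        bpf_spin_unlock(&fifo_lock);
        return 0;
}

/* Dequeue: pop the head; the popped node is owned by us and must be dropped. */
static long fifo_pop(void)
{
        struct bpf_list_node *ln;
        struct fifo_node *n;
        __u64 cookie;

        bpf_spin_lock(&fifo_lock);
        ln = bpf_list_pop_front(&fifo);
        bpf_spin_unlock(&fifo_lock);
        if (!ln)
                return -1;

        /* bpf_list_node is the first member, so a plain cast recovers the node */
        n = (struct fifo_node *)ln;
        cookie = n->cookie;
        bpf_obj_drop(n);
        return cookie;
}

SEC("syscall")
int fifo_demo(void *ctx)
{
        fifo_push(1);
        fifo_push(2);
        return fifo_pop();      /* FIFO order: returns 1 */
}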

2 files changed (+76, -0 lines)


tools/testing/selftests/bpf/config (+1)
@@ -74,6 +74,7 @@ CONFIG_NET_MPLS_GSO=y
 CONFIG_NET_SCH_BPF=y
 CONFIG_NET_SCH_FQ=y
 CONFIG_NET_SCH_INGRESS=y
+CONFIG_NET_SCH_HTB=y
 CONFIG_NET_SCHED=y
 CONFIG_NETDEVSIM=y
 CONFIG_NETFILTER=y

tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c (+75)
@@ -88,6 +88,77 @@ static void test_fq(void)
         bpf_qdisc_fq__destroy(fq_skel);
 }
 
+static void test_qdisc_attach_to_mq(void)
+{
+        DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
+                            .attach_point = BPF_TC_QDISC,
+                            .parent = TC_H_MAKE(1 << 16, 1),
+                            .handle = 0x11 << 16,
+                            .qdisc = "bpf_fifo");
+        struct bpf_qdisc_fifo *fifo_skel;
+        struct bpf_link *link;
+        int err;
+
+        fifo_skel = bpf_qdisc_fifo__open_and_load();
+        if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+                return;
+
+        link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+        if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+                bpf_qdisc_fifo__destroy(fifo_skel);
+                return;
+        }
+
+        SYS(out, "ip link add veth0 type veth peer veth1");
+        hook.ifindex = if_nametoindex("veth0");
+        SYS(out, "tc qdisc add dev veth0 root handle 1: mq");
+
+        err = bpf_tc_hook_create(&hook);
+        ASSERT_OK(err, "attach qdisc");
+
+        bpf_tc_hook_destroy(&hook);
+
+        SYS(out, "tc qdisc delete dev veth0 root mq");
+out:
+        bpf_link__destroy(link);
+        bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
+static void test_qdisc_attach_to_non_root(void)
+{
+        DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = LO_IFINDEX,
+                            .attach_point = BPF_TC_QDISC,
+                            .parent = TC_H_MAKE(1 << 16, 1),
+                            .handle = 0x11 << 16,
+                            .qdisc = "bpf_fifo");
+        struct bpf_qdisc_fifo *fifo_skel;
+        struct bpf_link *link;
+        int err;
+
+        fifo_skel = bpf_qdisc_fifo__open_and_load();
+        if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+                return;
+
+        link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+        if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+                bpf_qdisc_fifo__destroy(fifo_skel);
+                return;
+        }
+
+        SYS(out, "tc qdisc add dev lo root handle 1: htb");
+        SYS(out_del_htb, "tc class add dev lo parent 1: classid 1:1 htb rate 75Kbit");
+
+        err = bpf_tc_hook_create(&hook);
+        if (!ASSERT_ERR(err, "attach qdisc"))
+                bpf_tc_hook_destroy(&hook);
+
+out_del_htb:
+        SYS(out, "tc qdisc delete dev lo root htb");
+out:
+        bpf_link__destroy(link);
+        bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
 void test_bpf_qdisc(void)
 {
         struct netns_obj *netns;
@@ -100,6 +171,10 @@ void test_bpf_qdisc(void)
                 test_fifo();
         if (test__start_subtest("fq"))
                 test_fq();
+        if (test__start_subtest("attach to mq"))
+                test_qdisc_attach_to_mq();
+        if (test__start_subtest("attach to non root"))
+                test_qdisc_attach_to_non_root();
 
         netns_free(netns);
 }
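
[Editor's note] The libbpf flow exercised by the new "attach to mq" subtest can also be used outside the test harness. The sketch below condenses it into a standalone program; it is an untested illustration, not part of the patch. The interface name, error handling, and the use of system() for the tc setup are placeholders, while the skeleton header (bpf_qdisc_fifo.skel.h), BPF_TC_QDISC, the .qdisc field, and the parent/handle values all mirror the selftest above.

/* Illustrative only: register the bpf_fifo struct_ops and graft it under an
 * mq root, mirroring test_qdisc_attach_to_mq() above.
 */
#include <stdio.h>
#include <stdlib.h>
#include <net/if.h>
#include <linux/pkt_sched.h>
#include <bpf/libbpf.h>
#include "bpf_qdisc_fifo.skel.h"        /* generated from the selftest's bpf_qdisc_fifo.c */

int main(void)
{
        DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
                            .attach_point = BPF_TC_QDISC,
                            .parent = TC_H_MAKE(1 << 16, 1),    /* class 1:1 of the mq root */
                            .handle = 0x11 << 16,
                            .qdisc = "bpf_fifo");
        struct bpf_qdisc_fifo *skel;
        struct bpf_link *link = NULL;
        int err = 1;

        skel = bpf_qdisc_fifo__open_and_load();
        if (!skel)
                return 1;

        /* Registering the struct_ops map makes the "bpf_fifo" qdisc kind available. */
        link = bpf_map__attach_struct_ops(skel->maps.fifo);
        if (!link)
                goto out;

        /* As in the test, create the mq root with tc first. */
        if (system("tc qdisc add dev veth0 root handle 1: mq"))
                goto out;

        hook.ifindex = if_nametoindex("veth0");
        err = bpf_tc_hook_create(&hook);        /* adds the bpf_fifo qdisc under mq */
        printf("bpf_tc_hook_create: %d\n", err);

        bpf_tc_hook_destroy(&hook);
        system("tc qdisc delete dev veth0 root mq");
out:
        bpf_link__destroy(link);
        bpf_qdisc_fifo__destroy(skel);
        return err ? 1 : 0;
}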
