Skip to content

WIP: use netlink instead of calling iproute2 commands #1990

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

gwenya
Copy link
Contributor

@gwenya gwenya commented Apr 25, 2025

See #1978

This is still WIP, the PR at this time is mainly for running the CI. Feedback is very welcome though, especially on the non-trivial TODOs.

@gwenya
Copy link
Contributor Author

gwenya commented Apr 25, 2025

@stgraber we currently don't have any tests specifically for this package, if any of the functions are tested then only incidentally.
Unit tests don't really work since this interacts with the system and there's no point in mocking that.
I'm thinking maybe a test program that can run any of these functions via reflection, and then a test script that calls said program and uses iproute2 tools to check if it does things correctly.
And ideally I'd like to be able to run these tests locally without messing up my network setup, so in a VM maybe. I'm not sure if that fits into our current testing system somehow?
Some of these might not even be possible to test in the CI, for example vrf is a kernel module that may or may not be loaded.

@gwenya gwenya force-pushed the netlink branch 8 times, most recently from a27e9ff to 687b6a6 Compare April 25, 2025 16:04
@stgraber
Copy link
Member

Yeah, it's not the kind of code that's particularly directly testable, it makes more sense to have it be indirectly tested through or normal system tests and the daily test runs we have on Jenkins too.

@gwenya
Copy link
Contributor Author

gwenya commented Apr 25, 2025

Hmh can we generate coverage data from the existing tests to see if they hit all the code paths in this package?

@stgraber
Copy link
Member

Hmm, not particularly easily as we typically run our tests across a LOT of incusd instances, so can't centrally keep track of everything that was hit. And some of the more complex bits (SR-IOV and the like) are tested externally, some manually, some automatically through Jenkins.

@gwenya
Copy link
Contributor Author

gwenya commented Apr 25, 2025

Hmh I see
The tests already caught some bugs that I'm fixing now but without knowing if everything is tested it's hard having confidence in the end result

@stgraber
Copy link
Member

@gwenya yeah, the best we can do is land this one early in the release to maximize the time we have for all our various tests to catch issues.

@gwenya gwenya force-pushed the netlink branch 12 times, most recently from cca145f to 0ef08ac Compare April 27, 2025 09:09
@gwenya gwenya force-pushed the netlink branch 2 times, most recently from 623d642 to e2d12a6 Compare April 27, 2025 09:36
_ = qdisc.Delete()
qdisc = &ip.Qdisc{Dev: veth, Ingress: true}
_ = qdisc.Delete()
qdiscIngress := &ip.QdiscIngress{Qdisc: ip.Qdisc{Dev: veth, Handle: "ffff:0"}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stgraber should I put these kinds of changes into their own commit or leave them together with the corresponding change in the ip package?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine to leave them with the change I think

vf = vfInfo // Found a match.
found = true
for _, vf := range link.Attrs().Vfs {
if vf.ID == vfID {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not confident about this ID being the number we want.

In the iproute2 code the vf is taken from IFLA_VF_MAC, and that's where the netlink library puts it when setting it, but this ID seems to be just the index in the list of vfs returned from the kernel.

I don't have any SR-IOV capable devices to try out if this is the same. If not then we might have to request a change on the netlink library to expose the correct value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to request access to https://discuss.linuxcontainers.org/t/opening-up-the-linux-containers-lab-environment/22588

The lab has quite a few SR-IOV capable systems:

  • lantea (dual-port ConnectX-4, single-port Intel 10Gbit)
  • asuras (dual-port ConnectX-6)
  • entak (dual-port ConnectX-6)
  • velona (dual-port ConnectX-6)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of those at least lantea is known to be correctly configured at the firmware level as it does run SR-IOV daily tests for us.

@gwenya
Copy link
Contributor Author

gwenya commented Apr 27, 2025

The vlan link type is blocked on vishvananda/netlink#1078, since the netlink library currently does not support setting the gvrp flag on vlan links.

The veth link type is semi-blocked on vishvananda/netlink#1079 since the netlink library currently does not support setting more than peer name and address on veth create, but we could probably work around that by adjusting the other settings after creating it.

gwenya added 15 commits April 27, 2025 21:30
…g netlink ourselves (incomplete)

Signed-off-by: Gwendolyn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants