nvme_driver: don't flr nvme devices #1714
base: main
Pull Request Overview
This PR modifies the NVMe device initialization logic to disable Function Level Reset (FLR) during device attach/detach operations to improve system startup and shutdown performance. The change implements a fallback mechanism that attempts device initialization without FLR first, and only enables FLR if the initial attempt fails.
- Refactors NVMe device creation into separate functions with retry logic
- Adds reset method configuration to disable FLR by default
- Implements fallback to FLR if device initialization fails without reset
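The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ResetMethod` is a hypothetical stand-in for the `PciDeviceResetMethod` type mentioned in the review, and `init_with_fallback` and the fake device in `main` are invented for the example.

```rust
/// Hypothetical stand-in for the reset method type referenced in the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResetMethod {
    /// Skip the reset entirely (the fast path).
    NoReset,
    /// Issue a function level reset before use.
    Flr,
}

/// Tries `init` without a reset first; if that fails, retries once with FLR.
/// On success, returns the initialized value and the method that worked.
/// The error from the first attempt is dropped here; a real driver would
/// probably want to log it.
fn init_with_fallback<T, E>(
    mut init: impl FnMut(ResetMethod) -> Result<T, E>,
) -> Result<(T, ResetMethod), E> {
    match init(ResetMethod::NoReset) {
        Ok(dev) => Ok((dev, ResetMethod::NoReset)),
        Err(_first_err) => init(ResetMethod::Flr).map(|dev| (dev, ResetMethod::Flr)),
    }
}

fn main() {
    // Fake device that only initializes cleanly after an FLR.
    let mut attempts = Vec::new();
    let result: Result<(&str, ResetMethod), String> = init_with_fallback(|method| {
        attempts.push(method);
        if method == ResetMethod::Flr {
            Ok("nvme0")
        } else {
            Err("device not ready".to_string())
        }
    });
    println!("result: {:?}, attempts: {:?}", result, attempts);
}
```

Keeping the retry in one helper means the common case pays for a single initialization attempt, and the slower FLR path runs only for devices that do not come up cleanly.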
Comments suppressed due to low confidence (1)
openhcl/underhill_core/src/nvme_manager.rs:244
- [nitpick] The closure name 'update_reset' could be more descriptive. Consider renaming to 'set_device_reset_method' to better reflect its purpose.
let update_reset = |method: PciDeviceResetMethod| {
Bummer, this doesn't work yet:
This is expected with our NVMe emulator, which does not (currently) support any of Linux's reset methods.
Got it. I see confirmation that this works when testing with a physical device. In addition, I have a draft PR to add FLR support (so we can test this in CI).
The default `vfio` device behavior is to issue a function level reset (FLR) when attaching or detaching devices. It does so because the device is in an unknown or untrusted state. However, within the context of a trusted virtualization stack, OpenHCL can reasonably trust the state and behavior of the device. So, optimize performance by removing these function level resets for NVMe devices. This follows the same model that already exists for MANA devices.

The `nvme_driver` already shuts down the device (see `NvmeDriver::reset()`) and waits for the device to become disabled. A well-behaved NVMe device will not issue DMA after this point. That same device should tolerate a graceful start without an FLR.

Pending work before this PR is ready to commit: