Precompiled Driver Container for Linux Kernel other than 5.15 does not exist #1203
Comments
@utsumi-fj When you try to deploy the precompiled driver containers to a newer Linux kernel, like |
@justinthelaw A non-existing container image, such as |
Tell me if this is a slightly different problem, but the gpu-operator mutates the final image tag, which makes it incorrect. This can be seen by describing the final driver daemonset resource. It basically follows this pattern:
I have been working around this problem by patching the tag back to the actual image tag directly, and then restarting the pod that has the image pull error. I am still testing, and I am currently building a workaround for the kernel mismatch, so this may not be the right way to do things. As for the kernel mismatch, I am building the precompiled driver container myself and modifying the Dockerfile and nvidia-driver script that NVIDIA produces (via git sparse checkout and |
CC: @tariq1890 @cdesiniotis Do you have any further insight into why these two issues occur in the gpu-operator's precompiled driver deployment?
@utsumi-fj As a final follow-up, I got everything working on two Linux kernels, |
The Precompiled Driver Containers page, https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/precompiled-drivers.html#limitations-and-restrictions, describes the following limitations and restrictions.
Although that page describes no restriction on the Linux kernel version, no precompiled driver container for a Linux kernel other than 5.15 exists in https://catalog.ngc.nvidia.com/orgs/nvidia/containers/driver/tags.
According to the kernel release schedule at https://ubuntu.com/kernel/lifecycle, newer Linux kernel versions such as 6.8 are available for Ubuntu 22.04.
Precompiled driver containers are useful because they avoid the need to set up a local package repository when installing GPU Operator in an air-gapped environment. It would therefore be helpful if precompiled driver containers for newer Linux kernels were provided.
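For anyone checking whether an image exists for their node, here is a minimal sketch of how the expected tag can be composed. It assumes the `<driver-branch>-<kernel-release>-<os-tag>` naming that the existing 5.15 tags in the NGC catalog appear to follow. The function name and the sample values are hypothetical, chosen only for illustration.

```python
def precompiled_driver_tag(driver_branch: str, kernel_release: str, os_tag: str) -> str:
    """Compose the tag a precompiled driver image would be expected to carry.

    Assumption: tags follow <driver-branch>-<kernel-release>-<os-tag>.
    kernel_release is the output of `uname -r` on the GPU node.
    """
    return f"{driver_branch}-{kernel_release}-{os_tag}"


if __name__ == "__main__":
    # Sample values only; substitute the driver branch and kernel of your node.
    print(precompiled_driver_tag("535", "5.15.0-69-generic", "ubuntu22.04"))
    # prints "535-5.15.0-69-generic-ubuntu22.04"
```

Comparing the composed string against the tag list in the NGC catalog shows quickly whether a precompiled image is published for a given kernel, or whether it has to be built locally.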