
Toolbox Run failed to get Container Device Interface containerEdits for NVIDIA #1694

@simon3z


Describe the bug
The command toolbox run fails with "Error: failed to get Container Device Interface containerEdits for NVIDIA".

$ toolbox run ls
...
level=debug msg="Generating Container Device Interface for NVIDIA: failed to get containerEdits: failed to create discoverer for common entities: failed to create discoverer for driver files: failed to create discoverer for driver libraries: failed to get libraries for driver version: failed to locate libcuda.so.575.57.08: pattern libcuda.so.575.57.08 not found\nlibcuda.so.575.57.08: not found"
Error: failed to get Container Device Interface containerEdits for NVIDIA

My system has nvidia kernel drivers loaded:

$ lsmod | grep nvidia
nvidia_uvm           4218880  4
nvidia_drm            159744  5
nvidia_modeset       2150400  1 nvidia_drm
nvidia              12976128  9 nvidia_uvm,nvidia_modeset
drm_ttm_helper         16384  2 nvidia_drm,xe
video                  81920  4 asus_wmi,xe,i915,nvidia_modeset

But CUDA libraries are NOT installed.

Steps how to reproduce the behaviour

  1. Load the NVIDIA kernel drivers
  2. Make sure the CUDA libraries are not installed on the host
  3. Run any toolbox run command, e.g. toolbox run ls
  4. See error

Expected behaviour
The toolbox run command should execute normally, even when the CUDA libraries are missing from the host.

Actual behaviour
The command toolbox run fails with "Error: failed to get Container Device Interface containerEdits for NVIDIA"

Output of toolbox --version (v0.0.90+)
toolbox version 0.1.2

Toolbx package info (rpm -q toolbox)
toolbox-0.1.2-1.fc42.x86_64

Output of podman version

Client:        Podman Engine
Version:       5.5.2
API Version:   5.5.2
Go Version:    go1.24.4
Git Commit:    e7d8226745ba07a64b7176a7f128e4ef53225a0e
Built:         Tue Jun 24 02:00:00 2025
Build Origin:  Fedora Project
OS/Arch:       linux/amd64

Podman package info (rpm -q podman)
podman-5.5.2-1.fc42.x86_64

Info about your OS
Fedora Linux 42 (Workstation Edition)

Additional context
Note that toolbox create, toolbox enter, etc. all work perfectly fine.
A possible fix would be to return ErrPlatformUnsupported from GenerateCDISpec when the CUDA libraries are not found on the host system, so that the error is handled gracefully instead of aborting the command.

diff --git a/src/pkg/nvidia/nvidia.go b/src/pkg/nvidia/nvidia.go
index c2cfe19..5593d5b 100644
--- a/src/pkg/nvidia/nvidia.go
+++ b/src/pkg/nvidia/nvidia.go
@@ -108,7 +108,7 @@ func GenerateCDISpec() (*specs.Spec, error) {
        commonEdits, err := cdi.GetCommonEdits()
        if err != nil {
                logrus.Debugf("Generating Container Device Interface for NVIDIA: failed to get containerEdits: %s", err)
-               return nil, errors.New("failed to get Container Device Interface containerEdits for NVIDIA")
+               return nil, ErrPlatformUnsupported
        }
 
        spec, err := nvspec.New(nvspec.WithEdits(*commonEdits.ContainerEdits))
