
NVIDIA_DRIVER_CAPABILITIES=graphics is broken on Jetson devices (1.17.1 or later) #795

Open
@yeongrokgim


Summary

On Jetson (aarch64, Tegra SoC) devices, version 1.17.1 fails to create containers when the NVIDIA_DRIVER_CAPABILITIES environment variable contains any of the values display, graphics, or all.

This can be mitigated by overriding the container environment, for example docker run -e NVIDIA_DRIVER_CAPABILITIES=compute nvcr.io/....
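The workaround can also be applied by filtering the capability list before it is passed to docker run. The following is a minimal sketch, not part of nvidia-container-toolkit: filter_caps is a hypothetical helper name, and replacing all with compute,utility is an assumption based on that combination being reported as working below. It drops other capabilities that all would normally include.

```shell
#!/bin/sh
# Hypothetical helper (not provided by nvidia-container-toolkit).
# Removes the capability values reported as failing in this issue.
# The body runs in a subshell so the IFS change does not leak out.
filter_caps() (
    result=""
    IFS=,
    for cap in $1; do
        case "$cap" in
            display|graphics) continue ;;   # reported as failing: drop
            all) cap="compute,utility" ;;   # assumption: substitute a set reported as Good
        esac
        result="${result:+$result,}$cap"
    done
    # Fall back to compute if every value was dropped.
    printf '%s\n' "${result:-compute}"
)

# Usage (not executed here):
#   docker run --rm \
#       -e NVIDIA_DRIVER_CAPABILITIES="$(filter_caps all)" \
#       nvcr.io/nvidia/l4t-base:r36.2.0 true
```

This only avoids the failure. Containers that actually need display or graphics support still require a fixed toolkit version.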

Steps to reproduce

  1. Get a Jetson device. I tested with {Xavier, Orin} AGX DevKit as a reference.

  2. Install Docker runtime and nvidia-container-runtime=1.17.1-1

  3. Ensure the NVIDIA container runtime is configured. To configure it, run
    sudo nvidia-ctk runtime configure --set-as-default

  4. Try running a container, for example with the l4t-base image:

    docker run -it --rm \
        -e NVIDIA_DRIVER_CAPABILITIES=all \
        nvcr.io/nvidia/l4t-base:r36.2.0

    Or, even with non-Jetson base images:

    docker run -it --rm \
        -e NVIDIA_DRIVER_CAPABILITIES=display \
        -e NVIDIA_VISIBLE_DEVICES=all \
        ubuntu:22.04
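To check several capability values in one pass, the reproduction command can be wrapped in a loop. This is a rough sketch under a few assumptions: try_caps, repro_matrix, IMAGE, and DRY_RUN are names introduced here for illustration, -it is dropped because a script has no TTY, and the container only runs true so that the result depends on whether creation succeeds.

```shell
#!/bin/sh
# Sketch: run the reproduction command once per capability value.
# DRY_RUN=1 prints each docker command instead of running it.
IMAGE="${IMAGE:-ubuntu:22.04}"

try_caps() {
    caps=$1
    # Build the command as positional parameters so quoting is preserved.
    set -- docker run --rm \
        -e "NVIDIA_DRIVER_CAPABILITIES=$caps" \
        -e NVIDIA_VISIBLE_DEVICES=all \
        "$IMAGE" true
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
        return 0
    fi
    # Report by exit status: container creation fails on affected setups.
    if "$@" >/dev/null 2>&1; then
        echo "$caps: Good"
    else
        echo "$caps: Error"
    fi
}

repro_matrix() {
    for c in all compute compute,utility display graphics; do
        try_caps "$c"
    done
}
```

After loading the definitions, call repro_matrix on the device, or DRY_RUN=1 repro_matrix to preview the commands first.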

Result

Example of error message

$ docker run -it --rm -e NVIDIA_DRIVER_CAPABILITIES=display -e NVIDIA_VISIBLE_DEVICES=all ubuntu:22.04

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: time="2024-11-13T17:38:55+09:00" level=info msg="Symlinking /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json to /usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json"
time="2024-11-13T17:38:55+09:00" level=error msg="failed to create link [/usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json /etc/vulkan/icd.d/nvidia_icd.json]: failed to create symlink: failed to remove existing file: remove /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json: device or resource busy": unknown.
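
| Hardware   | Jetpack | nvidia-container-toolkit | NVIDIA_DRIVER_CAPABILITIES | Result |
|------------|---------|--------------------------|----------------------------|--------|
| Orin AGX   | 6.1     | 1.14.2                   | all                        | Good   |
| Orin AGX   | 6.1     | 1.17.1                   | all                        | Error  |
| Orin AGX   | 6.1     | 1.17.1                   | compute,utility            | Good   |
| Orin AGX   | 6.1     | 1.17.1                   | display                    | Error  |
| Orin AGX   | 6.1     | 1.17.1                   | graphics                   | Error  |
| Xavier AGX | 5.1.2   | 1.16.1                   | all                        | Good   |
| Xavier AGX | 5.1.2   | 1.16.1                   | graphics                   | Good   |
| Xavier AGX | 5.1.2   | 1.17.1                   | all                        | Error  |
| Xavier AGX | 5.1.2   | 1.17.1                   | compute                    | Good   |
| Xavier AGX | 5.1.2   | 1.17.1                   | display                    | Error  |
| Xavier AGX | 5.1.2   | 1.17.1                   | graphics                   | Error  |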

Labels

bug (Issue/PR to expose/discuss/fix a bug)
