Summary
On Jetson (aarch64, Tegra SoC) devices, version 1.17.1 of the NVIDIA container toolkit fails to create containers when the environment variable `NVIDIA_DRIVER_CAPABILITIES` contains any of the values `display`, `graphics`, or `all`.
This can be mitigated by overriding the container environment, for example `docker run -e NVIDIA_DRIVER_CAPABILITIES=compute nvcr.io/...`.
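As a rough sketch of that mitigation, the capability list could be filtered before it is passed to `docker run`. The helper name `sanitize_caps` is hypothetical, and replacing `all` with `compute,utility` is an assumption based only on the combinations reported as working in this issue, not a documented toolkit behavior.

```shell
# sanitize_caps "a,b,c" -> prints the list without the values reported to fail.
# Hypothetical helper; not part of nvidia-container-toolkit.
sanitize_caps() {
  local out="" cap
  local old_ifs="$IFS"
  IFS=','
  for cap in $1; do
    case "$cap" in
      display|graphics) continue ;;      # reported to fail with 1.17.1 on Jetson
      all) cap="compute,utility" ;;      # assumption: fall back to a combination reported as working
    esac
    out="${out:+$out,}$cap"
  done
  IFS="$old_ifs"
  # If nothing is left, fall back to compute, the value used in the workaround above.
  echo "${out:-compute}"
}
```

It might then be used as `docker run -it --rm -e NVIDIA_DRIVER_CAPABILITIES="$(sanitize_caps all)" nvcr.io/nvidia/l4t-base:r36.2.0`. Containers that actually need graphics or display support would lose it, so this only helps workloads that need compute alone.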
Steps to reproduce
- Get a Jetson device. I tested with Xavier AGX and Orin AGX DevKits as references.
- Install Docker runtime and `nvidia-container-runtime=1.17.1-1`.
- Ensure the NVIDIA container runtime is configured. To configure it, run:

      sudo nvidia-ctk runtime configure --set-as-default

- Try running a container, for example with the l4t-base image:

      docker run -it --rm \
        -e NVIDIA_DRIVER_CAPABILITIES=all \
        nvcr.io/nvidia/l4t-base:r36.2.0

  Or, even with a non-Jetson base image:

      docker run -it --rm \
        -e NVIDIA_DRIVER_CAPABILITIES=display \
        -e NVIDIA_VISIBLE_DEVICES=all \
        ubuntu:22.04
Result
Example of error message
    $ docker run -it --rm -e NVIDIA_DRIVER_CAPABILITIES=display -e NVIDIA_VISIBLE_DEVICES=all ubuntu:22.04
    docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: time="2024-11-13T17:38:55+09:00" level=info msg="Symlinking /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json to /usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json"
    time="2024-11-13T17:38:55+09:00" level=error msg="failed to create link [/usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json /etc/vulkan/icd.d/nvidia_icd.json]: failed to create symlink: failed to remove existing file: remove /var/lib/docker/overlay2/8af1b1d84ee57db598be489bb9ad58fb2d139b77604aead77526787d18a02900/merged/etc/vulkan/icd.d/nvidia_icd.json: device or resource busy": unknown.

| Hardware | JetPack | nvidia-container-toolkit | NVIDIA_DRIVER_CAPABILITIES | Result |
|---|---|---|---|---|
| Orin AGX | 6.1 | 1.14.2 | all | Good |
| Orin AGX | 6.1 | 1.17.1 | all | Error |
| Orin AGX | 6.1 | 1.17.1 | compute,utility | Good |
| Orin AGX | 6.1 | 1.17.1 | display | Error |
| Orin AGX | 6.1 | 1.17.1 | graphics | Error |
| Xavier AGX | 5.1.2 | 1.16.1 | all | Good |
| Xavier AGX | 5.1.2 | 1.16.1 | graphics | Good |
| Xavier AGX | 5.1.2 | 1.17.1 | all | Error |
| Xavier AGX | 5.1.2 | 1.17.1 | compute | Good |
| Xavier AGX | 5.1.2 | 1.17.1 | display | Error |
| Xavier AGX | 5.1.2 | 1.17.1 | graphics | Error |
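
A matrix like the one above could be re-checked on another device with a small probe script. This is only a sketch: the function names `probe_capability` and `probe_all` and the `IMAGE` variable are illustrative, and it assumes a container that fails to start makes `docker run` exit with a non-zero status, as in the error shown earlier.

```shell
# Image used for probing; ubuntu:22.04 is one of the images from the steps above.
IMAGE="${IMAGE:-ubuntu:22.04}"

# probe_capability CAPS -> prints "CAPS: Good" if the container starts, "CAPS: Error" otherwise.
probe_capability() {
  local caps="$1"
  if docker run --rm \
       -e NVIDIA_DRIVER_CAPABILITIES="$caps" \
       -e NVIDIA_VISIBLE_DEVICES=all \
       "$IMAGE" true >/dev/null 2>&1; then
    echo "$caps: Good"
  else
    echo "$caps: Error"
  fi
}

# probe_all CAPS... -> one result line per capability value.
probe_all() {
  local caps
  for caps in "$@"; do
    probe_capability "$caps"
  done
}
```

For example, `probe_all all compute compute,utility display graphics` would print one `Good` or `Error` line per value for the currently installed toolkit version.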