
Releases: NVIDIA/k8s-device-plugin

v0.15.0-rc.2

18 Mar 11:48

What's Changed

  • Bump CUDA base image version to 12.3.2
  • Add cdi-cri device list strategy. This uses the CDIDevices CRI field to request CDI devices instead of annotations.
  • Set MPS memory limit by device index and not device UUID. This is a workaround for an issue where
    these limits are not applied for devices if set by UUID.
  • Update MPS sharing to disallow requests for multiple devices if MPS sharing is configured.
  • Set MPS device memory limit by index.
  • Explicitly set sharing.mps.failRequestsGreaterThanOne = true.
  • Run tail -f for each MPS daemon to output logs.
  • Enforce replica limits for MPS sharing.
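
The MPS request rule above can be illustrated with a minimal sketch. This is not the plugin's implementation: `MPSSharing`, `validate_request`, and `memory_limits_by_index` are hypothetical names, and the even split of memory across replicas is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MPSSharing:
    """Hypothetical stand-in for the plugin's MPS sharing settings."""
    replicas: int
    # Mirrors sharing.mps.failRequestsGreaterThanOne, set to true in this release.
    fail_requests_greater_than_one: bool = True


def validate_request(requested: int, sharing: Optional[MPSSharing]) -> None:
    """Reject multi-device requests when MPS sharing is configured."""
    if sharing is None:
        return  # No MPS sharing: this check accepts any device count.
    if sharing.fail_requests_greater_than_one and requested > 1:
        raise ValueError(
            f"requested {requested} devices, but MPS sharing allows at most 1"
        )


def memory_limits_by_index(total_bytes: List[int], sharing: MPSSharing) -> Dict[int, int]:
    """Key each per-client memory limit by device index rather than UUID.

    Assumption: memory is split evenly across replicas.
    """
    return {i: total // sharing.replicas for i, total in enumerate(total_bytes)}
```

With `MPSSharing(replicas=4)`, a request for one device passes and a request for two raises `ValueError`. The limits are keyed by position in the device list, so no UUID lookup is involved.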

v0.14.5

29 Feb 10:23
3d549fb

What's Changed

  • Update the nvidia-container-toolkit go dependency. This fixes a bug in CDI spec generation on systems where lib -> usr/lib symlinks exist.
  • Update the CUDA base images to 12.3.2

Full Changelog: v0.14.4...v0.14.5

v0.15.0-rc.1

26 Feb 13:59
Pre-release

What's Changed

  • Import GPU Feature Discovery into the GPU Device Plugin repo. This means that the same version and container image are used for both components.
  • Add tooling to create a kind cluster for local development and testing.
  • Update go-gpuallocator dependency to migrate away from the deprecated gpu-monitoring-tools NVML bindings.
  • Remove legacyDaemonsetAPI config option. This was only required for k8s versions < 1.16.
  • Add support for MPS sharing.
  • Bump CUDA base image version to 12.3.1

Full Changelog: v0.14.0...v0.15.0-rc.1

v0.14.4

29 Jan 14:42
cde1a66

What's Changed

  • Update to refactored go-gpuallocator code. This permanently fixes the NVML_NVLINK_MAX_LINKS value addressed in a
    hotfix in v0.14.3. This also addresses a bug due to uninitialized NVML when calling go-gpuallocator.

Full Changelog: v0.14.3...v0.14.4

v0.14.3

15 Nov 13:09

Bug fixes

  • Patched vendored NVML_NVLINK_MAX_LINKS to 18 to support devices with 18 NVLinks
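
Why a compile-time link cap matters can be shown with a minimal sketch. It assumes that code enumerating NVLinks inspects at most `max_links` entries per device. `count_active_links` is a hypothetical helper, and the smaller cap of 12 is only an illustrative value, not a statement about the previously vendored constant.

```python
from typing import List


def count_active_links(link_states: List[bool], max_links: int) -> int:
    """Count active links, inspecting at most max_links entries (a hard cap)."""
    return sum(1 for active in link_states[:max_links] if active)


# A hypothetical device with 18 active NVLinks.
states = [True] * 18

capped = count_active_links(states, max_links=12)  # links beyond the cap are never seen
full = count_active_links(states, max_links=18)    # every link is inspected
```

With a cap below the real link count, the extra links are silently ignored, which is the class of problem a larger `NVML_NVLINK_MAX_LINKS` avoids.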

Dependency updates

  • Bumped CUDA base images version to 12.3.0

Full Changelog: v0.14.2...v0.14.3

v0.14.2

20 Oct 09:59

This release bumps dependencies.

Dependency Updates

  • Updated CUDA Base Image to 12.2.2
  • Updated GPU Feature Discovery version to v0.8.2

Full Changelog: v0.14.1...v0.14.2

v0.14.1

13 Jul 09:35

This release fixes bugs and bumps dependencies.

Bug fixes

  • Fixed parsing of deviceListStrategy in device plugin config (#410)

Dependency Updates

  • Updated CUDA Base Image to 12.2.0
  • Updated GPU Feature Discovery version to v0.8.1
  • Updated Node Feature Discovery to v0.13.2
  • Updated Go dependencies.

Full Changelog: v0.14.0...v0.14.1

v0.14.0

03 Apr 21:09

Full Changelog: v0.13.0...v0.14.0

Changes

  • Promote v0.14.0-rc.3 to v0.14.0
  • Bumped nvidia-container-toolkit dependency to latest version for newer CDI spec generation code
  • Updated GFD subchart to version v0.8.0

Changes from v0.14.0-rc.3

  • Removed the --cdi-enabled config option. CDI injection is now triggered by the cdi-annotation strategy instead.
  • Bumped go-nvlib dependency to latest version to support new MIG profiles.
  • Added cdi-annotation-prefix config option to control how CDI annotations are generated.
  • Renamed driver-root-ctr-path config option added in v0.14.0-rc.1 to container-driver-root.
  • Updated GFD subchart to version v0.8.0-rc.2

Changes from v0.14.0-rc.2

  • Fixed a bug from v0.14.0-rc.1 when using cdi-enabled=false

Changes from v0.14.0-rc.1

  • Added --cdi-enabled flag to GPU Device Plugin. With this enabled, the device plugin will generate CDI specifications for available NVIDIA devices. Allocation will add CDI annotations (cdi.k8s.io/*) to the response. These are read by a CDI-enabled runtime to make the required modifications to a container being created.
  • Updated GFD subchart to version 0.8.0-rc.1
  • Bumped Golang version to 1.20.1
  • Bumped CUDA base images version to 12.1.0
  • Switched to klog for logging
  • Added a static deployment file for Microshift
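
The CDI annotations mentioned above can be sketched as follows. This is a minimal illustration and not the plugin's code. It assumes the common CDI convention of a `cdi.k8s.io/`-prefixed key whose value is a comma-separated list of fully qualified device names of the form `vendor/class=name`. The function name, the plugin name `example-plugin`, the request ID, and the `nvidia.com/gpu` device kind are all hypothetical.

```python
from typing import Dict, List


def cdi_annotation(
    prefix: str,
    plugin: str,
    request_id: str,
    device_ids: List[str],
    kind: str = "nvidia.com/gpu",  # assumed device kind, for illustration only
) -> Dict[str, str]:
    """Build a single CDI annotation for an allocation response."""
    key = f"{prefix}{plugin}_{request_id}"
    # Each device becomes a fully qualified CDI name: <vendor>/<class>=<name>.
    value = ",".join(f"{kind}={dev}" for dev in device_ids)
    return {key: value}


annotation = cdi_annotation("cdi.k8s.io/", "example-plugin", "abc", ["0", "1"])
```

A CDI-enabled runtime that reads such an annotation would look up each named device in the generated CDI specification and apply the corresponding container edits. Making `prefix` a parameter reflects the cdi-annotation-prefix option listed earlier in these notes.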

Note:

The container image nvcr.io/nvidia/k8s-device-plugin:v0.14.0-ubi8 contains the following high-severity CVEs:

  • CVE-2023-0286 - Vulnerability found in os package type (rpm) - openssl-libs
  • CVE-2023-24329 - Vulnerability found in os package type (rpm) - platform-python and python3-libs

v0.14.0-rc.3

29 Mar 12:56
Pre-release

Full Changelog: v0.14.0-rc.2...v0.14.0-rc.3

Changes

  • Removed the --cdi-enabled config option. CDI injection is now triggered by the cdi-annotation strategy instead.
  • Bumped go-nvlib dependency to latest version to support new MIG profiles.
  • Added cdi-annotation-prefix config option to control how CDI annotations are generated.
  • Renamed driver-root-ctr-path config option added in v0.14.0-rc.1 to container-driver-root.
  • Updated GFD subchart to version v0.8.0-rc.2

v0.14.0-rc.2

20 Mar 23:02
Pre-release

Full Changelog: v0.14.0-rc.1...v0.14.0-rc.2

Changes

  • Fixed a bug from v0.14.0-rc.1 when using cdi-enabled=false