
feat: deploy nri-device-injector and rtx vgpu CCCs #439

Closed
ferrarimarco wants to merge 5 commits into n4-8-ccc from int-nri-device-injector

Conversation

Member

@ferrarimarco ferrarimarco commented May 22, 2026

Configure fractional NVIDIA RTX 6000 Pro custom ComputeClasses.

nri-device-injector is a DaemonSet that dynamically configures GPUs in multi-instance GPU mode by attaching virtual GPUs to the Pods that request them. Its affinity rule restricts deployment to nodes that provide GPUs requiring this configuration:

  • nvidia-h100-80gb
  • nvidia-h100-mega-80gb
  • nvidia-rtx-pro-6000

- Update set_environment_variables.sh scripts to use SCRIPT_SOURCE
  with fallback to $0 for compatibility with Zsh and other shells.
- Replace Bash-specific lowercase conversion with portable 'tr'.
- Rename NVIDIA initialization to NVIDIA NGC initialization in docs.
- Document missing cpu-e2-s-16 in platform resources.
- Add k6-benchmark container image with Dockerfile and entrypoint
- Include metrics extraction utility and dependencies for k6
- Add Google Cloud Build trigger for k6-benchmark image
- Update .gitignore to ignore k6 generated report files
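
The two shell portability changes listed above can be illustrated with a minimal sketch. This is not the repository's actual `set_environment_variables.sh`: the function names `resolve_script_dir` and `to_lower` are hypothetical, and the sketch assumes that `SCRIPT_SOURCE` is set by the caller when the script is sourced, with `$0` used otherwise.

```shell
#!/usr/bin/env sh
# Hypothetical sketch; function names are illustrative, not from the PR.

# Print the absolute directory of the script.
# Assumption: SCRIPT_SOURCE, when set, holds the script path; otherwise
# fall back to $0. The body runs in a subshell (note the parentheses)
# so that cd does not change the caller's working directory.
resolve_script_dir() (
  script_source="${SCRIPT_SOURCE:-$0}"
  cd -- "$(dirname -- "${script_source}")" >/dev/null && pwd -P
)

# Portable lowercase conversion using tr, as a replacement for the
# Bash-specific "${var,,}" expansion, which Zsh and POSIX sh do not support.
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}
```

For example, `to_lower "NVIDIA-H100-80GB"` prints `nvidia-h100-80gb` in Bash, Zsh, and POSIX `sh` alike, whereas `${var,,}` is only available in Bash 4 and later.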
Add an n4-8 custom ComputeClass running n4-standard-8 nodes. Unlike
the existing n4-s-8 custom ComputeClass, n4-8 doesn't fall back to
Spot capacity, making it useful for workloads that don't tolerate
disruptions caused by Spot VM node preemption.
@ferrarimarco
Member Author

You can skip reviewing the RTX CCCs because they were already reviewed in #440.

Comment thread platforms/gke/base/core/workloads/nri_device_injector/project.tf
@ferrarimarco
Member Author

Build failures are unrelated to this change.

@ferrarimarco ferrarimarco changed the title from "feat: deploy nri-device-injector" to "feat: deploy nri-device-injector and rtx vgpu CCCs" May 29, 2026
@ferrarimarco ferrarimarco changed the base branch from int-n4-8-ccc to n4-8-ccc May 29, 2026 08:48
@ferrarimarco
Member Author

Duplicate of #449

@ferrarimarco ferrarimarco marked this as a duplicate of #449 May 29, 2026
3 participants