feat: deploy nri-device-injector and rtx cccs #449

Open: ferrarimarco wants to merge 1 commit into main from rtx-cccs

@ferrarimarco
Member

Configure fractional NVIDIA RTX 6000 Pro custom ComputeClasses.

nri-device-injector is a DaemonSet that dynamically configures GPUs in multi-instance GPU mode by attaching virtual GPUs to the Pods that request them. Its affinity rule restricts it to nodes that provide GPUs that need it:

  • nvidia-h100-80gb
  • nvidia-h100-mega-80gb
  • nvidia-rtx-pro-6000
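
For context, an affinity rule like the one described above would typically sit in the DaemonSet Pod template roughly as follows. This is a minimal sketch, not the manifest from this PR: the accelerator values come from the list above and the `cloud.google.com/gke-accelerator` key from the review discussion, while the use of `requiredDuringSchedulingIgnoredDuringExecution` (rather than a preferred rule) is an assumption based on the "only deploy on" wording.

```yaml
# Sketch of the Pod template affinity; surrounding DaemonSet fields are omitted.
affinity:
  nodeAffinity:
    # Assumption: a hard requirement, so Pods are never scheduled on other nodes.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            # A node matches when its accelerator label equals any listed value.
            - key: cloud.google.com/gke-accelerator
              operator: In
              values:
                - nvidia-h100-80gb
                - nvidia-h100-mega-80gb
                - nvidia-rtx-pro-6000
```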

feat: deploy nvidia rtx pro 6000 cccs (#440)

@fernandorubbo left a comment
Member

My only concern is that the DaemonSet deploys on the following hardware:

              - key: cloud.google.com/gke-accelerator
                operator: In
                values:
                  - nvidia-rtx-pro-6000
                  - nvidia-h100-80gb
                  - nvidia-h100-mega-80gb

That means it will be installed on all A3 and G4 nodes in the cluster, even those that do not have a vGPU.

Comment thread platforms/gke/base/core/workloads/nri_device_injector/main.tf
@ferrarimarco
Member Author

> My only concern is that the DaemonSet deploys on the following hardware:
>
>               - key: cloud.google.com/gke-accelerator
>                 operator: In
>                 values:
>                   - nvidia-rtx-pro-6000
>                   - nvidia-h100-80gb
>                   - nvidia-h100-mega-80gb
>
> That means it will be installed on all A3 and G4 nodes in the cluster, even those that do not have a vGPU.

Correct, although it's a no-op on nodes that don't have a vGPU.
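
If the footprint on nodes without a vGPU ever needed to shrink, one possible approach would be a second match expression in the same term, since all expressions within one `matchExpressions` list must match. The sketch below is illustrative only: the `example.com/vgpu-enabled` label is hypothetical, is not part of this PR, and would have to be applied to the relevant node pools by some other means.

```yaml
# Sketch only: one node selector term with two expressions, which are ANDed.
- matchExpressions:
    - key: cloud.google.com/gke-accelerator
      operator: In
      values:
        - nvidia-rtx-pro-6000
        - nvidia-h100-80gb
        - nvidia-h100-mega-80gb
    # Hypothetical label, not defined by GKE or by this PR.
    - key: example.com/vgpu-enabled
      operator: In
      values:
        - "true"
```

The trade-off is an extra label to maintain on node pools, which may not be worth it while the DaemonSet is a no-op on the other nodes.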

Base automatically changed from n4-8-ccc to main June 1, 2026 10:13
@ferrarimarco
Member Author

Build failures are not related to this change.
