diff --git a/gpu-operator/dra-cds.rst b/gpu-operator/dra-cds.rst
new file mode 100644
index 000000000..0738426aa
--- /dev/null
+++ b/gpu-operator/dra-cds.rst
@@ -0,0 +1,232 @@
+.. license-header
+ SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ SPDX-License-Identifier: Apache-2.0
+
+##########################
+NVIDIA DRA Driver for GPUs
+##########################
+
+.. _dra_docs_compute_domains:
+
+********************************************
+ComputeDomains: Multi-Node NVLink simplified
+********************************************
+
+Motivation
+==========
+
+NVIDIA's `GB200 NVL72 `_ and comparable systems are designed specifically around Multi-Node NVLink (`MNNVL `_) to turn a rack of GPU machines -- each with a small number of GPUs -- into a supercomputer with a large number of GPUs communicating at high bandwidth (1.8 TB/s chip-to-chip, and over `130 TB/s cumulative bandwidth `_ on a GB200 NVL72).
+
+NVIDIA's DRA Driver for GPUs enables MNNVL for Kubernetes workloads by introducing a new concept -- the **ComputeDomain**:
+when a workload requests a ComputeDomain, NVIDIA's DRA Driver for GPUs performs all the heavy lifting required for sharing GPU memory securely via NVLink among all pods that comprise the workload.
+
+.. note::
+
+   Users may appreciate knowing that -- under the hood -- NVIDIA Internode Memory Exchange (`IMEX `_) primitives need to be orchestrated for mapping GPU memory over NVLink.
+
+   A design goal of this DRA driver is to make IMEX, as much as possible, an implementation detail that workload authors and cluster operators do not need to be concerned with: the driver launches and/or reconfigures IMEX daemons and establishes and injects IMEX channels into containers as needed.
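+
+For a first impression, requesting a ComputeDomain boils down to creating a ``ComputeDomain`` resource and referencing its channel claim from the workload pods. The following minimal single-node sketch (resource names are illustrative) requests one IMEX channel and merely lists the injected channel device from within the container:
+
+.. code-block:: yaml
+
+   apiVersion: resource.nvidia.com/v1beta1
+   kind: ComputeDomain
+   metadata:
+     name: imex-channel-injection
+   spec:
+     numNodes: 1
+     channel:
+       resourceClaimTemplate:
+         name: imex-channel-0
+   ---
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: imex-channel-injection
+   spec:
+     containers:
+     - name: ctr
+       image: ubuntu:22.04
+       command: ["bash", "-c"]
+       # the IMEX channel requested via the claim is injected into the container
+       args: ["ls -la /dev/nvidia-caps-imex-channels; trap 'exit 0' TERM; sleep 9999 & wait"]
+       resources:
+         claims:
+         - name: imex-channel-0
+     resourceClaims:
+     - name: imex-channel-0
+       resourceClaimTemplateName: imex-channel-0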
+
+
+.. _dra-docs-cd-guarantees:
+
+Guarantees
+==========
+
+By design, an individual ComputeDomain guarantees:
+
+#. **MNNVL-reachability** between pods that are in the domain.
+#. **secure isolation** from pods that are not in the domain and that run in a different Kubernetes namespace.
+
+In terms of lifetime, a ComputeDomain is ephemeral: its lifetime is bound to the lifetime of the consuming workload.
+In terms of placement, our design choice is that a ComputeDomain follows the workload.
+
+That means that once workload pods requesting a ComputeDomain get scheduled onto specific nodes, the domain automatically forms around them.
+Upon workload completion, all ComputeDomain-associated resources get torn down automatically.
+
+For more detail on the security properties of a ComputeDomain, see :ref:`Security <dra-docs-cd-security>`.
+
+
+A deeper dive: related resources
+================================
+
+For more background on how ComputeDomains facilitate orchestrating MNNVL workloads on Kubernetes, see `this doc `_ and `this slide deck `_.
+For an outlook on planned improvements to the ComputeDomain concept, please refer to `this document `_.
+
+Details about IMEX and its relationship to NVLink may be found in `NVIDIA's IMEX guide `_, and in `NVIDIA's NVLink guide `_.
+CUDA API documentation for `cuMemCreate `_ provides a starting point to learn about how to share GPU memory via IMEX/NVLink.
+If you are looking for a higher-level GPU communication library, note that `NVIDIA's NCCL `_ supports MNNVL in versions newer than 2.25.
+
+
+Usage example: a multi-node nvbandwidth test
+============================================
+
+This example demonstrates how to run an MNNVL workload across multiple nodes using a ComputeDomain (CD).
+As an example CUDA workload that performs MNNVL communication, we picked `nvbandwidth `_.
+Since nvbandwidth requires MPI, below we also install the `Kubeflow MPI Operator `_.
+
+**Steps:**
+
+#. Install the MPI Operator.
+
+ .. code-block:: console
+
+ $ kubectl create -f https://github.com/kubeflow/mpi-operator/releases/download/v0.6.0/mpi-operator.yaml
+
+#. Create a test job file called ``nvbandwidth-test-job.yaml``.
+ To do that, follow `this part of the CD validation instructions `_.
+ This example is configured to run across two nodes, using four GPUs per node.
+   If you want to use different numbers, adjust the parameters in the spec according to the table below (a condensed sketch of where these parameters appear in the manifests follows the table):
+
+ .. list-table::
+ :header-rows: 1
+
+ * - Parameter
+ - Value (in example)
+
+ * - ``ComputeDomain.spec.numNodes``
+ - Total number of nodes to use in the test (2).
+
+ * - ``MPIJob.spec.slotsPerWorker``
+ - Number of GPUs per node to use -- this must match the ``ppr`` number below (4).
+
+ * - ``MPIJob.spec.mpiReplicaSpecs.Worker.replicas``
+ - Also set this to the number of nodes (2).
+
+ * - ``mpirun`` command argument ``-ppr:4:node``
+ - Set this to the number of GPUs to use per node (4)
+
+ * - ``mpirun`` command argument ``-np`` value
+ - Set this to the total number of GPUs in the test (8).
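+
+   For orientation, the following condensed sketch shows roughly where these parameters live in the two manifests. It is not a complete, runnable job spec -- the claim-template name and launcher details are placeholders -- so use the manifest from the linked validation instructions as the source of truth.
+
+   .. code-block:: yaml
+
+      apiVersion: resource.nvidia.com/v1beta1
+      kind: ComputeDomain
+      metadata:
+        name: nvbandwidth-test-compute-domain
+      spec:
+        numNodes: 2                      # total number of nodes in the test
+        channel:
+          resourceClaimTemplate:
+            name: nvbandwidth-test-compute-domain-channel   # placeholder name
+      ---
+      apiVersion: kubeflow.org/v2beta1
+      kind: MPIJob
+      metadata:
+        name: nvbandwidth-test
+      spec:
+        slotsPerWorker: 4                # GPUs per node; must match the ppr value
+        mpiReplicaSpecs:
+          Launcher:
+            replicas: 1
+          Worker:
+            replicas: 2                  # number of nodes
+        # launcher command (abridged): mpirun -np 8 ... with a ppr:4:node mapping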
+
+#. Apply the manifest.
+
+ .. code-block:: console
+
+ $ kubectl apply -f nvbandwidth-test-job.yaml
+
+ *Example Output*
+
+ .. code-block:: output
+
+ computedomain.resource.nvidia.com/nvbandwidth-test-compute-domain configured
+ mpijob.kubeflow.org/nvbandwidth-test configured
+
+#. Verify that the nvbandwidth pods were created.
+
+ .. code-block:: console
+
+ $ kubectl get pods
+
+ *Example Output*
+
+ .. code-block:: output
+
+ NAME READY STATUS RESTARTS AGE
+ nvbandwidth-test-launcher-lzv84 1/1 Running 0 8s
+ nvbandwidth-test-worker-0 1/1 Running 0 15s
+ nvbandwidth-test-worker-1 1/1 Running 0 15s
+
+
+#. Verify that the ComputeDomain pods were created for each node.
+
+ .. code-block:: console
+
+ $ kubectl get pods -n nvidia-dra-driver-gpu -l resource.nvidia.com/computeDomain
+
+ *Example Output*
+
+ .. code-block:: output
+
+ NAME READY STATUS RESTARTS AGE
+ nvbandwidth-test-compute-domain-ht24d-9jhmj 1/1 Running 0 20s
+ nvbandwidth-test-compute-domain-ht24d-rcn2c 1/1 Running 0 20s
+
+#. Verify the nvbandwidth test output.
+
+ .. code-block:: console
+
+ $ kubectl logs --tail=-1 -l job-name=nvbandwidth-test-launcher
+
+ *Example Output*
+
+ .. code-block:: output
+
+ Warning: Permanently added '[nvbandwidth-test-worker-0.nvbandwidth-test.default.svc]:2222' (ECDSA) to the list of known hosts.
+ Warning: Permanently added '[nvbandwidth-test-worker-1.nvbandwidth-test.default.svc]:2222' (ECDSA) to the list of known hosts.
+ [nvbandwidth-test-worker-0:00025] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
+
+ [...]
+
+ [nvbandwidth-test-worker-1:00025] MCW rank 7 bound to socket 0[core 3[hwt 0]]: [./././B/./././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././.][./././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././.]
+ nvbandwidth Version: v0.7
+ Built from Git version: v0.7
+
+ MPI version: Open MPI v4.1.4, package: Debian OpenMPI, ident: 4.1.4, repo rev: v4.1.4, May 26, 2022
+ CUDA Runtime Version: 12080
+ CUDA Driver Version: 12080
+ Driver Version: 570.124.06
+
+ Process 0 (nvbandwidth-test-worker-0): device 0: HGX GB200 (00000008:01:00)
+ Process 1 (nvbandwidth-test-worker-0): device 1: HGX GB200 (00000009:01:00)
+ Process 2 (nvbandwidth-test-worker-0): device 2: HGX GB200 (00000018:01:00)
+ Process 3 (nvbandwidth-test-worker-0): device 3: HGX GB200 (00000019:01:00)
+ Process 4 (nvbandwidth-test-worker-1): device 0: HGX GB200 (00000008:01:00)
+ Process 5 (nvbandwidth-test-worker-1): device 1: HGX GB200 (00000009:01:00)
+ Process 6 (nvbandwidth-test-worker-1): device 2: HGX GB200 (00000018:01:00)
+ Process 7 (nvbandwidth-test-worker-1): device 3: HGX GB200 (00000019:01:00)
+
+ Running multinode_device_to_device_memcpy_read_ce.
+ memcpy CE GPU(row) -> GPU(column) bandwidth (GB/s)
+ 0 1 2 3 4 5 6 7
+ 0 N/A 798.02 798.25 798.02 798.02 797.88 797.73 797.95
+ 1 798.10 N/A 797.80 798.02 798.02 798.25 797.88 798.02
+ 2 797.95 797.95 N/A 797.73 797.80 797.95 797.95 797.65
+ 3 798.10 798.02 797.95 N/A 798.02 798.10 797.88 797.73
+ 4 797.80 798.02 798.02 798.02 N/A 797.95 797.80 798.02
+ 5 797.80 797.95 798.10 798.10 797.95 N/A 797.95 797.88
+ 6 797.73 797.95 798.10 798.02 797.95 797.88 N/A 797.80
+ 7 797.88 798.02 797.95 798.02 797.88 797.95 798.02 N/A
+
+ SUM multinode_device_to_device_memcpy_read_ce 44685.29
+
+ NOTE: The reported results may not reflect the full capabilities of the platform.
+
+#. Clean up.
+
+ .. code-block:: console
+
+ $ kubectl delete -f nvbandwidth-test-job.yaml
+
+.. _dra-docs-cd-security:
+
+Security
+========
+
+As indicated in :ref:`Guarantees <dra-docs-cd-guarantees>`, the ComputeDomain primitive provides a *security boundary*. That deserves a few clarifying remarks.
+
+NVLink enables mapping remote GPU memory so that it can be read from / written to with regular CUDA API calls (as if it were normal, local GPU memory).
+From a security point of view, that raises the question: can any other GPU in the same NVLink partition freely read and mutate another GPU's memory -- or is there an authorization layer in between?
+The answer is that there is such a layer:
+IMEX has been introduced specifically as a means for providing secure isolation between GPUs that are in the same NVLink partition.
+With IMEX, every individual GPU memory export/import operation can be subject to fine-grained access control.
+
+With the following two additional facts in mind, we can now better understand the security guarantee provided by ComputeDomains:
+
+- The ComputeDomain security boundary is implemented with IMEX.
+- A job submitted to Kubernetes namespace ``A`` cannot be part of a ComputeDomain created for namespace ``B``.
+
+
+That is, ComputeDomains (only) promise robust IMEX-based isolation between jobs that are **not** part of the same Kubernetes namespace.
+If a bad actor has access to a Kubernetes namespace, they may be able to mutate ComputeDomains (and, with them, the underlying IMEX primitives) in that namespace.
+That, in turn, may allow for disabling or trivially working around IMEX access control.
+
+
+With ComputeDomains, the overall ambition is that the isolation between jobs in different Kubernetes namespaces is strong enough to responsibly support multi-tenant environments in which compute jobs that conceptually cannot trust each other are "only" separated by the Kubernetes namespace boundary.
+
+
+Additional remarks
+==================
+
+We are planning to extend the ComputeDomain documentation, with a focus on API reference material, known limitations, best practices, and security.
+
+As we iterate on design and implementation, we are particularly interested in your feedback -- please reach out via the issue tracker or discussion forum in the `GitHub repository `_.
diff --git a/gpu-operator/dra-gpus.rst b/gpu-operator/dra-gpus.rst
new file mode 100644
index 000000000..e44178216
--- /dev/null
+++ b/gpu-operator/dra-gpus.rst
@@ -0,0 +1,33 @@
+.. license-header
+ SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ SPDX-License-Identifier: Apache-2.0
+
+##########################
+NVIDIA DRA Driver for GPUs
+##########################
+
+.. _dra_docs_gpus:
+
+**************
+GPU allocation
+**************
+
+Compared to `traditional GPU allocation `_ using coarse-grained count-based requests, the GPU allocation side of this driver enables fine-grained control and powerful features long desired by the community, such as:
+
+#. Controlled sharing of individual GPUs between multiple pods and/or containers.
+#. GPU selection through complex constraints expressed in `CEL `_ (see the sketch after this list).
+#. Dynamic partitioning.
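+
+As a rough illustration of CEL-based selection, a DRA ``ResourceClaim`` can constrain which GPU gets allocated. The sketch below is illustrative only: the device class name ``gpu.nvidia.com`` and the attribute used in the expression are assumptions about what the driver publishes -- inspect ``kubectl get resourceslices -o yaml`` on your cluster for the authoritative attribute list.
+
+.. code-block:: yaml
+
+   apiVersion: resource.k8s.io/v1beta1
+   kind: ResourceClaim
+   metadata:
+     name: single-a100
+   spec:
+     devices:
+       requests:
+       - name: gpu
+         deviceClassName: gpu.nvidia.com          # assumed device class provided by the driver
+         selectors:
+         - cel:
+             # illustrative attribute; use the names actually published in ResourceSlices
+             expression: "device.attributes['gpu.nvidia.com'].productName.contains('A100')"
+
+Pods consume the claim via ``spec.resourceClaims``, and multiple pods referencing the same ``ResourceClaim`` share the allocated device -- the basis for controlled sharing (item 1 above).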
+
+To learn more about this part of the driver and about what we are planning to build in the future, have a look at `these release notes `_.
+
+While the GPU allocation features of this driver can be tried out, they are not yet officially supported.
+Hence, the GPU kubelet plugin is currently disabled by default in the Helm chart installation.
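+
+If you want to experiment with it anyway, the GPU kubelet plugin can be switched back on at install time. Based on the installation example in this documentation, that corresponds to a Helm values override along the following lines (a sketch; run ``helm show values nvidia/nvidia-dra-driver-gpu`` for the authoritative schema):
+
+.. code-block:: yaml
+
+   # values override for the nvidia-dra-driver-gpu chart
+   resources:
+     gpus:
+       enabled: true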
+
+For documentation on how to use and test the current set of GPU allocation features, please head over to the `demo section `_ of the driver's README and to its `quickstart directory `_.
+
+.. note::
+   This part of the NVIDIA DRA Driver for GPUs is in **Technology Preview**.
+   It is not yet supported in production environments and is not functionally complete.
+   Technology Preview features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
+   These releases may not have full documentation, and testing is limited.
+
diff --git a/gpu-operator/dra-intro-install.rst b/gpu-operator/dra-intro-install.rst
new file mode 100644
index 000000000..805d82cf2
--- /dev/null
+++ b/gpu-operator/dra-intro-install.rst
@@ -0,0 +1,110 @@
+.. license-header
+ SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ SPDX-License-Identifier: Apache-2.0
+
+##########################
+NVIDIA DRA Driver for GPUs
+##########################
+
+************
+Introduction
+************
+
+With NVIDIA's DRA Driver for GPUs, your Kubernetes workload can allocate and consume the following two types of resources:
+
+* **GPUs**: for controlled sharing and dynamic reconfiguration of GPUs. A modern replacement for the traditional GPU allocation method (using `NVIDIA's device plugin `_). We are excited about this part of the driver; however, it is not yet fully supported (Technology Preview).
+* **ComputeDomains**: for robust and secure Multi-Node NVLink (MNNVL) for NVIDIA GB200 and similar systems. Fully supported.
+
+A primer on DRA
+===============
+
+Dynamic Resource Allocation (DRA) is a novel concept in Kubernetes for flexibly requesting, configuring, and sharing specialized devices like GPUs.
+DRA puts device configuration and scheduling into the hands of device vendors via drivers like this one.
+For NVIDIA devices, DRA provides two particularly beneficial characteristics:
+
+#. A clean way to allocate **cross-node resources** in Kubernetes (leveraged here for providing NVLink connectivity across pods running on multiple nodes).
+#. Mechanisms to explicitly **share, partition, and reconfigure** devices **on-the-fly** based on user requests (leveraged here for advanced GPU allocation).
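+
+For orientation, the basic consumption pattern is that a claim (or claim template) describes the devices a workload needs, and pods reference it. A minimal sketch -- resource names are illustrative, and the referenced claim template is assumed to exist (created directly by the user or, for example, by a ComputeDomain):
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: dra-example
+   spec:
+     containers:
+     - name: ctr
+       image: ubuntu:22.04
+       resources:
+         claims:
+         - name: my-devices                             # refers to the entry under resourceClaims below
+     resourceClaims:
+     - name: my-devices
+       resourceClaimTemplateName: my-devices-template   # illustrative; must exist in the namespace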
+
+To understand and make best use of NVIDIA's DRA Driver for GPUs, we recommend becoming familiar with DRA by working through the `official documentation `_.
+
+
+The twofold nature of this driver
+=================================
+
+NVIDIA's DRA Driver for GPUs consists of two largely independent subsystems: one manages GPUs, and the other manages ComputeDomains.
+
+Below, you can find instructions for installing both parts or just one of them.
+Additionally, we have prepared two separate documentation chapters that provide more in-depth information on each subsystem:
+
+- :ref:`Documentation for ComputeDomain (MNNVL) support <dra_docs_compute_domains>`
+- :ref:`Documentation for GPU support <dra_docs_gpus>`
+
+
+************
+Installation
+************
+
+Prerequisites
+=============
+
+- Kubernetes v1.32 or newer.
+- DRA and the corresponding API groups must be enabled (`see Kubernetes docs `_); an example kubeadm configuration follows this list.
+- GPU Driver 565 or later.
+- NVIDIA's GPU Operator v25.3.0 or later, installed with CDI enabled (use the ``--set cdi.enabled=true`` command-line argument during ``helm install``). Refer to the GPU Operator `installation documentation `__.
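+
+For example, on clusters bootstrapped with kubeadm, the feature gate and API group can be enabled with a configuration along the following lines (a sketch using the kubeadm ``v1beta4`` configuration format; adapt it to your cluster setup):
+
+.. code-block:: yaml
+
+   apiVersion: kubeadm.k8s.io/v1beta4
+   kind: ClusterConfiguration
+   apiServer:
+     extraArgs:
+     - name: "feature-gates"
+       value: "DynamicResourceAllocation=true"
+     - name: "runtime-config"
+       value: "resource.k8s.io/v1beta1=true"
+   controllerManager:
+     extraArgs:
+     - name: "feature-gates"
+       value: "DynamicResourceAllocation=true"
+   scheduler:
+     extraArgs:
+     - name: "feature-gates"
+       value: "DynamicResourceAllocation=true"
+   ---
+   apiVersion: kubelet.config.k8s.io/v1beta1
+   kind: KubeletConfiguration
+   featureGates:
+     DynamicResourceAllocation: true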
+
+..
+ For convenience, the following example shows how to enable CDI upon GPU Operator installation:
+ .. code-block:: console
+ $ helm install --wait --generate-name \
+ -n gpu-operator --create-namespace \
+ nvidia/gpu-operator \
+ --version=${version} \
+ --set cdi.enabled=true
+
+.. note::
+
+ If you want to use ComputeDomains and a pre-installed NVIDIA GPU Driver:
+
+ - Make sure to have the corresponding ``nvidia-imex-*`` packages installed.
+ - Disable the IMEX systemd service before installing the GPU Operator.
+ - Refer to the `docs on installing the GPU Operator with a pre-installed GPU driver `__.
+
+
+Configure and Helm-install the driver
+=====================================
+
+#. Add the NVIDIA Helm repository:
+
+ .. code-block:: console
+
+ $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
+ && helm repo update
+
+#. Install the driver, providing install-time configuration parameters. Example:
+
+ .. code-block:: console
+
+ $ helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
+ --version="25.3.0" \
+ --create-namespace \
+ --namespace nvidia-dra-driver-gpu \
+ --set nvidiaDriverRoot=/run/nvidia/driver \
+ --set resources.gpus.enabled=false
+
+All install-time configuration parameters can be listed by running ``helm show values nvidia/nvidia-dra-driver-gpu``.
+
+.. note::
+
+ - A common mode of operation for now is to enable only the ComputeDomain subsystem (to have GPUs allocated using the traditional device plugin). The example above achieves that by setting ``resources.gpus.enabled=false``.
+ - Setting ``nvidiaDriverRoot=/run/nvidia/driver`` above expects a GPU Operator-provided GPU driver. That configuration parameter must be changed in case the GPU driver is installed straight on the host (typically at ``/``, which is the default value for ``nvidiaDriverRoot``).
+   - In a future release, NVIDIA's DRA Driver for GPUs will be bundled with the NVIDIA GPU Operator (and will no longer need to be installed as a separate Helm chart).
+
+
+Validate installation
+=====================
+
+We recommend performing validation steps to confirm that your setup works as expected.
+To that end, we have prepared separate documentation:
+
+- `Testing ComputeDomain allocation `_
+- [TODO] Testing GPU allocation
diff --git a/gpu-operator/index.rst b/gpu-operator/index.rst
index 678d1b7f9..4455331c9 100644
--- a/gpu-operator/index.rst
+++ b/gpu-operator/index.rst
@@ -75,4 +75,13 @@
Azure AKS
Google GKE
+.. toctree::
+ :caption: NVIDIA DRA Driver for GPUs
+ :titlesonly:
+ :hidden:
+
+   Introduction & Installation <dra-intro-install>
+   GPUs <dra-gpus>
+   ComputeDomains <dra-cds>
+
.. include:: overview.rst
diff --git a/gpu-operator/manifests/input/dra-compute-domain-crd.yaml b/gpu-operator/manifests/input/dra-compute-domain-crd.yaml
new file mode 100644
index 000000000..924563d74
--- /dev/null
+++ b/gpu-operator/manifests/input/dra-compute-domain-crd.yaml
@@ -0,0 +1,9 @@
+apiVersion: resource.nvidia.com/v1beta1
+kind: ComputeDomain
+metadata:
+ name: imex-channel-injection
+spec:
+ numNodes: 1
+ channel:
+ resourceClaimTemplate:
+ name: imex-channel-0
diff --git a/gpu-operator/manifests/input/imex-channel-injection.yaml b/gpu-operator/manifests/input/imex-channel-injection.yaml
new file mode 100644
index 000000000..f812dd47d
--- /dev/null
+++ b/gpu-operator/manifests/input/imex-channel-injection.yaml
@@ -0,0 +1,28 @@
+---
+apiVersion: resource.nvidia.com/v1beta1
+kind: ComputeDomain
+metadata:
+ name: imex-channel-injection
+spec:
+ numNodes: 1
+ channel:
+ resourceClaimTemplate:
+ name: imex-channel-0
+---
+apiVersion: v1
+kind: Pod
+metadata:
+ name: imex-channel-injection
+spec:
+ containers:
+ - name: ctr
+ image: ubuntu:22.04
+ command: ["bash", "-c"]
+ args: ["ls -la /dev/nvidia-caps-imex-channels; trap 'exit 0' TERM; sleep 9999 & wait"]
+ resources:
+ claims:
+ - name: imex-channel-0
+ resourceClaims:
+ - name: imex-channel-0
+ resourceClaimTemplateName: imex-channel-0
+
diff --git a/gpu-operator/manifests/input/kubeadm-init-config.yaml b/gpu-operator/manifests/input/kubeadm-init-config.yaml
new file mode 100644
index 000000000..e913e464d
--- /dev/null
+++ b/gpu-operator/manifests/input/kubeadm-init-config.yaml
@@ -0,0 +1,21 @@
+apiVersion: kubeadm.k8s.io/v1beta4
+kind: ClusterConfiguration
+apiServer:
+ extraArgs:
+ - name: "feature-gates"
+ value: "DynamicResourceAllocation=true"
+ - name: "runtime-config"
+ value: "resource.k8s.io/v1beta1=true"
+controllerManager:
+ extraArgs:
+ - name: "feature-gates"
+ value: "DynamicResourceAllocation=true"
+scheduler:
+ extraArgs:
+ - name: "feature-gates"
+ value: "DynamicResourceAllocation=true"
+---
+apiVersion: kubelet.config.k8s.io/v1beta1
+kind: KubeletConfiguration
+featureGates:
+ DynamicResourceAllocation: true
diff --git a/repo.toml b/repo.toml
index 8e47ee289..15df2ddf1 100644
--- a/repo.toml
+++ b/repo.toml
@@ -265,4 +265,4 @@ copyright_start = 2024
[repo_docs.projects.secure-services-istio-keycloak.builds.linkcheck]
build_by_default = false
-output_format = "linkcheck"
\ No newline at end of file
+output_format = "linkcheck"