Each of the technologies, time-slicing, Multi-Process Service (MPS), and Multi-Instance GPU (MIG)
+enable sharing a physical GPU with more than one workload.
+
+NVIDIA A100 and newer GPUs provide an operation mode called MIG.
+MIG enables you to partition a GPU into *slices*.
+A slice is a smaller, predefined GPU instance that looks like a
 mini-GPU that provides memory and fault isolation at the hardware layer.
 You can share access to a GPU by running workloads on one of
 these predefined instances instead of the full native GPU.

 MIG support was added to Kubernetes in 2020. Refer to `Supporting MIG in Kubernetes <https://www.google.com/url?q=https://docs.google.com/document/d/1mdgMQ8g7WmaI_XVVRrCvHPFPOMCm5LQD5JefgAh6N8g/edit&sa=D&source=editors&ust=1655578433019961&usg=AOvVaw1F-OezvM-Svwr1lLsdQmu3>`_
 for details on how this works.

-Time-slicing trades the memory and fault-isolation that is provided by MIG
-for the ability to share a GPU by a larger number of users.
+NVIDIA V100 and newer GPUs support MPS.
+MPS enables dividing a physical GPU into *replicas* and assigning workloads to a replica.
+While MIG provides fault isolation in hardware, MPS uses software to divide the GPU into replicas.
+Each replica receives an equal portion of memory and thread percentage.
+For example, if you configure two replicas, each replica has access to 50% of GPU memory and 50% of compute capacity.
+
+Time-slicing is available with all GPUs supported by the Operator.
+Unlike MIG, time-slicing has no special memory or fault-isolation.
+Like MPS, time-slicing uses the term *replica*, however, the GPU is not divided between workloads.
+The GPU performs a context switch and swaps resources on and off the GPU when a workload is scheduled.
+
 Time-slicing also provides a way to provide shared access to a GPU for
 older generation GPUs that do not support MIG.
 However, you can combine MIG and time-slicing to provide shared access to
@@ -234,15 +248,15 @@ The following table describes the key fields in the config map.
 Applying One Cluster-Wide Configuration
 =======================================

-Perform the following steps to configure GPU time-slicing if you already installed the GPU operator
+Perform the following steps to configure GPU time-slicing if you already installed the GPU Operator
 and want to apply the same time-slicing configuration on all nodes in the cluster.

 #. Create a file, such as ``time-slicing-config-all.yaml``, with contents like the following example:
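
The diff context ends before the example file itself. As a rough sketch only, and assuming the config map layout used by the NVIDIA Kubernetes device plugin for time-slicing, such a file might look like the following. The config map name, the ``any`` key, and the replica count of 4 are illustrative assumptions and are not taken from this change.

```yaml
# Hypothetical time-slicing-config-all.yaml (illustrative values, not from this diff)
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config-all   # assumed name, mirrors the file name above
data:
  any: |-                         # assumed key; one configuration for all nodes
    version: v1
    flags:
      migStrategy: none           # assumes MIG is not combined with time-slicing
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4             # each physical GPU is advertised as 4 replicas
```

With a configuration along these lines, each GPU would be advertised as several schedulable replicas, but, as described above, time-slicing gives those replicas no memory or fault isolation from one another.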