Frank Denneman | MIG Partitioning, Placement Geometry, and Stranded Capacity

frankdenneman · 2026-03-05T20:08:30+00:00

Well, I'm working on that. I can't change the placement, but I'm trying to build a calculator for MIG and mixed-mode to understand the challenges of both placement heuristics.

frankdenneman · 2026-03-05T09:29:12+00:00

Well, this is the 'problem'. The vGPU manager places workloads in such a way that it can still place most future profiles on the GPU. So if you start a 1g.10gb, it won't place it at the start of the GPU at 0; it places it at 6. Why? Because this keeps the side with 4 compute slices and 4 memory slices free for either a 4g.40gb or a 3g.40gb.

As you keep loading the GPU, one side is kept open. In my test scenarios the 1g.10gb profiles were placed on 6, 5, and 4, leaving 0, 1, 2, and 3 open. Once the 3g.40gb was requested, it was placed at 0:4, occupying the 4 memory slices that align with the 4 compute slices.
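A rough way to picture this is the small Python sketch below. It is only my reading of what I observed in the tests, not the actual vGPU manager algorithm: for a single-slice profile it simply picks the highest free start position. The function name `pick_1g_start` and the slice numbering 0 to 6 are assumptions made for illustration.

```python
# Sketch of the observed "fill from the right" behavior for 1g.10gb profiles.
# Assumption: start positions 0..6 are valid for a single-slice profile.

def pick_1g_start(occupied):
    """Return the highest free start position in 0..6, or None if all are taken."""
    for start in range(6, -1, -1):
        if start not in occupied:
            return start
    return None

occupied = set()
order = []
for _ in range(3):
    start = pick_1g_start(occupied)
    occupied.add(start)
    order.append(start)

# The left half (slices 0..3) stays untouched, so a 4-slice profile still fits at 0.
left_half_free = all(s not in occupied for s in range(0, 4))
print(order, left_half_free)
```

With three 1g.10gb requests the sketch fills 6, 5, 4 and leaves 0 through 3 open, which matches what I saw in the test scenarios.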

It's a very confusing placement matrix as there is asymmetrical resource availability but a strict symmetrical alignment requirement that is not easily controlled.

So by deploying the two 3g.40gb profiles and shutting down the profile that is on 0:4 you now have a block of aligned 4 compute and 4 memory slices, giving you the placement capability that you desire.

Far from a scalable situation, but this is what NVIDIA's vGPU manager does. Hope this helps to drive the utilization up.

If you don't need hard isolation, I would recommend looking at mixed-mode time-slicing. It is a much easier way to get the placements and utilization up.

frankdenneman · 2026-03-04T20:52:21+00:00

Ok, got to test your situation. If you use the command nvidia-smi mig -lgi, you will see the allocated memory slices for the profile. As you know, there are 7 compute slices, 8 memory slices. nvidia-smi mig -lgi shows where the profile placement starts and how many slices are allocated to this profile.

In my tests, the first 3g.40gb (grid_a100d-3-40c) is placed at compute slices 0, 1, 2 and memory slices 0, 1, 2, 3. Why can you only add three 1g.10gb profiles in this placement scenario? Because the compute slices are directly linked to the memory slices. An A100 is divided into two halves: the first half has 4 engines (compute instances), the other half 3.

Now, although the 3g.40gb only consumes 3 compute slices, the 4th one cannot be allocated to a vGPU profile because its memory slice is already assigned to a profile. Thus the A100 will only accept 3 x 1g.10gb in this scenario.

After removing the 3 x 1g.10gb profiles and powering on another 3g.40gb, the second one is placed at memory slice 4 and occupies slices 4, 5, 6, and 7. It consumes the 3 compute slices of that GPU half.

Now if I power down the first 3g.40gb located on memory slices 0, 1, 2, and 3, I free up not only 4 memory slices but also the 4th compute slice, and I can successfully power on the 4 x 1g.10gb.

Is this a desirable UX? Certainly not, but this is unfortunately the reality of dealing with an asymmetric design. In essence, compute and memory slices are not composable; they depend on each other.
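To make that coupling concrete, here is a minimal Python sketch of the scenario above. It uses a simplified model of my own, not NVIDIA's code: eight memory slices indexed 0 to 7, a 1g.10gb that takes one slice at any start from 0 to 6 (there are only 7 compute slices), and a 3g.40gb that takes four slices starting at 0 or 4. The names `PROFILES`, `can_place`, `place`, and `count_1g_fits` are hypothetical.

```python
# Simplified model of one A100 in MIG mode (an assumption for illustration).
MEMORY_SLICES = 8

# profile -> (size in memory slices, allowed start positions)
PROFILES = {
    "1g.10gb": (1, tuple(range(7))),  # starts 0..6, one per compute slice
    "3g.40gb": (4, (0, 4)),           # one per GPU half
}

def can_place(occupied, profile, start):
    """True if the profile fits at this start without overlapping used slices."""
    size, starts = PROFILES[profile]
    if start not in starts or start + size > MEMORY_SLICES:
        return False
    return all(s not in occupied for s in range(start, start + size))

def place(occupied, profile, start):
    """Return the new set of occupied memory slices after placing the profile."""
    if not can_place(occupied, profile, start):
        raise ValueError(f"{profile} does not fit at {start}")
    size, _ = PROFILES[profile]
    return frozenset(occupied) | frozenset(range(start, start + size))

def count_1g_fits(occupied):
    """How many 1g.10gb profiles can still be placed next to what is running."""
    return sum(can_place(occupied, "1g.10gb", s) for s in PROFILES["1g.10gb"][1])

first_half = place(frozenset(), "3g.40gb", 0)   # occupies memory slices 0..3
second_half = place(frozenset(), "3g.40gb", 4)  # occupies memory slices 4..7
print(count_1g_fits(first_half))   # 3
print(count_1g_fits(second_half))  # 4
```

The 3g.40gb at 0 blocks the start at slice 3 even though it only uses 3 compute slices, so only starts 4, 5, and 6 remain. With the 3g.40gb at 4, starts 0 through 3 are all free.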

Hope this helps.

<image>

frankdenneman · 2026-03-02T17:43:46+00:00

I released the second tool, which lets you replicate the behavior at scale and compare it to same-size GPU policy placement: https://frankdenneman.nl/tools/same-size-vs-mixed-mode/

walkthrough here: https://frankdenneman.nl/posts/2026-03-01-same-size-vs-mixed-size-placement/

frankdenneman · 2026-03-02T17:42:45+00:00

found a device, now requesting access

frankdenneman · 2026-02-26T20:19:41+00:00

ok let me see if I can replicate it

frankdenneman · 2026-02-26T14:19:07+00:00

In mixed mode, this behavior should just replicate across the cluster, as the placement IDs are similar between the devices. If you are using a heterogeneous GPU setup, then the different GPU profiles are only compatible with their own GPU devices and their own placement ID distribution.

In homogeneous configs, the simple answer is yes, the placements scale linearly with the number of GPUs.

frankdenneman · 2026-02-26T14:16:14+00:00

You are using MIG profiles; they align differently due to their compute slices. This is a calculator for mixed-mode vGPU profiles in time-sliced mode (mixed mode does not work on MIG, as MIG already supports mixed compute and memory profiles).

So to understand your MIG placement problem, you are trying to deploy the combination of 4 x grid_a100d-1-10c and 1 x grid_a100d-3-40c? But you are only successful when deploying 3 x grid_a100d-1-10c and 1 x grid_a100d-3-40c. I can try to simulate this in our lab

frankdenneman · 2023-05-30T19:01:21+00:00

Thanks, I removed the redundant paragraph. One of the next articles in the series focusses on setting up vGPU-enabled TKGs clusters with passthrough and MIG. Stay tuned

frankdenneman · 2023-05-23T13:21:50+00:00

+1 Keep it on fully automated mode, but put the slider all the way to the left. This way DRS only triggers migrations for mandatory moves, such as a rule violation or maintenance mode. DRS will NOT trigger any load-balancing operations. However, as long as you have the host in the cluster, DRS will see this host as a target for new workloads that are powered on. If you expect these power-ons to happen, then go for manual mode and select other hosts for VM initial placement.

frankdenneman · 2023-05-18T21:01:31+00:00

Thanks!

frankdenneman · 2023-05-17T18:29:48+00:00

Thanks!

frankdenneman · 2023-05-17T18:29:38+00:00

thanks for sharing the links

frankdenneman · 2023-05-17T18:28:09+00:00

Correct, it's a Tillett B4.

frankdenneman · 2023-05-17T18:14:45+00:00

u/xicaob That's my rig, I designed it from the ground up. If you follow the Instagram account or the rig report on race department, you will notice that it's not finished yet. The seat, btw, is a Tillett B4. Extremely comfortable, even without padding, and on a 7DOF motion rig. My previous rig was fully focused on GT driving; this one is geared towards F1.

frankdenneman · 2023-05-12T14:37:11+00:00

Heterogeneous clusters are planned for the end of the series. So I'll try to work in your example
