Introduction

Many conversations revolve around sizing models to GPUs. In enterprise environments, the question becomes how many of which models can be deployed where.

The Calculator

  • a deliberately simple focus on parameter count, context length, and batch size
  • estimates how many MIG instances on NVIDIA GPUs could serve the model
  • the number of MIG instances determines how many model copies can run concurrently
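The last step, mapping a VRAM requirement onto MIG instances, can be sketched as follows. The profile table lists NVIDIA's published MIG profiles for the A100-80GB; picking the smallest profile that fits is a simplification and an assumption about how the tool works, not its actual logic.

```python
# Sketch: pick the smallest A100-80GB MIG profile that fits a model's
# VRAM requirement, and report how many such instances (and therefore
# how many concurrent model copies) a single GPU can host.

# NVIDIA's published MIG profiles for the A100-80GB:
# (profile name, instance memory in GB, max instances per GPU)
A100_80GB_PROFILES = [
    ("1g.10gb", 10, 7),
    ("2g.20gb", 20, 3),
    ("3g.40gb", 40, 2),
    ("4g.40gb", 40, 1),
    ("7g.80gb", 80, 1),
]

def pick_mig(required_gb, profiles=A100_80GB_PROFILES):
    """Return (profile_name, instance_count) for the smallest profile
    whose memory covers required_gb, or None if the model cannot fit."""
    for name, mem_gb, max_instances in profiles:
        if mem_gb >= required_gb:
            return name, max_instances
    return None

print(pick_mig(16))  # a ~16 GB model lands on 2g.20gb -> 3 copies per GPU
```

Note that the smallest-fit choice maximizes instance count; a real deployment might prefer a larger profile for compute headroom.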

The Code

  • the code is in the embedded page (yes, all of it)
  • feel free to examine and improve!

The Tool

MIG Estimator

Inputs:

  • Parameter count
  • Precision
  • Context length
  • Number of concurrent sessions
  • Target GPU (optional)
  • Number of transformer layers
  • Number of attention heads
  • Head dimension
  • Example presets

Outputs:

  • Amount of VRAM required (GB)
  • GPU name
  • MIG name
  • Number of instances
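The inputs above map onto a standard serving-VRAM estimate: the weights take parameters times bytes-per-value, and the KV cache grows with layers, context length, heads, head dimension, and concurrent sessions. A minimal sketch follows; the 20% runtime overhead factor is my assumption, not necessarily what the tool applies.

```python
# Hedged sketch of a VRAM estimate built from the calculator's inputs.
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def estimate_vram_gb(params_b, precision, context_len, sessions,
                     layers, heads, head_dim, overhead=1.2):
    """Estimate serving VRAM in GB.

    params_b  -- parameter count in billions
    precision -- key into BYTES_PER_VALUE
    overhead  -- fudge factor for activations/runtime (assumption)
    """
    bytes_per = BYTES_PER_VALUE[precision]
    weights = params_b * 1e9 * bytes_per
    # KV cache: two tensors (K and V) per layer, per token, per session
    kv_cache = (2 * layers * context_len * heads * head_dim
                * bytes_per * sessions)
    return (weights + kv_cache) * overhead / 1e9

# Example: a 7B model at fp16, 4096-token context, 4 concurrent sessions,
# 32 layers, 32 heads, head dimension 128
print(round(estimate_vram_gb(7, "fp16", 4096, 4, 32, 32, 128), 1))
```

That result would then feed the MIG-profile lookup to report the GPU name, MIG name, and instance count.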