system

System Keywords

| Keyword | Type | Brief |
|---|---|---|
| max_gpu_memory_mb | std::optional<uint64_t> | Maximum amount of GPU memory to initially allocate in MB. |
| oversubscribe_gpus | bool | Allow multiple processes to use each GPU. |
| teams_per_node | uint32_t | Number of worker teams per node. |
| gpus_per_team | std::optional<uint32_t> | Number of GPUs per worker team. |

max_gpu_memory_mb

Maximum amount of GPU memory to initially allocate per GPU in MB. If you request more memory than you have available, the code will crash.

oversubscribe_gpus

Allow multiple processes to use each GPU. Expert variable.

teams_per_node

Number of worker teams per node.

gpus_per_team

Number of GPUs per worker team.

Example

...
  "system":
  {
    "max_gpu_memory_mb": 24000,
    "teams_per_node": 4,
    "gpus_per_team": 1
  }

...

The above example works on a system with 4 GPUs per node when you are using fragmentation. It creates 4 teams, each with one GPU. Each team receives a set of fragments, and the teams execute their fragments concurrently. This means you can achieve up to a 4x speedup, provided each fragment fits on a single GPU.
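The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration, not EXESS code: the helper names `gpus_requested_per_node`, `fits_on_node`, and `assign_fragments` are hypothetical, the fallback of one GPU per team when `gpus_per_team` is omitted is an assumption of the sketch, and the round-robin split is a stand-in for whatever scheduling EXESS actually uses.

```python
import json

# Hypothetical helpers for illustration only; none of these are part of EXESS.

def gpus_requested_per_node(system: dict) -> int:
    """Number of GPUs the "system" block asks for on each node."""
    teams = system["teams_per_node"]
    # gpus_per_team is optional in the keyword table. Falling back to 1 is an
    # assumption made by this sketch, not a documented default.
    gpus_per_team = system.get("gpus_per_team", 1)
    return teams * gpus_per_team


def fits_on_node(system: dict, gpus_on_node: int) -> bool:
    """True if every team can have its own GPUs without sharing."""
    return gpus_requested_per_node(system) <= gpus_on_node


def assign_fragments(n_fragments: int, n_teams: int) -> list[list[int]]:
    """Split fragment indices across teams.

    Round-robin is an assumed policy chosen to show the idea of each team
    receiving its own set of fragments.
    """
    return [list(range(team, n_fragments, n_teams)) for team in range(n_teams)]


config = json.loads('{"system": {"teams_per_node": 4, "gpus_per_team": 1}}')
system = config["system"]

requested = gpus_requested_per_node(system)
ok_on_four_gpus = fits_on_node(system, gpus_on_node=4)
ok_on_two_gpus = fits_on_node(system, gpus_on_node=2)
work = assign_fragments(10, system["teams_per_node"])

print(requested, ok_on_four_gpus, ok_on_two_gpus)
print(work)
```

With 10 fragments and 4 teams, each team in this sketch ends up with 2 or 3 fragments, which is why the speedup is bounded by the number of teams rather than guaranteed to reach it.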

This setup extends easily to multiple topologies, which results in the “batched mode” of EXESS where