system

System Keywords
Keyword	Type	Brief
max_gpu_memory_mb	std::optional<uint64_t>	Maximum amount of GPU memory to initially allocate per GPU, in MB.
oversubscribe_gpus	bool	Allow multiple processes to use each GPU.
teams_per_node	uint32_t	Number of worker teams per node.
gpus_per_team	std::optional<uint32_t>	Number of GPUs per worker team.

max_gpu_memory_mb

Maximum amount of GPU memory to initially allocate on each GPU, in MB. If you request more memory than the GPU has available, the code will crash.

oversubscribe_gpus

Allow multiple processes to use each GPU. Expert variable.

teams_per_node

Number of worker teams per node.

gpus_per_team

Number of GPUs per worker team.

Example

...
  "system":
  {
    "max_gpu_memory_mb": 24000,
    "teams_per_node": 4,
    "gpus_per_team": 1
  }

...

The above example works on a system with 4 GPUs per node when you are using fragmentation. It creates 4 teams, each with one GPU. Each team receives a set of fragments, and the teams execute their fragments concurrently. This means you can achieve up to a 4x speedup, provided each fragment fits on a single GPU.
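As a quick sanity check before launching a job, the team layout can be compared against the number of GPUs on a node. The sketch below is not part of EXESS: the helper names `gpus_requested_per_node` and `fits_on_node` are hypothetical, and it assumes that teams on a node do not share GPUs, so the layout needs `teams_per_node * gpus_per_team` GPUs per node.

```python
import json


def gpus_requested_per_node(system: dict) -> int:
    """GPUs the layout asks for on one node (assumes teams do not share GPUs)."""
    return int(system["teams_per_node"]) * int(system["gpus_per_team"])


def fits_on_node(system: dict, gpus_per_node: int) -> bool:
    """True if the requested layout fits on a node with gpus_per_node GPUs."""
    return gpus_requested_per_node(system) <= gpus_per_node


# The "system" block from the example above, embedded in a minimal input.
config = json.loads("""
{
  "system": {
    "max_gpu_memory_mb": 24000,
    "teams_per_node": 4,
    "gpus_per_team": 1
  }
}
""")
system = config["system"]

print(gpus_requested_per_node(system))        # 4 teams x 1 GPU each
print(fits_on_node(system, gpus_per_node=4))  # layout fits a 4-GPU node
```

With the same helpers, a layout of 4 teams with 2 GPUs each would be reported as not fitting on a 4-GPU node.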

The same approach extends easily to multiple topologies, resulting in the “batched mode” of EXESS where