system
Keyword | Type | Brief |
---|---|---|
max_gpu_memory_mb | std::optional<uint64_t> | Maximum amount of GPU memory to initially allocate in MB. |
oversubscribe_gpus | bool | Allow multiple processes to use each GPU. |
teams_per_node | uint32_t | Number of worker teams per node. |
gpus_per_team | std::optional<uint32_t> | Number of GPUs per worker team. |
max_gpu_memory_mb
Maximum amount of GPU memory to initially allocate per GPU, in MB. If you request more memory than is available on the GPU, the code will crash.
oversubscribe_gpus
Allow multiple processes to use each GPU. Expert variable.
teams_per_node
Number of worker teams per node.
gpus_per_team
Number of GPUs per worker team.
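Because an oversized memory request crashes the run, it can be useful to sanity-check a "system" block before submitting a job. The following is a minimal sketch, not part of EXESS: `check_system_block` is a hypothetical helper, the hardware figures are supplied by the caller, and the rule that teams_per_node × gpus_per_team should not exceed the GPUs on a node unless oversubscribe_gpus is set is an assumption made for illustration.

```python
def check_system_block(system, gpus_per_node, gpu_memory_mb):
    """Return a list of human-readable problems found in a "system" block.

    Hypothetical helper, not part of EXESS. gpus_per_node and gpu_memory_mb
    describe the hardware and must be supplied by the caller.
    """
    problems = []

    # max_gpu_memory_mb is optional; only check it when it is present.
    requested = system.get("max_gpu_memory_mb")
    if requested is not None and int(requested) > gpu_memory_mb:
        problems.append(
            f"max_gpu_memory_mb={requested} exceeds the "
            f"{gpu_memory_mb} MB available on each GPU"
        )

    # Assumed rule: without oversubscribe_gpus, teams must not share GPUs.
    teams = system.get("teams_per_node")
    per_team = system.get("gpus_per_team")
    if teams is not None and per_team is not None:
        needed = teams * per_team
        if needed > gpus_per_node and not system.get("oversubscribe_gpus", False):
            problems.append(
                f"{teams} teams x {per_team} GPUs = {needed} GPUs, "
                f"but the node has only {gpus_per_node}"
            )
    return problems
```

Calling `check_system_block(system, gpus_per_node=4, gpu_memory_mb=40000)` returns an empty list when nothing looks wrong under these assumptions.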
Example
...
"system":
{
    "max_gpu_memory_mb": 24000,
    "teams_per_node": 4,
    "gpus_per_team": 1
}
...
The example above assumes a system with 4 GPUs per node and a calculation that uses fragmentation. It creates 4 teams with one GPU each; each team receives a set of fragments, and the teams execute their fragments concurrently. You can therefore achieve up to a 4 times speedup, provided each fragment fits on a single GPU.
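To see where the factor of 4 comes from, the toy model below hands fragments to teams and compares the wall time against a single team. It is only a sketch: the greedy "next free team takes the next fragment" policy is an assumption chosen for illustration, not a description of how EXESS schedules fragments, and the fragment costs are made-up numbers.

```python
import heapq


def simulate_teams(fragment_costs, num_teams):
    """Greedy schedule: each fragment goes to the team that frees up first.

    Returns (per-team fragment index lists, wall time). Costs are in
    arbitrary time units. Illustrative model only.
    """
    # Heap of (time at which the team becomes free, team index).
    free_at = [(0.0, team) for team in range(num_teams)]
    heapq.heapify(free_at)
    assigned = [[] for _ in range(num_teams)]

    for frag, cost in enumerate(fragment_costs):
        t, team = heapq.heappop(free_at)
        assigned[team].append(frag)
        heapq.heappush(free_at, (t + cost, team))

    # The run finishes when the last team finishes.
    wall_time = max(t for t, _ in free_at)
    return assigned, wall_time


costs = [1.0] * 8  # eight equally expensive fragments (made-up data)
_, serial = simulate_teams(costs, 1)
teams, parallel = simulate_teams(costs, 4)
print(teams)              # which fragments each of the 4 teams ran
print(serial / parallel)  # speedup for this toy workload
```

With equally sized fragments the model reaches the full factor of 4. If one fragment dominates the cost, the speedup is lower, because the other teams sit idle while the largest fragment finishes.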
This is easily extended to multiple topologies, which results in the “batched mode” of EXESS where