.. _system:

---------------------
system
---------------------

.. list-table:: System Keywords
   :header-rows: 1

   * - Keyword
     - Type
     - Brief
   * - :ref:`max_gpu_memory_mb`
     - std::optional
     - Maximum amount of GPU memory to initially allocate in MB.
   * - :ref:`oversubscribe_gpus`
     - bool
     - Allow multiple processes to use each GPU.
   * - :ref:`teams_per_node`
     - uint32_t
     - Number of worker teams per node.
   * - :ref:`gpus_per_team`
     - std::optional
     - Number of GPUs per worker team.

.. _max_gpu_memory_mb:

max_gpu_memory_mb
^^^^^^^^^^^^^^^^^

Maximum amount of GPU memory to initially allocate per GPU, in MB. If you request more memory than is available, the code will crash.

.. _oversubscribe_gpus:

oversubscribe_gpus
^^^^^^^^^^^^^^^^^^

Allow multiple processes to use each GPU. This is an expert variable.

.. _teams_per_node:

teams_per_node
^^^^^^^^^^^^^^

Number of worker teams per node.

.. _gpus_per_team:

gpus_per_team
^^^^^^^^^^^^^

Number of GPUs per worker team.

Example
^^^^^^^

.. code-block:: JSON

    ...
    "system": {
        "max_gpu_memory_mb": "24000",
        "teams_per_node": 4,
        "gpus_per_team": 1
    }
    ...

The above example works on a system with 4 GPUs per node when you are using fragmentation. It creates 4 teams, each with one GPU; each team receives a set of fragments and executes them concurrently. This means that you can achieve a speedup of up to 4x if your fragments fit on a single GPU. This is easily extended to other topologies, which results in the "batched mode" of EXESS where