.. _tldr:

Examples
======================


Input format
-------------

The JSON format that EXESS supports is partly based on the QCSchema by MolSSI, with some key changes. 
In the MolSSI schema, the main group that describes the molecule is called `"molecule"`, whereas 
in EXESS it is called a `"topology"`. The main input file format is designed to hold more than one 
topology, all sharing the same keywords. That is, `"topologies"` is an array of topology objects. 

The advantage of using a JSON format is that information is easy to extract from an input file 
or an output file via simple Python JSON parsing. 
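
As a minimal sketch of that kind of extraction, the snippet below parses an input 
document with Python's standard json module and lists the keys of each topology. 
The input text is embedded as a string so the example is self-contained (in practice it 
would be read from your input file), and it uses the "xyz" key described later in this section.

.. code-block:: python

  import json

  # Inline copy of a two-topology input so the example is self-contained;
  # in practice this text would come from your input file.
  input_text = """
  {
    "topologies": [
      {"xyz": "/path/to/your/water_1.xyz"},
      {"xyz": "/path/to/your/water_2.xyz"}
    ]
  }
  """

  data = json.loads(input_text)
  topologies = data["topologies"]
  print("number of topologies:", len(topologies))
  for index, topology in enumerate(topologies):
      # Each topology is a plain dictionary; list the keys it defines.
      print(index, sorted(topology))
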

XYZ is the ubiquitous geometry format for quantum chemistry (QC) calculations, and EXESS supports reading from XYZ files.
For example, here is the XYZ file for water: 

.. code-block::

    3

    O -1.1570 0.0630 -0.4817
    H -1.1570 0.8180 -1.0766
    H -1.1570 -0.6920 -1.0766


This file can be loaded into the `"topologies"` array like this:

.. code-block:: JSON 

  {
    "topologies": [
    {
      "xyz": "/path/to/your/water.xyz"
    }
    ]
  }

This is easily extended to multiple topologies, which results in the "batched mode" of EXESS: 
multiple topologies (in this case water_1 and water_2) are run with the same configuration.

.. code-block:: JSON 

  {
    "topologies": [
      {
        "xyz": "/path/to/your/water_1.xyz"
      },
      {
        "xyz": "/path/to/your/water_2.xyz"
      }
    ]
  }
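
For many structures, writing the "topologies" array by hand gets tedious. The sketch 
below builds the same batched layout from a list of XYZ paths. The helper name 
batched_input is hypothetical (it is not part of EXESS or parley.py), and only the 
"topologies" and "xyz" keys shown above are used.

.. code-block:: python

  import json

  def batched_input(xyz_paths):
      """Return an input dictionary with one topology entry per XYZ path.

      Hypothetical helper for illustration; not part of EXESS or parley.py.
      """
      return {"topologies": [{"xyz": str(path)} for path in xyz_paths]}

  doc = batched_input([
      "/path/to/your/water_1.xyz",
      "/path/to/your/water_2.xyz",
  ])
  print(json.dumps(doc, indent=2))
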


You can use the parley.py tool found in https://github.com/JorgeG94/parley_exess to transform 
xyz to the EXESS json format. The parley.py tool can be used as follows:

.. code-block:: BASH 

  usage: parley.py [-h] [--input_format {xyz,json}] [--output_format {json,xyz}] --input_file INPUT_FILE [--output_file OUTPUT_FILE]
                 [--basis_set BASIS_SET] [--aux_basis_set AUX_BASIS_SET] [--driver DRIVER] [--method METHOD]

  Convert between XYZ and JSON formats.

  optional arguments:
    -h, --help            show this help message and exit
    --input_format {xyz,json}
                        Input file format (default: xyz)
    --output_format {json,xyz}
                        Output file format (default: json)
    --input_file INPUT_FILE
                        Input file name
    --output_file OUTPUT_FILE
                        Output file name (default: input_file with appropriate suffix)
    --basis_set BASIS_SET
                        Basis set to use (default: 6-31G)
    --aux_basis_set AUX_BASIS_SET
                        Auxiliary basis set (default: none)
    --driver DRIVER       Driver to use (default: Energy) (options: Energy, Gradient, Dynamics, Optimization)
    --method METHOD       Method to use (default: RestrictedHF) (options: RestrictedHF and RestrictedRIMP2)

The default is to go from xyz to json, so the input and output format flags are not needed unless you are going 
from json to xyz. The parley tool does not check the validity of the basis set choice; please consult
the :ref:`supported_basis_sets` list to make sure that what you're doing is allowed! When selecting Dynamics or 
Optimization, parley.py will add minimal default keywords for a successful run. 
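
To illustrate what such a conversion involves, here is a minimal sketch that turns 
XYZ text into a topology with "symbols" and a flat "geometry" array, the layout used by 
the inline examples later on this page. This is not parley.py's code; the function name 
xyz_to_topology is hypothetical, and the coordinates are copied unchanged on the 
assumption that both formats use the same units.

.. code-block:: python

  def xyz_to_topology(xyz_text):
      """Convert XYZ-format text into a dict with "symbols" and flat "geometry".

      Hypothetical helper for illustration; coordinates are copied as-is.
      """
      lines = xyz_text.strip("\n").splitlines()
      n_atoms = int(lines[0].split()[0])
      # Line 2 of an XYZ file is a free-form comment; atoms start on line 3.
      atom_lines = lines[2:2 + n_atoms]
      if len(atom_lines) != n_atoms:
          raise ValueError("XYZ header announces more atoms than the file contains")
      symbols, geometry = [], []
      for line in atom_lines:
          symbol, x, y, z = line.split()[:4]
          symbols.append(symbol)
          geometry.extend([float(x), float(y), float(z)])
      return {"symbols": symbols, "geometry": geometry}

  water_xyz = """3

  O -1.1570 0.0630 -0.4817
  H -1.1570 0.8180 -1.0766
  H -1.1570 -0.6920 -1.0766
  """

  topology = xyz_to_topology(water_xyz)
  print(topology["symbols"])
  print(topology["geometry"])
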

.. _running:

How to run EXESS
----------------

This depends on your EXESS installation and the system you are running on. For simple single-node calculations, 
the `runexess` executable is bundled within the EXESS PATH variables. `runexess` works with the MPI library of 
your system to execute a calculation. To run, simply use:

.. code-block:: BASH 

  module load exess
  runexess your_input_file.json -g NGPUS 

Here NGPUS is the number of GPUs you want EXESS to use. If `-g NGPUS` is not specified, the script detects how many 
GPUs are available and uses all of them. If needed, run `runexess --help` to see a useful help message. 

For multinode calculations, your run script will look different depending on which MPI library is available 
and whether there is a scheduler, such as Slurm. Since the only EXESS calculations that run on more than one node 
are fragment-based calculations, here is an example of how to submit one.

.. code-block:: BASH 

  SCHEDULER=slurm 
  #NNODES 
  #NGPUS_PER_NODE 
  
  # the number of tasks per node is NTASKS_PER_NODE=NGPUS_PER_NODE + 2 
  # the total number of tasks to launch is NTASKS=NTASKS_PER_NODE * NNODES
  # for a sample of 10 nodes, and 4 GPUs per node:
  module load exess
  srun --nodes=10 --ntasks=60 --ntasks-per-node=6 --gpus-per-node=4 exess input.json 


Now, for a harder example. For fragmentation calculations, we can allocate one fragment per GPU, or even one fragment across two GPUs. 
This is controlled via the variables in :ref:`system`, specifically "teams_per_node" and "gpus_per_team". 

In short, one team is allocated to a single fragment. Each team can use from one to NGPUS_PER_NODE GPUs. For example, teams_per_node set to 1 and 
gpus_per_team set to 4 is what we launched above. Now, suppose we want to use a machine with 8 GPUs per node and allocate 
8 teams per node, with one GPU per team:

.. code-block:: BASH 

  SCHEDULER=slurm 
  
  NTEAMS_PER_NODE=8
  NGPUS_PER_TEAM=1

  # the number of tasks per node is (1 + (NGPUS_PER_TEAM + 1) * NTEAMS_PER_NODE)

  # for a calculation that wants to use 8 teams per node, this leads to:
  module load exess
  srun --nodes=4092 --ntasks=69564 --ntasks-per-node=17 --gpus-per-node=8 exess input.json 
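
The task counts in the two srun examples above follow directly from the formula in the 
comments. As a quick check, this short sketch reproduces them; the function names are 
hypothetical and exist only for illustration.

.. code-block:: python

  def tasks_per_node(teams_per_node, gpus_per_team):
      """Tasks per node: 1 + (gpus_per_team + 1) * teams_per_node."""
      return 1 + (gpus_per_team + 1) * teams_per_node

  def total_tasks(n_nodes, teams_per_node, gpus_per_team):
      """Total number of tasks to launch across all nodes."""
      return n_nodes * tasks_per_node(teams_per_node, gpus_per_team)

  # 10 nodes, 1 team per node, 4 GPUs per team
  print(tasks_per_node(1, 4), total_tasks(10, 1, 4))    # 6 60
  # 4092 nodes, 8 teams per node, 1 GPU per team
  print(tasks_per_node(8, 1), total_tasks(4092, 8, 1))  # 17 69564
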


If your cluster does not use SLURM to launch calculations, you will have to rely on your `mpirun` process launcher. This is simple:

.. code-block:: BASH 

  SCHEDULER=PBS
  
  NNODES=3
  NTEAMS_PER_NODE=1
  NGPUS_PER_TEAM=4

  # the number of tasks per node is (1 + (NGPUS_PER_TEAM + 1) * NTEAMS_PER_NODE)
  # nprocs_per_node = (1 + (4 + 1) * 1) = 6
  # total_nprocs = NNODES * 6 = 18
  nprocs_per_node=$(( 1 + (NGPUS_PER_TEAM + 1) * NTEAMS_PER_NODE ))
  total_nprocs=$(( NNODES * nprocs_per_node ))
  input=input.json
  
  module load exess
  mpirun -np ${total_nprocs} --bind-to core --map-by ppr:${nprocs_per_node}:node exess ${input}


Example RHF/RI-MP2 energy 
--------------------------

The most basic calculation you can perform with EXESS is a 
Hartree-Fock energy calculation, with an optional RI-MP2 correction. 
Below is a sample input file for one water molecule that performs 
an RHF calculation using the cc-pVDZ basis set. 

.. code-block:: JSON 
  
      {
        "topologies": [
          {
            "geometry": [
              -4.3997, 1.0764, 7.7009,
              -3.9597, 0.2664, 7.3109,
              -4.4297, 0.9964, 8.7009
            ],
            "symbols": ["O", "H", "H"]
          }
        ],
        "driver": "Energy",
        "model": {
          "method": "RestrictedHF",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT"
        },
        "keywords": {
          "scf": {
            "max_iters": 100,
            "max_diis_history_length": 8,
            "convergence_threshold": 1e-10,
            "density_threshold": 1e-10,
            "convergence_metric": "Energy"
          }
        }
      }
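
In this inline form, "geometry" is a flat array. The sketch below pairs each entry of 
"symbols" with its coordinates, assuming (as the layout of the example suggests) that the 
array holds one x, y, z triple per atom in the same order as "symbols". The helper name 
atoms_from_topology is hypothetical.

.. code-block:: python

  def atoms_from_topology(topology):
      """Pair each symbol with its (x, y, z) triple from the flat geometry.

      Assumes three consecutive geometry entries per atom, in symbol order.
      """
      symbols = topology["symbols"]
      geometry = topology["geometry"]
      if len(geometry) != 3 * len(symbols):
          raise ValueError("geometry must hold exactly 3 values per symbol")
      return [
          (symbol, tuple(geometry[3 * i:3 * i + 3]))
          for i, symbol in enumerate(symbols)
      ]

  water = {
      "geometry": [
          -4.3997, 1.0764, 7.7009,
          -3.9597, 0.2664, 7.3109,
          -4.4297, 0.9964, 8.7009,
      ],
      "symbols": ["O", "H", "H"],
  }

  for symbol, xyz in atoms_from_topology(water):
      print(symbol, xyz)
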

To request RI-MP2 on top of the RHF energy, simply change "method" in the "model" 
section to "RestrictedRIMP2". Be sure to include 
the auxiliary basis set, as in the example. If the calculation does not use 
an auxiliary basis set, the field is ignored. 

.. code-block:: JSON 
  
        "model": {
          "method": "RestrictedRIMP2",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT"
        }


Example gradient calculation
----------------------------

To compute a gradient, simply change the "driver" to "Gradient", as in:

.. code-block:: JSON 
  
        "driver": "Gradient",
        "model": {
          "method": "RestrictedHF",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT"
        }
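
Because these variants differ only in "driver" and "method", they can be derived from one 
base input in a script. The following is a minimal sketch; the helper name 
with_driver_and_method is hypothetical, and only keys and values shown in the examples 
above are used.

.. code-block:: python

  import copy

  def with_driver_and_method(base_input, driver=None, method=None):
      """Return a copy of base_input with "driver" and/or "method" replaced."""
      new_input = copy.deepcopy(base_input)  # leave the base input untouched
      if driver is not None:
          new_input["driver"] = driver
      if method is not None:
          new_input["model"]["method"] = method
      return new_input

  base = {
      "driver": "Energy",
      "model": {
          "method": "RestrictedHF",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT",
      },
  }

  rimp2_energy = with_driver_and_method(base, method="RestrictedRIMP2")
  rhf_gradient = with_driver_and_method(base, driver="Gradient")
  print(rimp2_energy["driver"], rimp2_energy["model"]["method"])
  print(rhf_gradient["driver"], rhf_gradient["model"]["method"])
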

How to use RI-HF
----------------

EXESS can also apply the RI approximation to the HF method. 
This is particularly useful for small systems. See the RI-HF section for
further information. Note that RI-HF incurs higher memory
use, so very large systems might not be feasible with it: you 
are constrained by how much memory your GPUs have and how many GPUs 
you have in a node. An example is given below:

.. code-block:: JSON 
  
      {
        "topologies": [
          {
            "geometry": [
              -4.3997, 1.0764, 7.7009,
              -3.9597, 0.2664, 7.3109,
              -4.4297, 0.9964, 8.7009
            ],
            "symbols": ["O", "H", "H"]
          }
        ],
        "driver": "Energy",
        "model": {
          "method": "RestrictedRIHF",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT"
        },
        "keywords": {
          "scf": {
            "max_iters": 100,
            "fock_build_type": "RI",
            "max_diis_history_length": 8,
            "convergence_threshold": 1e-10,
            "density_threshold": 1e-10,
            "convergence_metric": "Energy"
          }
        }
      }


Example MBE calculation
-----------------------

Here is a sample input for a fragmentation calculation that computes the 
HF energy of 5 water molecules, one per fragment, at the dimer level. 
The fields are explained below the JSON. 

.. code-block:: JSON 

      {
        "topologies": [
          {
            "geometry": [
              0.0, -1.207, 1.207,
              -0.365, -2.012, 1.575,
              0.045, -1.364, 0.264,
              0.232, 1.076, -1.308,
              0.828, 1.353, -2.003,
              -0.599, 1.509, -1.506,
              -2.274, 1.303, 0.971,
              -1.737, 1.624, 1.696,
              -1.829, 0.513, 0.664,
              -2.272, -1.224, -1.322,
              -2.136, -0.414, -1.813,
              -2.265, -0.961, -0.402,
              0.935, 1.982, 1.885,
              1.347, 1.122, 1.814,
              1.289, 2.361, 2.689
            ],
            "symbols": [
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H"
            ],
            "fragments": [
              [ 0, 1, 2 ], [ 3,4,5 ],[ 6,7,8 ], [9,10,11 ], [12,13,14]
            ],
            "connectivity": [],
            "fragment_formal_charges": [0, 0, 0, 0, 0]
          }
        ],
        "model": {
          "method": "RestrictedHF",
          "basis": "STO-3G",
        },
        "keywords": {
          "scf": {
            "max_iters": 40,
            "max_diis_history_length": 12,
            "convergence_threshold": 1e-6,
            "convergence_metric": "Energy"
          },
          "frag": {
            "cutoff_type": "Centroid",
            "level": "Dimer",
            "cutoffs": {
              "dimer": 40,
              "trimer": 30
            }
          }
        },
        "driver": "Energy"
      }

Inside the "topologies" group, we now have a set of "fragments" and 
"fragment_formal_charges". The "fragments" group is an array of arrays, 
where each sub-array contains the 0-indexed integers identifying atoms 
in the "symbols" group. For example, the following snippet states that 
fragment 0 is made up of atoms 0, 1, and 2. The charges are an array
of size number_of_fragments that contains the formal charge of each 
fragment. 

.. code-block:: JSON 
 
      "fragments": [
        [ 0, 1, 2 ], [ 3,4,5 ],[ 6,7,8 ], [9,10,11 ], [12,13,14]
      ],
      "fragment_formal_charges": [0, 0, 0, 0, 0]

Then, the "frag" group contains the options for the fragmentation calculation. 
The following keywords are relevant here:

- "cutoff_type": Centroid or ClosestPair
- "level": Dimer, Trimer, or Tetramer
- "cutoffs": a subgroup that sets the distance, in Angstroms, beyond which an n-mer will not be calculated

.. code-block:: JSON 

          "frag": {
            "cutoff_type": "Centroid",
            "level": "Dimer",
            "cutoffs": {
              "dimer": 40,
              "trimer": 30
            }
          }
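
To get a feel for how a "Centroid" dimer cutoff selects fragment pairs, here is a rough 
sketch applied to the five-water geometry above. It makes two assumptions that are not 
taken from the EXESS documentation: the centroid is the unweighted mean of the atomic 
positions, and a dimer is kept when its centroid distance is less than or equal to the 
cutoff. The function names are hypothetical.

.. code-block:: python

  import itertools
  import math

  def centroid(geometry, fragment):
      """Unweighted mean position of the atoms in one fragment (assumption)."""
      points = [geometry[3 * a:3 * a + 3] for a in fragment]
      return [sum(p[k] for p in points) / len(points) for k in range(3)]

  def dimers_within_cutoff(geometry, fragments, cutoff):
      """Fragment index pairs whose centroid distance is <= cutoff (Angstroms)."""
      centroids = [centroid(geometry, f) for f in fragments]
      kept = []
      for i, j in itertools.combinations(range(len(fragments)), 2):
          if math.dist(centroids[i], centroids[j]) <= cutoff:
              kept.append((i, j))
      return kept

  geometry = [
      0.0, -1.207, 1.207, -0.365, -2.012, 1.575, 0.045, -1.364, 0.264,
      0.232, 1.076, -1.308, 0.828, 1.353, -2.003, -0.599, 1.509, -1.506,
      -2.274, 1.303, 0.971, -1.737, 1.624, 1.696, -1.829, 0.513, 0.664,
      -2.272, -1.224, -1.322, -2.136, -0.414, -1.813, -2.265, -0.961, -0.402,
      0.935, 1.982, 1.885, 1.347, 1.122, 1.814, 1.289, 2.361, 2.689,
  ]
  fragments = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13, 14]]

  pairs = dimers_within_cutoff(geometry, fragments, cutoff=40.0)
  print(len(pairs), "dimers within the cutoff")

With a cutoff as generous as 40 Angstroms, every pair in a cluster this small is kept; 
a tighter cutoff would start to drop the more distant pairs.
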

Example MBE lattice energy calculation
---------------------------------------

A lattice energy calculation, which we also call an interaction energy 
calculation in-house, is just a simplified MBE job. The difference is that a 
reference fragment is specified in the input file; the interaction of the reference 
fragment with respect to the entire system is then calculated. To enable this, simply 
add the "reference_fragment" keyword (a zero-indexed fragment number) inside "frag".

.. code-block:: JSON 

          "frag": {
            "reference_fragment": 0,
            "cutoff_type": "Centroid",
            "level": "Dimer",
            "cutoffs": {
              "dimer": 40,
              "trimer": 30
            }
          }


Example Ab initio MD calculation
--------------------------------

EXESS can perform ab initio MD (AIMD) calculations using 
any of its gradients as the engine, i.e. RHF, RI-HF, or RI-MP2. This 
capability is available with and without fragmentation. To enable it, 
the following changes are required in your input file:

.. code-block:: JSON 

      {
        "topologies": [
          {
            "geometry": [
              0.0, -1.207, 1.207,
              -0.365, -2.012, 1.575,
              0.045, -1.364, 0.264,
              0.232, 1.076, -1.308,
              0.828, 1.353, -2.003,
              -0.599, 1.509, -1.506,
              -2.274, 1.303, 0.971,
              -1.737, 1.624, 1.696,
              -1.829, 0.513, 0.664,
              -2.272, -1.224, -1.322,
              -2.136, -0.414, -1.813,
              -2.265, -0.961, -0.402,
              0.935, 1.982, 1.885,
              1.347, 1.122, 1.814,
              1.289, 2.361, 2.689
            ],
            "symbols": [
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H",
              "O",
              "H",
              "H"
            ],
            "fragments": [
              [ 0, 1, 2 ], [ 3,4,5 ],[ 6,7,8 ], [9,10,11 ], [12,13,14]
            ],
            "connectivity": [],
            "fragment_formal_charges": [0, 0, 0, 0, 0]
          }
        ],
        "model": {
          "method": "RestrictedRIMP2",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT"
        },
        "keywords": {
          "scf": {
            "max_iters": 40,
            "max_diis_history_length": 12,
            "convergence_threshold": 1e-6,
            "convergence_metric": "Energy"
          },
          "frag": {
            "cutoff_type": "Centroid",
            "level": "Dimer",
            "cutoffs": {
              "dimer": 40,
              "trimer": 30
            },
            "dynamics": {
              "n_timesteps": 10,
              "dt": 0.002
            }
          }
        },
        "driver": "Dynamics"
      }


The key changes are switching the "driver" keyword to "Dynamics" and 
adding the "dynamics" keywords, which specify the number of timesteps 
and the timestep size in femtoseconds.

.. _geo_opt_example:

Example geometry optimization calculation 
------------------------------------------

This is a simple job for a standard geometry optimization at the RI-MP2 
level of theory. The relevant values are discussed below the input file.

.. code-block:: JSON 

      {
        "topologies": [
          {
            "geometry": [
              -0.75, 0.452, -0.417, -0.696, -0.588, 0.609, 0.82, -0.678, 0.537, 0.892,
              0.417, -0.428, -1.285, 1.273, 0.066, -1.328, 0.08, -1.263, -1.225,
              -1.507, 0.366, -1.029, -0.162, 1.555, 1.158, -1.594, 0.054, 1.31,
              -0.477, 1.488, 1.432, 0.009, -1.29, 1.506, 1.236, -0.056
            ],
            "symbols": ["C", "C", "C", "C", "H", "H", "H", "H", "H", "H", "H", "H"]
          }
        ],
        "driver": "Optimization",
        "model": {
          "method": "RestrictedRIMP2",
          "basis": "cc-pVDZ",
          "aux_basis": "cc-pVDZ-RIFIT",
          "standard_orientation": "None"
        },
        "keywords": {
          "scf": {
            "fock_build_type": "RI",
            "max_iters": 100,
            "max_diis_history_length": 8,
            "convergence_threshold": 1e-10,
            "convergence_metric": "DIIS"
          },
          "optimization": {
            "max_iters": 100,
            "coordinate_system": "DelocalisedInternal"
          }
        }
      }