keywords

The keywords group contains the most important controls for the overall calculation. The keywords contained within this group are:

Keyword Group Descriptions

| Keyword | Brief |
| --- | --- |
| scf | Controls SCF behaviour. See scf |
| frag | Controls fragmentation behaviour. See frag |
| guess | Controls guess behaviour. See guess |
| optimization | Controls optimization behaviour. See optimization |
| dynamics | Controls AIMD behaviour. See dynamics |
| boundary | Controls periodic boundary condition behaviour. See boundary |
| ff | Controls classical force fields. See ff |
| log | Controls the logging of the program. See log |
| export | Controls what quantities are exported to HDF5 files. See export |
| rtat | Controls the runtime autotune of matrices. See rtat |
| debug | Controls debug settings. See debug |

scf

SCF Keyword Details

| Keyword | Type | Brief |
| --- | --- | --- |
| max_iters | Int | Max number of SCF iterations (default: 30) |
| max_diis_history_length | Int | Max size of DIIS window (default: 8) |
| batch_size | Int | Number of integral shell pairs per batch (default: 2560) |
| convergence_metric | String | Metric to determine convergence. Options: ["DIIS", "Energy", "Density"] (default: "DIIS") |
| convergence_threshold | Double | SCF convergence threshold (default: 1E-6) |
| density_threshold | Double | Threshold at which integrals are screened (default: 1E-10) |
| density_basis_set_projection_fallback_enabled | Bool | Fall back to basis set projection if SCF is unconverged (default: True) |
| fock_build_type | String | Select type of Fock build algorithm. Options: ["HGP", "UM09", "RI"] |
| compress_ri_B | Bool | Compress the B matrix for RI-HF (default: False) |

Example SCF keywords

"scf": {
  "max_iters": 40,
  "max_diis_history_length": 12,
  "convergence_threshold": 1e-6,
  "density_threshold": 1e-10,
  "density_basis_set_projection_fallback_enabled": false,
  "fock_build_type": "RI",
  "compress_ri_b": false,
  "convergence_metric": "DIIS"
}

max_iters

This keyword controls the maximum number of SCF iterations that will be performed by the program. The default is 30 and can be adjusted depending on the convergence_threshold chosen.

max_diis_history_length

Use this keyword to control the size of the DIIS extrapolation space, i.e. how many past iteration matrices will be used to extrapolate the Fock matrix. The default is set to 8. A larger number will result in slightly higher memory use. This can become a problem when dealing with very large systems without fragmentation.
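To get a feel for how the DIIS window affects memory, a rough estimate can help. The sketch below is only a back-of-the-envelope calculation: it assumes that each stored iteration keeps two dense, double-precision matrices of size n_basis x n_basis (for example a Fock matrix and an error matrix). That storage layout and the function name `diis_history_bytes` are assumptions for illustration, not a description of EXESS internals.

```python
def diis_history_bytes(n_basis: int, history_length: int = 8,
                       matrices_per_entry: int = 2) -> int:
    """Rough DIIS storage estimate in bytes.

    Assumes dense n_basis x n_basis double-precision matrices and
    `matrices_per_entry` matrices kept per stored iteration.
    These are illustrative assumptions, not EXESS internals.
    """
    bytes_per_double = 8
    return history_length * matrices_per_entry * n_basis * n_basis * bytes_per_double


# Compare the default window of 8 with a window of 12 for 5000 basis functions.
for window in (8, 12):
    gib = diis_history_bytes(5000, window) / 2**30
    print(f"max_diis_history_length={window}: about {gib:.2f} GiB")
```

Under these assumptions the cost grows linearly with the window size and quadratically with the number of basis functions, which is why the window matters most for very large unfragmented systems.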

batch_size

Controls the number of shell pair batches stored in the shell-pair batch bin container. See 10.1021/acs.jctc.0c00768, 10.1021/acs.jctc.1c00720, and 10.1080/00268976.2022.2112987 for a more detailed explanation of the batch bin container. The default is 2560. It is best to increase or decrease this quantity in multiples of 128; for example, reduce it to 1280 or increase it to 5120. Do not go below 128, as it may interfere with some default values in the CUDA/HIP code (block sizes of 128, for example).
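If you generate input files from a script, a small helper can keep batch_size on a multiple of 128 and above the minimum. This is a minimal sketch; `nearest_batch_size` is a hypothetical helper and not part of EXESS.

```python
def nearest_batch_size(requested: int, multiple: int = 128) -> int:
    """Round `requested` to the nearest multiple of `multiple`, never below it."""
    if requested < 1:
        raise ValueError("batch size must be positive")
    rounded = multiple * round(requested / multiple)
    # Never return less than one full multiple (128 by default).
    return max(multiple, rounded)


for requested in (2560, 1000, 50):
    print(requested, "->", nearest_batch_size(requested))
```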

convergence_metric

Choose from the following convergence metrics:

  • Energy

  • Density

  • DIIS

Each metric reports convergence when the corresponding quantity (the change in energy, the change in density, or the DIIS error) is below the convergence threshold specified by the user. The default is DIIS. Using Energy as the convergence metric can lead to premature convergence, which can produce suboptimal orbitals for MP2 calculations.
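The difference between the metrics can be illustrated with a small check. The sketch below assumes each metric is a single non-negative number compared against the threshold; the function `scf_converged` and its argument names are hypothetical, and the numbers are made up to show how Energy can pass while DIIS has not.

```python
def scf_converged(metric: str, threshold: float, *,
                  delta_energy: float, delta_density: float,
                  diis_error: float) -> bool:
    """Return True if the chosen metric is below the threshold."""
    values = {
        "Energy": abs(delta_energy),
        "Density": abs(delta_density),
        "DIIS": abs(diis_error),
    }
    if metric not in values:
        raise ValueError(f"unknown convergence metric: {metric!r}")
    return values[metric] < threshold


# Illustrative iteration: the energy has settled, the other two have not.
state = dict(delta_energy=5e-7, delta_density=3e-5, diis_error=2e-5)
for metric in ("Energy", "Density", "DIIS"):
    print(metric, scf_converged(metric, 1e-6, **state))
```

In this made-up iteration only the Energy metric would stop the SCF, which is the early-convergence situation described above.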

convergence_threshold

Threshold below which the difference between SCF iterations is taken to be zero. The default value is 1E-6, which is a good threshold, especially if you are using Density or DIIS as the convergence metric. A “tight” convergence uses 1E-8 and a very tight one uses 1E-10. Anything beyond 1E-10 is highly dependent on the use case. The following general advice is given:

  • For non-fragmented RHF + RI-MP2 calculations 1E-6 is enough for accurate results if using Density or DIIS as the convergence metric

  • For non-fragmented RHF + RI-MP2 calculations 1E-8 is recommended if using Energy as the convergence metric

  • For dimer level RHF + RI-MP2 calculations 1E-6 is adequate to produce accurate results. The Energy convergence metric is not advised.

  • For trimer and tetramer level RHF + RI-MP2 calculations 1E-8 is recommended for accurate results. Use DIIS as the convergence metric.

  • For large scale tetramer level fragmented calculations 1E-10 is worth trying. Use DIIS as the convergence metric.
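The advice above can be condensed into a lookup. The sketch below simply encodes the bullets; the function name, the "Unfragmented" label, and the `large_scale` flag are hypothetical names chosen for illustration and are not EXESS keywords.

```python
def suggested_scf_settings(level: str, metric: str = "DIIS",
                           large_scale: bool = False) -> dict:
    """Map the guidance in this section to scf keyword values."""
    if metric not in ("DIIS", "Density", "Energy"):
        raise ValueError(f"unknown convergence metric: {metric!r}")
    if level == "Unfragmented":
        threshold = 1e-8 if metric == "Energy" else 1e-6
    elif level == "Dimer":
        if metric == "Energy":
            raise ValueError("Energy metric is not advised at the dimer level")
        threshold = 1e-6
    elif level in ("Trimer", "Tetramer"):
        if metric != "DIIS":
            raise ValueError("use DIIS at the trimer and tetramer level")
        threshold = 1e-10 if (large_scale and level == "Tetramer") else 1e-8
    else:
        raise ValueError(f"unknown level: {level!r}")
    return {"convergence_metric": metric, "convergence_threshold": threshold}


print(suggested_scf_settings("Unfragmented", "Energy"))
print(suggested_scf_settings("Tetramer", large_scale=True))
```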

density_threshold

Besides the Cauchy-Schwarz screening, the integrals are further screened against the density matrix inside each integral kernel. This threshold controls the value below which an integral is considered negligible. The default is 1E-10. Loosening this threshold (using a value larger than 1E-10) will lead to significantly faster SCF times at the possible cost of accuracy. Before settling on a looser value, PLEASE test for accuracy with respect to the 1E-10 value. We are not responsible if you get wrong answers because you didn’t check.

Tightening it to 1E-11 or 1E-12 will lead to longer SCF times because more integrals will be evaluated. However, for methods such as tetramer level MBE this can improve the accuracy of the program. This will also produce crisper orbitals for MP2 calculations.

density_basis_set_projection_fallback_enabled

Our longest keyword in the code. If a calculation does not converge under a certain combination of basis set, convergence metric, and convergence threshold, the program will fall back to attempting the calculation again using the STO-3G basis set and projecting the result up to the original basis set.

fock_build_type

Select the algorithm to be used for the Fock build portion of the calculation. The options are:

  • HGP (Head-Gordon-Pople): this algorithm is heavily optimized for dense chemical systems, i.e. where screening does not play a fundamental role. An example is RNA molecules; a counterexample (a screening-heavy system) is glycine chains.

  • UM09 (Ufimtsev-Martinez algorithm from 2008): this is a heavily GPU-oriented algorithm based on the McMurchie-Davidson integral evaluation method. It features a linear scaling K (exchange) and a heavily optimized J (Coulomb). This algorithm will shine in large systems where screening plays a key role, for example long glycine chains.

  • RI (resolution of the identity): replaces the four-center two-electron integrals with a set of three- and two-center ones, greatly reducing the computational cost but dramatically increasing the memory requirements. See below for RI details.

Before using the fock_build_type of RI, please read this description fully. Using the RI approximation for the HF portion of the program has the following effects:

  • Integrals will be computed once and stored either on the host or the device

  • For small systems, RI-HF will be significantly faster than HF.

  • The memory use of the code will skyrocket, since the integrals are stored

  • You will need to specify an auxiliary basis set

Please also see the section on RI-HF performance and the article in LINK_HERE for more details on where RI-HF will be advantageous.

compress_ri_B

Set this to true if you want the B matrix to be compressed. This option may misbehave; use it with caution.

frag

Fragmentation keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| level | Int | Fragmentation level. Options [1,2,3,4]. Default = 2 |
| reference_fragment | Int | (Optional) Indicates reference fragment for interaction energy calculations |
| included_fragments | Array[Int] | An array of fragment ids that are to be included in the calculation |
| enable_speed | Bool | Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations |
| cutoffs | {String,Double} | Subgroup to specify the distance cutoffs for each fragmentation level |
| cutoff_type | String | Distance cutoff metric (MinimalDistance or Centroid) |

Example frag keywords

"frag": {
  "cutoff_type": "Centroid",
  "level": "Tetramer",
  "enable_speed": false,
  "cutoffs": {
    "dimer": 1000,
    "trimer": 20,
    "tetramer": 15
  },
  "included_fragments": [0, 1, 2, 3, 4]
},

level

Truncation level of the MBE. Options are:

  • Dimer n(n-1)/2 total dimers

  • Trimer n(n-1)(n-2)/6 total trimers

  • Tetramer n(n-1)(n-2)(n-3)/24 total tetramers

If no cutoffs are specified, ALL fragments will be calculated. Be cautious with how many fragments you’re calculating so you don’t unintentionally burn unnecessary compute time. The higher the truncation level, the more accurate the energies you will get with fragmentation. See the cutoffs section for more details.
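Before launching a run without cutoffs, it is worth counting how many polymers the formulas above imply. The following sketch evaluates them with binomial coefficients; `nmer_counts` is a hypothetical helper, and the monomers themselves are not included in the count.

```python
from math import comb


def nmer_counts(n_fragments: int, level: str) -> dict:
    """Number of dimers/trimers/tetramers for a given truncation level,
    assuming no distance cutoffs are applied."""
    orders = {"Dimer": 2, "Trimer": 3, "Tetramer": 4}
    names = {2: "dimers", 3: "trimers", 4: "tetramers"}
    if level not in orders:
        raise ValueError(f"unknown level: {level!r}")
    # comb(n, k) equals n(n-1)...(n-k+1)/k!, matching the formulas above.
    return {names[k]: comb(n_fragments, k) for k in range(2, orders[level] + 1)}


print(nmer_counts(20, "Tetramer"))
```

The tetramer count grows with the fourth power of the number of fragments, so even modest systems benefit from cutoffs.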

reference_fragment

To perform a lattice energy calculation or an interaction energy calculation you can specify a reference fragment. This will take that fragment and calculate all the polymers that contain it. The final energy will be the sum of the n-mer corrections of that reference fragment with respect to the entire system.

Example calculations for this can be used to determine the stability of a molecular complex, or the overall interaction of a molecule with the entire system. The general convention that a negative energy means binding and a positive one means repulsion holds here.
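To see which polymers a reference fragment selects, the enumeration can be sketched as below. This only illustrates the combinatorics, assuming no cutoffs; `polymers_containing` is a hypothetical helper and the fragment ids are made up.

```python
from itertools import combinations


def polymers_containing(reference: int, fragment_ids: list, order: int) -> list:
    """All polymers of size `order` that include the reference fragment."""
    others = [f for f in fragment_ids if f != reference]
    # Pick the remaining (order - 1) members from the other fragments.
    return [tuple(sorted((reference, *rest)))
            for rest in combinations(others, order - 1)]


fragments = [0, 1, 2, 3, 4]
print(polymers_containing(2, fragments, 2))        # dimers containing fragment 2
print(len(polymers_containing(2, fragments, 3)))   # number of trimers containing it
```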

included_fragments

For non-covalently bonded fragmented calculations, this keyword can be used to limit the number of fragments that the calculation will evaluate. By default, all fragments in the topology will be evaluated. To specify a set of included fragments, use an array of integers corresponding to the fragment ids in the topology. For example:

"included_fragments": [0,1,2,3,4,5]

This will evaluate the first 6 fragments only and ignore the rest.

enable_speed

Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations.

cutoffs

The cutoffs set the distance beyond which a polymer won’t be calculated. They are specified by:

"level": "Tetramer",
"cutoffs": {
  "dimer": 40,
  "trimer": 30,
  "tetramer": 15
}

All distances are in ANGSTROMS. The dimer cutoff should always be the largest. As a rule: dimer > trimer > tetramer.
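A quick sanity check of the ordering rule can be scripted before submitting a job. The sketch below validates a cutoffs dictionary shaped like the JSON subgroup above; `check_cutoffs` is a hypothetical helper, not an EXESS function.

```python
def check_cutoffs(cutoffs: dict) -> None:
    """Raise ValueError unless dimer > trimer > tetramer for the keys present."""
    order = ["dimer", "trimer", "tetramer"]
    present = [(name, cutoffs[name]) for name in order if name in cutoffs]
    # Compare each cutoff with the next higher-order one.
    for (big_name, big), (small_name, small) in zip(present, present[1:]):
        if not big > small:
            raise ValueError(
                f"{big_name} cutoff ({big}) must exceed {small_name} cutoff ({small})"
            )


check_cutoffs({"dimer": 40, "trimer": 30, "tetramer": 15})  # passes silently
```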

cutoff_type

There are two options for measuring the distance between fragments in EXESS. The keywords to select them are:

  • Centroid

  • MinimalDistance

"frag": {
  "cutoff_type": "Centroid"
}

Centroid computes the centroid of each fragment and compares the distance between the centroids against the established cutoff.

The second choice is MinimalDistance which calculates the minimal distance between each atom in the fragment with respect to the other fragments in the polymer. This is more accurate and is preferred.
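The two metrics can differ noticeably for extended fragments. The sketch below compares them for a pair of fragments; it assumes an unweighted (geometric) centroid and Cartesian coordinates in angstroms, and the function names and coordinates are purely illustrative. How EXESS combines pair distances within larger polymers is not covered here.

```python
from itertools import product
from math import dist

Atom = tuple[float, float, float]


def centroid(fragment: list[Atom]) -> Atom:
    """Unweighted mean position of the atoms (an assumption of this sketch)."""
    n = len(fragment)
    return tuple(sum(atom[i] for atom in fragment) / n for i in range(3))


def centroid_distance(a: list[Atom], b: list[Atom]) -> float:
    return dist(centroid(a), centroid(b))


def minimal_distance(a: list[Atom], b: list[Atom]) -> float:
    """Shortest distance between any atom of `a` and any atom of `b`."""
    return min(dist(p, q) for p, q in product(a, b))


frag_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frag_b = [(5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(centroid_distance(frag_a, frag_b), minimal_distance(frag_a, frag_b))
```

With a cutoff between the two values, the pair would be kept under MinimalDistance but dropped under Centroid, which illustrates why the choice of metric matters for elongated fragments.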

guess

Guess keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| external_initial_density_path | String | Path to external initial density. |
| bsp | bool | Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false. |
| bsp_basis | string | The desired lower resolution basis set for bsp. |
| bsp_scf_keywords | SCFKeywords | The base SCF keywords (as specified in keywords/scf) used for the bootstrapping calculation (works with bsp). |
| hcore | bool | Use hcore initial guess. |
| smd | bool | Superposition of monomer density optimization where nmer density guesses will be formed from the converged densities of their constituent monomers. Default should be true. |
| ssfd | bool | Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false. |
| ssfd_target_size | int | Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30. |
| ssfd_only_converge_in_bsp_basis | bool | When SSFD and basis set projection are applied on top of each other, whether the subfragment densities should be left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true. |
| ssfd_scf_keywords | GuessSCFKeywords | The base SCF keywords (as specified in keywords/scf) used for each subfragment calculation (works with ssfd). |

external_initial_density_path

Path to external initial density.

bsp

Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false.

bsp_basis

The desired lower resolution basis set for bsp.

bsp_scf_keywords

The base SCF keywords (as specified in keywords/scf) used for the bootstrapping calculation (works with bsp).

hcore

Use hcore initial guess.

smd

Superposition of monomer density optimization where nmer density guesses will be formed from the converged densities of their constituent monomers. Default should be true.

ssfd

Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false.

ssfd_target_size

Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30.

ssfd_only_converge_in_bsp_basis

When SSFD and basis set projection are applied on top of each other, this controls whether the subfragment densities should be left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true.

ssfd_scf_keywords

The base SCF keywords (as specified in keywords/scf) used for each subfragment calculation (works with ssfd).

optimization

Optimization Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| max_iters | size_t | Max number of iterations. |
| convergence_criteria | OptimizationConvergenceCriteria | Optimizer convergence criteria: metric, gradient threshold, delta energy threshold, step component threshold. |
| optimizer_reset_interval | std::optional<size_t> | Number of iterations before the optimizer is reset. |
| coordinate_system | CoordinateSystem | Cartesian, NaturalInternal and DelocalisedInternal (delocalised internal is the default and is strongly recommended) |
| hessian_guess | HessianGuessType | Type of hessian guess. |
| algorithm | OptimizationAlgorithmType | Type of optimization algorithm. |
| trust_region_keywords | std::optional<TrustRegionKeywords> | Trust region keywords. |

"optimization": {
  "max_iters": 200,
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}

max_iters

Max number of iterations.

convergence_criteria

metric

“GradientOnly” means only the gradient term is considered. “Baker” means the gradient component must be within its threshold and either the step component or the delta energy must be within its own threshold (the subfields are self-explanatory).

gradient_threshold

Numerical value at which the gradient is considered converged.

delta_energy_threshold

Numerical value at which the energy is considered converged.

step_component_threshold

Numerical value at which the step change between geometry optimization steps is considered converged.
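The logic of the two metrics can be written out explicitly. The sketch below assumes each quantity is reduced to a single number (for instance the largest gradient component and the largest step component); that reduction, the function name, and the use of the example values above as defaults are assumptions of this sketch rather than documented program behaviour.

```python
def optimization_converged(metric: str, gradient: float, delta_energy: float,
                           step: float, *,
                           gradient_threshold: float = 5.66918e-4,
                           delta_energy_threshold: float = 1e-6,
                           step_component_threshold: float = 1.2e-3) -> bool:
    """Evaluate the GradientOnly or Baker convergence test."""
    gradient_ok = abs(gradient) < gradient_threshold
    if metric == "GradientOnly":
        return gradient_ok
    if metric == "Baker":
        energy_ok = abs(delta_energy) < delta_energy_threshold
        step_ok = abs(step) < step_component_threshold
        # Gradient must pass, plus at least one of energy change or step size.
        return gradient_ok and (energy_ok or step_ok)
    raise ValueError(f"unknown metric: {metric!r}")


# Small gradient and small energy change, but a still-large step: Baker passes.
print(optimization_converged("Baker", 1e-4, 5e-7, 5e-3))
# Small step and energy change, but a large gradient: Baker fails.
print(optimization_converged("Baker", 1e-3, 5e-7, 1e-4))
```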

optimizer_reset_interval

(Expert feature) The coordinate system will be regenerated and the Hessian reset every n iterations. If left blank, it will never reset.

coordinate_system

The coordinate system that the geometry optimization algorithm uses to perform the calculations. The options are:

  • Cartesian

  • NaturalInternal

  • DelocalisedInternal

DelocalisedInternal is the default and is strongly recommended.

hessian_guess

The options are the identity guess (not recommended), the scaled identity guess (default and recommended), and the Schlegel and Lindh model Hessians (also not recommended).

algorithm

Trust region augmented Hessian (not recommended) or eigenvector following (recommended and default):

"optimization": {
  "max_iters": 200,
  "algorithm": "TrustRegionAugmentedHessian",
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}

trust_region_keywords

Only applicable when the TRAH optimization algorithm is set. It is not recommended that users change these values, as the defaults have been optimized.

initial_radius

Default value: 0.4

max_radius

Default value: 1e5

min_radius

Default value: 1e-5

increase_factor

Default value: 1.2

decrease_factor

Default value: 0.7

constrict_factor

Default value: 0.1

increase_threshold

Default value: 0.75

decrease_threshold

Default value: 0.25

rejection_threshold

Default value: 0.0
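The source does not describe how these parameters interact, so the following is only a generic textbook trust-region update written with the default values listed above. It assumes `rho` is the ratio of actual to predicted energy change for a step; the update rules and the function name are assumptions, and constrict_factor is left out because its role is not documented here.

```python
def update_trust_radius(radius: float, rho: float, *,
                        max_radius: float = 1e5, min_radius: float = 1e-5,
                        increase_factor: float = 1.2, decrease_factor: float = 0.7,
                        increase_threshold: float = 0.75,
                        decrease_threshold: float = 0.25,
                        rejection_threshold: float = 0.0) -> tuple:
    """Generic trust-region update: returns (new_radius, step_accepted)."""
    # A step is rejected when the model predicted the energy change badly.
    accepted = rho > rejection_threshold
    if rho < decrease_threshold:
        radius *= decrease_factor      # poor agreement: shrink the region
    elif rho > increase_threshold:
        radius *= increase_factor      # good agreement: grow the region
    # Keep the radius inside the allowed bounds.
    radius = min(max_radius, max(min_radius, radius))
    return radius, accepted


for rho in (0.9, 0.5, 0.1, -0.5):
    print(rho, update_trust_radius(0.4, rho))
```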

dynamics

Dynamics Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| n_timesteps | int | Number of timesteps to perform. |
| dt | double | Timestep size, default 1 fs (0.001 ps). |
| use_async_timesteps | bool | Perform asynchronous timesteps. This is an expert keyword; use with care. |

"dynamics": {
  "n_timesteps": 10,
  "use_async_timesteps": false,
  "dt": 0.002
}

n_timesteps

Number of timesteps to perform.

dt

Timestep size, default 1 fs (0.001 ps).

use_async_timesteps

Perform asynchronous timesteps. This is an expert keyword; use with care.
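The total simulated time follows directly from n_timesteps and dt. The sketch below assumes dt is given in picoseconds, as the default of 0.001 ps suggests; the function name is hypothetical.

```python
def simulated_time_ps(n_timesteps: int, dt_ps: float = 0.001) -> float:
    """Total simulated time in picoseconds (dt assumed to be in ps)."""
    return n_timesteps * dt_ps


# The example above: 10 steps of 0.002 ps each.
total_ps = simulated_time_ps(10, 0.002)
print(f"{total_ps:.3f} ps = {total_ps * 1000:.1f} fs")
```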

boundary

Boundary Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| x | std::optional<BoundaryCondition> | Boundary condition for the x dimension. |
| y | std::optional<BoundaryCondition> | Boundary condition for the y dimension. |
| z | std::optional<BoundaryCondition> | Boundary condition for the z dimension. |

"boundary": {
  "x": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "y": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "z": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  }
}

x

Boundary condition for the x dimension.

y

Boundary condition for the y dimension.

z

Boundary condition for the z dimension.
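As an illustration of what a periodic range implies, the sketch below wraps a coordinate back into the interval [lower, upper). It assumes that "lower" and "upper" are the bounds of the periodic cell along one axis, in the same units as the coordinates; that interpretation and the function name are assumptions, not a description of how EXESS applies the boundary.

```python
def wrap_periodic(x: float, lower: float, upper: float) -> float:
    """Map x into [lower, upper) under periodic boundary conditions."""
    length = upper - lower
    if length <= 0:
        raise ValueError("upper must be greater than lower")
    # Python's % returns a non-negative result for a positive divisor.
    return lower + (x - lower) % length


# Using the range from the example above (lower = -2, upper = 3).
for x in (0.0, 3.5, -2.5):
    print(x, "->", wrap_periodic(x, -2, 3))
```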

ff

Force Field Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| ff_filename | std::string | Forcefield filename path. |

ff_filename

Forcefield filename path.

log

Log Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| console | LogSettings | Log settings for console output. |
| logfiles | std::vector<FileLogSettings> | Log settings for file output. |

Example log keywords

"log": {
  "console": {
    "level": "Verbose"
  },
  "logfiles": [
    { "level": "Verbose",
      "prefix_fmt":   "[%Y-%m-%d %H:%M:%S.{us} r{rank} {level}] ",
      "directory": "/tmp/exess"
    }
  ]
}

console

Log settings for console output. The options for level of printing are (in descending order of verbosity):

  • Debug

  • Verbose

  • LargeInfo

  • Info

  • Performance

  • Warning

logfiles

Log settings for file output.

rtat

RTAT is an open-source, free-to-use utility for performance autotuning of matrix algebra routines, available at https://github.com/csnowdon2/rtatblas. RTAT is interfaced with EXESS and, if enabled, will find the optimal configuration to run BLAS operations on the GPU.

RTAT Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| enabled | bool | Whether runtime autotuning of matrix multiplication is enabled. |
| synchronous | bool | Use synchronous operations. |
| json_file_dump_prefix | std::optional<std::string> | Prefix of file to dump RTAT JSON to. |

"rtat": {
  "enabled": true,
  "synchronous": true,
  "json_file_dump_prefix": "prefix"
}

enabled

Whether runtime autotuning of matrix multiplication is enabled.

synchronous

Use synchronous operations.

json_file_dump_prefix

Prefix of file to dump RTAT JSON to.

debug

"debug": {
  "dry_run": false,
  "print_subfragment_xyz": false,
  "max_fragments": 10000,
  "skip_calcs": false
}

dry_run

Enables check runs: no computation is done, but the program goes through all the fragments and returns how many will be evaluated. Can be used to spot errors in the input file preparation. Default should be false.

max_fragments

Number of fragments to compute. This can be used for systems without covalent bond breaking to calculate a subset of fragments. Default is the total number of fragments.

ignore_fragments

Ignore fragmentation. This is mostly used by developers to validate that fragmented calculations match full system calculations.

skip_calcs

For debugging purposes, all computations will be skipped in a fragmentation routine. Useful for checking that the fragment queue does not take ages to run.

export

Export Keywords

| Keyword | Type | Brief |
| --- | --- | --- |
| export_density | bool | Export density. |
| export_relaxed_mp2_density_correction | bool | Export relaxed MP2 density correction. |
| export_fock | bool | Export Fock matrix. |
| export_overlap | bool | Export overlap matrix. |
| export_h_core | bool | Export H core matrix. |
| export_expanded_density | bool | Export expanded density. |
| export_expanded_gradient | bool | Export expanded gradient. |
| export_molecular_orbital_coeffs | bool | Export molecular orbital coefficients. |
| export_gradient | bool | Export gradient. |
| export_mulliken_charges | bool | Export Mulliken charges. |
| export_bond_orders | bool | Export bond orders. |
| export_h_caps | bool | Export H caps. |
| export_density_descriptors | bool | Export density descriptors. |
| export_esp_descriptors | bool | Export ESP descriptors. |
| export_basis_labels | bool | Export basis labels. |
| flatten_symmetric | bool | When exporting square symmetric matrices save memory by exporting the flattened lower triangle of the matrix. Default should be true. |
| concatenate_hdf5_files | bool | Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive. |
| descriptor_grid | Array[Doubles] | Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles. |

export_density

Export density.

export_relaxed_mp2_density_correction

Export relaxed MP2 density correction.

export_fock

Export Fock matrix.

export_overlap

Export overlap matrix.

export_h_core

Export H core matrix.

export_expanded_density

Export expanded density.

export_expanded_gradient

Export expanded gradient.

export_molecular_orbital_coeffs

Export molecular orbital coefficients.

export_gradient

Export gradient.

export_mulliken_charges

Export Mulliken charges.

export_bond_orders

Export bond orders.

export_h_caps

Export H caps.

export_density_descriptors

Export density descriptors.

export_esp_descriptors

Export ESP descriptors.

export_basis_labels

Export basis labels.

flatten_symmetric

When exporting square symmetric matrices save memory by exporting the flattened lower triangle of the matrix. Default should be true.
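When reading a flattened matrix back, it has to be expanded to its square form. The sketch below shows one way to do this, assuming the lower triangle (including the diagonal) is stored row by row; the element ordering used in the actual HDF5 export is an assumption here and should be checked against a small known matrix. The function names are hypothetical.

```python
def flatten_lower_triangle(matrix: list) -> list:
    """Row-by-row lower triangle (with diagonal) of a square symmetric matrix."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


def unflatten_lower_triangle(flat: list, n: int) -> list:
    """Rebuild the full symmetric n x n matrix from its flattened lower triangle."""
    if len(flat) != n * (n + 1) // 2:
        raise ValueError("flat length does not match n(n+1)/2")
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = flat[k]  # mirror across the diagonal
            k += 1
    return matrix


overlap = [[1.0, 0.2, 0.1],
           [0.2, 1.0, 0.3],
           [0.1, 0.3, 1.0]]
flat = flatten_lower_triangle(overlap)
print(flat)
print(unflatten_lower_triangle(flat, 3) == overlap)
```

For an n x n matrix this stores n(n+1)/2 values instead of n squared, which is close to half for large matrices.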

concatenate_hdf5_files

Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive.

descriptor_grid

Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles.