keywords

The keywords group contains the most important controls for the overall calculation. The keywords contained within this group are:

Keyword Group Descriptions
Keyword	Brief
scf	Controls SCF behaviour. See scf
frag	Controls fragmentation behaviour. See frag
guess	Controls guess behaviour. See guess
optimization	Controls optimization behaviour. See optimization
dynamics	Controls AIMD behaviour. See dynamics
boundary	Controls periodic boundary condition behaviour. See boundary
ff	Controls classical force fields. See ff
log	Controls the logging of the program. See log
export	Controls what quantities are exported to HDF5 files. See export
rtat	Controls the runtime autotune of matrices. See rtat
debug	Controls debug settings. See debug

scf

SCF Keyword Details
Keyword	Type	Brief
max_iters	Int	Max number of SCF iterations (default: 30)
max_diis_history_length	Int	Max size of DIIS window (default: 8)
batch_size	Int	Number of integral shell pairs per batch (default: 2560)
convergence_metric	String	Metric to determine convergence. Options: [“DIIS”, “Energy”, “Density”]
convergence_threshold	Double	SCF convergence threshold (default: 1E-6)
density_threshold	Double	Threshold at which integrals are screened (default: 1E-10)
density_basis_set_projection_fallback_enabled	Bool	Fall back to basis set projection if SCF is unconverged (default: True)
fock_build_type	String	Select the Fock build algorithm. Options: [“HGP”, “UM09”, “RI”]
compress_ri_B	Bool	Compress the B matrix for RI-HF (default: False)

Example SCF keywords

"scf": {
  "max_iters": 40,
  "max_diis_history_length": 12,
  "convergence_threshold": 1e-6,
  "density_threshold": 1e-10,
  "density_basis_set_projection_fallback_enabled": false,
  "fock_build_type": "RI",
  "compress_ri_b": false,
  "convergence_metric": "DIIS"
}

max_iters

This keyword controls the maximum number of SCF iterations that the program will perform. The default is 30; adjust it to suit the chosen convergence_threshold, since tighter thresholds generally need more iterations.

max_diis_history_length

Use this keyword to control the size of the DIIS extrapolation space, i.e. how many past iteration matrices will be used to extrapolate the Fock matrix. The default is set to 8. A larger number will result in slightly higher memory use. This can become a problem when dealing with very large systems without fragmentation.

batch_size

Controls the number of shell pair batches stored in the shell-pair batch bin container. See 10.1021/acs.jctc.0c00768, 10.1021/acs.jctc.1c00720, and 10.1080/00268976.2022.2112987 for a more detailed explanation of the batch bin container. The default is 2560. It is best to increase or decrease this quantity in multiples of 128: for example, reduce it to 1280 or increase it to 5120. Don’t go below 128, as it may interfere with some default values in the CUDA/HIP code (block sizes of 128, for example).
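The sizing advice above can be checked before a run is launched. The following is a minimal sketch in Python; `check_batch_size` is a hypothetical helper written for this example and is not part of EXESS.

```python
BLOCK = 128  # granularity recommended above, also used as the suggested minimum


def check_batch_size(batch_size: int) -> bool:
    """Return True if batch_size is at least 128 and a multiple of 128."""
    return batch_size >= BLOCK and batch_size % BLOCK == 0


# The default and the two suggested alternatives pass; the last two do not.
for candidate in (2560, 1280, 5120, 100, 2000):
    print(candidate, check_batch_size(candidate))
```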

convergence_metric

Choose from the following convergence metrics:

Energy
Density
DIIS

The SCF is considered converged when the selected quantity (the change in energy or density between successive iterations, or the DIIS error) falls below the convergence threshold specified by the user. The default is DIIS. Using Energy as the convergence metric can lead to early convergence, which can produce suboptimal orbitals for MP2 calculations.

convergence_threshold

Threshold at which the difference between SCF iterations is taken to be zero. The default value is 1E-6, which is a good threshold, especially if you are using Density or DIIS as the convergence metric. A “tight” convergence uses 1E-8 and a very tight one uses 1E-10. Anything beyond 1E-10 is highly dependent on the use case. The following general advice applies:

For non-fragmented RHF + RI-MP2 calculations, 1E-6 is enough for accurate results if using Density or DIIS as the convergence metric.
For non-fragmented RHF + RI-MP2 calculations, 1E-8 is recommended if using Energy as the convergence metric.
For dimer level RHF + RI-MP2 calculations, 1E-6 is adequate to produce accurate results. The Energy convergence metric is not advised.
For trimer and tetramer level RHF + RI-MP2 calculations, 1E-8 is recommended for accurate results. Use DIIS as the convergence metric.
For large scale tetramer level fragmented calculations, 1E-10 is worth trying. Use DIIS as the convergence metric.
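The advice above can be kept at hand as a small lookup table. This is only a sketch: the level labels ("none", "dimer", "trimer", "tetramer", "tetramer-large") and the function name are hypothetical, chosen for this example, and combinations the advice does not cover return None rather than a guess. Accepting Density at dimer level is an interpretation of the advice, which only rules out Energy there.

```python
# (fragmentation level, convergence_metric) -> suggested convergence_threshold
ADVICE = {
    ("none", "DIIS"): 1e-6,
    ("none", "Density"): 1e-6,
    ("none", "Energy"): 1e-8,
    ("dimer", "DIIS"): 1e-6,
    ("dimer", "Density"): 1e-6,  # interpretation: only Energy is ruled out
    ("trimer", "DIIS"): 1e-8,
    ("tetramer", "DIIS"): 1e-8,
    ("tetramer-large", "DIIS"): 1e-10,
}


def suggest_threshold(level: str, metric: str):
    """Return the suggested threshold, or None if the advice does not cover it."""
    return ADVICE.get((level, metric))


print(suggest_threshold("none", "Energy"))
print(suggest_threshold("dimer", "Energy"))  # not advised, so no suggestion
```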

density_threshold

Besides the Cauchy-Schwarz screening, the integrals are further screened against the density matrix inside each integral kernel. This threshold controls the value below which an integral is considered negligible. The default is 1E-10. Loosening this threshold (using a value larger than 1E-10) will lead to significantly faster SCF times at the possible cost of accuracy. Before settling on a looser value, PLEASE test for accuracy against the 1E-10 default. We are not responsible if you get wrong answers because you didn’t check.

Tightening it to 1E-11 or 1E-12 will lead to longer SCF times because more integrals will be evaluated. However, for methods such as tetramer level MBE this can improve the accuracy of the program. It will also produce crisper orbitals for MP2 calculations.

density_basis_set_projection_fallback_enabled

Our longest keyword in the code. If a calculation does not converge under a certain combination of basis set, convergence metric, and convergence threshold, the program will fall back to attempting the calculation again in the STO-3G basis set and projecting the result up to the original basis set.

fock_build_type

Select the algorithm to be used for the Fock build portion of the calculation. The options are:

HGP (Head-Gordon-Pople): this algorithm is heavily optimized for dense chemical systems, i.e. systems where screening does not play a fundamental role, such as RNA molecules. A counterexample (a screening-heavy system) is glycine chains.
UM09 (Ufimtsev-Martinez algorithm from 2008): this is a heavily GPU-oriented algorithm based on the McMurchie-Davidson integral evaluation method. It features a linear scaling K (exchange) and a heavily optimized J (Coulomb). This algorithm will shine in large systems where screening plays a key role, such as long glycine chains.
RI (resolution of the identity): replaces the four-center two-electron integrals with a set of three- and two-center ones, greatly reducing the computational cost but dramatically increasing the memory requirements. See below for RI details.

Before using the fock_build_type of RI, please read this description fully. Using the RI approximation for the HF portion of the program has the following effects:

Integrals will be computed once and stored either on the host or the device
For small systems, RI-HF will be significantly faster than HF.
The memory use of the code will skyrocket, since the integrals are stored
You will need to specify an auxiliary basis set

Please also see the section on RI-HF performance and the article in LINK_HERE for more details on where RI-HF will be advantageous.

compress_ri_B

Set this to true if you want the B matrix to be compressed. This might misbehave; use with caution.

frag

Fragmentation keywords
Keyword	Type	Brief
level	Int	Fragmentation level. Options [1,2,3,4]. Default = 2
reference_fragment	Int	(Optional) Indicates reference fragment for interaction energy calculations
included_fragments	Array[Int]	An array of fragment ids that are to be included in the calculation
enable_speed	Bool	Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations
cutoffs	{String,Double}	Subgroup to specify the distance cutoffs for each fragmentation level
cutoff_type	{String}	Distance cutoff metric (MinimalDistance or Centroid)

Example frag keywords

"frag": {
  "cutoff_type": "Centroid",
  "level": "Tetramer",
  "enable_speed": false,
  "cutoffs": {
    "dimer": 1000,
    "trimer": 20,
    "tetramer": 15
  },
  "included_fragments": [0, 1, 2, 3, 4]
},

level

Truncation level of the MBE. Options are:

Dimer n(n-1)/2 total dimers
Trimer n(n-1)(n-2)/6 total trimers
Tetramer n(n-1)(n-2)(n-3)/24 total tetramers

If no cutoffs are specified, ALL fragments will be calculated. Be cautious with how many fragments you’re calculating so you don’t unintentionally burn unnecessary compute time. The higher the truncation level, the more accurate the energies you will get with fragmentation. See the cutoffs section for more details.
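The counts listed above are binomial coefficients, so the size of an uncut calculation can be estimated before submitting it. A minimal sketch follows; the function name is hypothetical and the level is given as an integer from 1 to 4.

```python
from math import comb


def polymer_counts(n_fragments: int, level: int) -> dict:
    """Number of dimers, trimers and tetramers for an MBE truncated at `level`,
    with no distance cutoffs applied."""
    names = {2: "dimers", 3: "trimers", 4: "tetramers"}
    return {names[k]: comb(n_fragments, k) for k in range(2, level + 1)}


# 100 fragments at tetramer level already give almost four million tetramers.
print(polymer_counts(100, 4))
```

The steep growth with level is the reason distance cutoffs matter at the trimer and tetramer levels.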

reference_fragment

To perform a lattice energy calculation or an interaction energy calculation you can specify a reference fragment. This will take that fragment and calculate all the polymers that contain it. The final energy will be the sum of the n-mer corrections of that reference fragment with respect to the entire system.

Example calculations for this can be used to determine the stability of a molecular complex, or the overall interaction of a molecule with the entire system. The general convention that a negative energy means binding and a positive one means repulsion holds here.

included_fragments

For non-covalently bonded fragmented calculations, this keyword can be used to limit the number of fragments that the calculation will evaluate. By default, all fragments in the topology will be evaluated. To specify a set of included fragments, use an array of integers corresponding to the fragment ids in the topology. For example:

"included_fragments": [0,1,2,3,4,5]

This will evaluate the first 6 fragments only and ignore the rest.

enable_speed

Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations.

cutoffs

The cutoffs control at what distance a polymer won’t be calculated. They are specified by:

"level": "Tetramer",
    "cutoffs": {
        "dimer": 40,
        "trimer": 30,
        "tetramer": 15
    }

All distances are in ANGSTROMS. The dimer cutoff should always be the largest. As a rule: dimer > trimer > tetramer.
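The ordering rule can be checked on a cutoffs dictionary before it is written to the input file. This is a small sketch with a hypothetical helper name; it only checks the levels that are actually present.

```python
def cutoffs_are_ordered(cutoffs: dict) -> bool:
    """Return True if dimer > trimer > tetramer for the cutoffs provided."""
    order = ("dimer", "trimer", "tetramer")
    values = [cutoffs[name] for name in order if name in cutoffs]
    # Every cutoff must be strictly larger than the one for the next level.
    return all(a > b for a, b in zip(values, values[1:]))


print(cutoffs_are_ordered({"dimer": 40, "trimer": 30, "tetramer": 15}))
print(cutoffs_are_ordered({"dimer": 20, "trimer": 30}))
```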

cutoff_type

EXESS offers two options for measuring the distance between fragments. The corresponding keyword values are:

Centroid

MinimalDistance

"frag": {
  "cutoff_type": "Centroid"
}

Centroid computes the centroid of each fragment and compares the distance between the centroids against the established cutoff.

The second choice is MinimalDistance, which calculates the minimal distance between the atoms of one fragment and the atoms of the other fragments in the polymer. This is more accurate and is preferred.
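The difference between the two metrics can be seen on a pair of fragments. The sketch below is illustrative only and is not the EXESS implementation: it assumes an unweighted geometric centroid, treats a single pair of fragments, and represents each fragment as a list of (x, y, z) tuples in ångströms.

```python
from itertools import product
from math import dist


def centroid(fragment):
    """Unweighted mean position of the atoms in a fragment."""
    n = len(fragment)
    return tuple(sum(atom[i] for atom in fragment) / n for i in range(3))


def centroid_distance(frag_a, frag_b):
    """Distance between the centroids of two fragments."""
    return dist(centroid(frag_a), centroid(frag_b))


def minimal_distance(frag_a, frag_b):
    """Smallest atom-atom distance between two fragments."""
    return min(dist(a, b) for a, b in product(frag_a, frag_b))


frag_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frag_b = [(3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
print(centroid_distance(frag_a, frag_b))
print(minimal_distance(frag_a, frag_b))
```

For extended fragments the centroid distance can be much larger than the closest atom-atom contact, so the same cutoff value excludes different sets of polymers under the two metrics.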

guess

Guess keywords
Keyword	Type	Brief
external_initial_density_path	String	Path to external initial density.
bsp	bool	Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false.
bsp_basis	string	The desired lower resolution basis set for bsp.
bsp_scf_keywords	SCFKeywords	The base SCF keywords, as specified in keywords/scf, used for the bootstrapping calculation (works with bsp).
hcore	bool	Use hcore initial guess.
smd	bool	Superposition of monomer density optimization where nmer density guesses will be formed from the converged densities of their constituent monomers. Default should be true.
ssfd	bool	Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false.
ssfd_target_size	int	Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30.
ssfd_only_converge_in_bsp_basis	bool	When SSFD and basis set projection are both applied, controls whether the subfragment densities are left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true.
ssfd_scf_keywords	GuessSCFKeywords	The base SCF keywords, as specified in keywords/scf, used for each subfragment calculation (works with ssfd).

external_initial_density_path

Path to external initial density.

bsp

Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false.

bsp_basis

The desired lower resolution basis set for bsp.

bsp_scf_keywords

The base SCF keywords, as specified in keywords/scf, used for the bootstrapping calculation (works with bsp).

hcore

Use hcore initial guess.

smd

Superposition of monomer density optimization where nmer density guesses will be formed from the converged densities of their constituent monomers. Default should be true.

ssfd

Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false.

ssfd_target_size

Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30.

ssfd_only_converge_in_bsp_basis

When SSFD and basis set projection are both applied, controls whether the subfragment densities are left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true.

ssfd_scf_keywords

The base SCF keywords, as specified in keywords/scf, used for each subfragment calculation (works with ssfd).
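As an illustration of how the guess keywords fit together, the sketch below builds a guess group in Python and prints it as JSON. The values are assumptions chosen for the example, in particular the "STO-3G" spelling of the bootstrap basis; only the keyword names come from the table above.

```python
import json

# Enable basis set projection and name the lower resolution basis to start from.
guess = {
    "bsp": True,
    "bsp_basis": "STO-3G",  # assumed spelling of the basis set name
}

print(json.dumps({"guess": guess}, indent=2))
```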

optimization

Optimization Keywords
Keyword	Type	Brief
max_iters	size_t	Max number of iterations.
convergence_criteria	OptimizationConvergenceCriteria	Optimizer convergence criteria: metric, gradient threshold, delta energy threshold, step component threshold.
optimizer_reset_interval	std::optional<size_t>	Number of iterations before the optimizer is reset.
coordinate_system	CoordinateSystem	Cartesian, NaturalInternal and DelocalisedInternal (delocalised internal is the default and is strongly recommended)
hessian_guess	HessianGuessType	Type of hessian guess.
algorithm	OptimizationAlgorithmType	Type of optimization algorithm.
trust_region_keywords	std::optional<TrustRegionKeywords>	Trust region keywords.

"optimization": {
  "max_iters": 200,
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}

max_iters

Max number of iterations.

convergence_criteria

metric

“GradientOnly” means only the gradient term is considered. “Baker” means the gradient component must be within its threshold and either the step component or the delta energy must be within its own threshold. The thresholds are set by the sub-fields below.

gradient_threshold

Numerical value at which the gradient is considered converged.

delta_energy_threshold

Numerical value at which the energy is considered converged.

step_component_threshold

Numerical value at which the step change between geometry optimization steps is considered converged.
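The two metrics can be summarised as a short predicate. This is a sketch of the logic as described above, not the EXESS source: it assumes each quantity is reduced to a single magnitude (for example the largest component) and that "within the threshold" means less than or equal to it. The default arguments are the values from the example input, which are not necessarily the program defaults.

```python
def is_converged(metric, gradient, delta_energy, step,
                 gradient_threshold=5.66918e-4,
                 delta_energy_threshold=1e-6,
                 step_component_threshold=1.2e-3):
    """Apply the GradientOnly or Baker convergence test to one optimizer step."""
    gradient_ok = abs(gradient) <= gradient_threshold
    if metric == "GradientOnly":
        return gradient_ok
    if metric == "Baker":
        energy_ok = abs(delta_energy) <= delta_energy_threshold
        step_ok = abs(step) <= step_component_threshold
        # The gradient must pass, plus at least one of the other two tests.
        return gradient_ok and (energy_ok or step_ok)
    raise ValueError(f"unknown metric: {metric}")


# Small gradient, energy still changing, but the step is already tiny.
print(is_converged("Baker", gradient=1e-4, delta_energy=1e-4, step=1e-4))
```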

optimizer_reset_interval

(Expert feature) The coordinate system will be regenerated and the Hessian reset every n iterations. If left blank, the optimizer never resets.

coordinate_system

The coordinate system that the geometry optimization algorithm uses to perform the calculations, the options are:

Cartesian, NaturalInternal, DelocalisedInternal

DelocalisedInternal is the default and is strongly recommended.

hessian_guess

The options are the identity guess (not recommended), the scaled identity guess (default and recommended), and the Schlegel and Lindh model Hessians (also not recommended).

algorithm

The options are trust region augmented Hessian (TRAH, not recommended) or eigenvector following (recommended and default). For example, to select TRAH:

"optimization": {
  "max_iters": 200,
  "algorithm": "TrustRegionAugmentedHessian",
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}

trust_region_keywords

Only applicable when the TRAH optimization algorithm is selected. Users are not advised to change these, as the defaults have already been optimized.

initial_radius

Default value: 0.4

max_radius

Default value: 1e5

min_radius

Default value: 1e-5

increase_factor

Default value: 1.2

decrease_factor

Default value: 0.7

constrict_factor

Default value: 0.1

increase_threshold

Default value: 0.75

decrease_threshold

Default value: 0.25

rejection_threshold

Default value: 0.0

dynamics

Dynamics Keywords
Keyword	Type	Brief
n_timesteps	int	Number of timesteps to perform.
dt	double	Timestep size in ps; default 0.001 ps (1 fs).
use_async_timesteps	bool	Perform asynchronous timesteps. Expert keyword; use with care!

"dynamics": {
  "n_timesteps": 10,
  "use_async_timesteps": false,
  "dt": 0.002
}

n_timesteps

Number of timesteps to perform.

dt

Timestep size in ps; default 0.001 ps (1 fs).
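The total simulated time follows directly from n_timesteps and dt. A minimal sketch, assuming dt is given in picoseconds as the default above suggests; the function name is hypothetical.

```python
def simulated_time_ps(n_timesteps: int, dt: float = 0.001) -> float:
    """Total simulated time in picoseconds for a run of n_timesteps steps."""
    return n_timesteps * dt


# The example input above: 10 steps of 0.002 ps each.
total_ps = simulated_time_ps(10, 0.002)
print(f"{total_ps:.3f} ps = {total_ps * 1000:.1f} fs")
```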

use_async_timesteps

Do asynchronous timesteps, careful expert keyword!

boundary

Boundary Keywords
Keyword	Type	Brief
x	std::optional<BoundaryCondition>	Boundary condition for the x dimension.
y	std::optional<BoundaryCondition>	Boundary condition for the y dimension.
z	std::optional<BoundaryCondition>	Boundary condition for the z dimension.

"boundary": {
  "x": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "y": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "z": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  }
}

x

Boundary condition for the x dimension.

y

Boundary condition for the y dimension.

z

Boundary condition for the z dimension.
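For a Periodic boundary, a coordinate that leaves the range on one side re-enters on the other. The sketch below shows the standard wrapping arithmetic for one dimension. It is a generic illustration that assumes the half-open range [lower, upper); it is not taken from the EXESS source.

```python
def wrap(coord: float, lower: float, upper: float) -> float:
    """Map a coordinate into the periodic range [lower, upper)."""
    length = upper - lower
    return lower + (coord - lower) % length


# With the example range of -2 to 3 the box length is 5.
print(wrap(3.5, -2.0, 3.0))   # just past the upper edge
print(wrap(-2.5, -2.0, 3.0))  # just past the lower edge
print(wrap(0.0, -2.0, 3.0))   # already inside, unchanged
```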

ff

Force Field Keywords
Keyword	Type	Brief
ff_filename	std::string	Forcefield filename path.

ff_filename

Forcefield filename path.

log

Log Keywords
Keyword	Type	Brief
console	LogSettings	Log settings for console output.
logfiles	std::vector<FileLogSettings>	Log settings for file output.

Example log keywords

"log": {
  "console": {
    "level": "Verbose"
  },
  "logfiles": [
    { "level": "Verbose",
      "prefix_fmt":   "[%Y-%m-%d %H:%M:%S.{us} r{rank} {level}] ",
      "directory": "/tmp/exess"
    }
  ]
}

console

Log settings for console output. The options for the level of printing are (in descending order of verbosity):

Debug

Verbose

LargeInfo

Info

Performance

Warning

logfiles

Log settings for file output.
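The prefix_fmt string in the example mixes strftime-style date codes with brace placeholders. The sketch below shows one way such a prefix could be rendered. It is an assumption about the semantics, not the EXESS implementation: {us} is taken to be zero-padded microseconds, {rank} the process rank, and {level} the log level name.

```python
from datetime import datetime


def render_prefix(prefix_fmt: str, when: datetime, rank: int, level: str) -> str:
    """Expand the % date codes first, then fill in the brace placeholders."""
    with_time = when.strftime(prefix_fmt)
    return with_time.format(us=f"{when.microsecond:06d}", rank=rank, level=level)


fmt = "[%Y-%m-%d %H:%M:%S.{us} r{rank} {level}] "
stamp = datetime(2024, 1, 2, 3, 4, 5, 6)
print(render_prefix(fmt, stamp, rank=0, level="Verbose") + "message text")
```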

rtat

RTAT is an open source, free to use utility for performance autotuning of matrix algebra routines, available at https://github.com/csnowdon2/rtatblas. RTAT is interfaced with EXESS and, if enabled, it will find the optimal configuration for running BLAS operations on the GPU.

RTAT Keywords
Keyword	Type	Brief
enabled	bool	Whether runtime autotuning of matrix multiplication is enabled.
synchronous	bool	Use synchronous operations.
json_file_dump_prefix	std::optional<std::string>	Prefix of file to dump RTAT JSON to.

"rtat": {
  "enabled": true,
  "synchronous": true,
  "json_file_dump_prefix": "prefix"
}

enabled

Whether runtime autotuning of matrix multiplication is enabled.

synchronous

Use synchronous operations.

json_file_dump_prefix

Prefix of file to dump RTAT JSON to.

debug

"debug": {
  "dry_run": false,
  "print_subfragment_xyz": false,
  "max_fragments": 10000,
  "skip_calcs": false
}

dry_run

Enables check runs: no computation is performed, but the program goes through all the fragments and reports how many would be evaluated. This can be used to spot errors in the input file preparation. Default should be false.

print_subfragment_xyz

Print subfragment geometries as debug (for superposition_subfragment_densities). Default is false.

max_fragments

Number of fragments to compute. This can be used for systems without covalent bond breaking to calculate a subset of fragments. Default is the total number of fragments.

ignore_fragments

Ignore fragmentation. This is mostly used by developers to validate that fragmented calculations match full system calculations.

skip_calcs

For debugging purposes, skips all computations in a fragmentation routine. Useful for checking that the fragment queue does not take ages to run.

export

Export Keywords
Keyword	Type	Brief
export_density	bool	Export density.
export_relaxed_mp2_density_correction	bool	Export relaxed MP2 density correction.
export_fock	bool	Export Fock matrix.
export_overlap	bool	Export overlap matrix.
export_h_core	bool	Export H core matrix.
export_expanded_density	bool	Export expanded density.
export_expanded_gradient	bool	Export expanded gradient.
export_molecular_orbital_coeffs	bool	Export molecular orbital coefficients.
export_gradient	bool	Export gradient.
export_mulliken_charges	bool	Export Mulliken charges.
export_bond_orders	bool	Export bond orders.
export_h_caps	bool	Export H caps.
export_density_descriptors	bool	Export density descriptors.
export_esp_descriptors	bool	Export ESP descriptors.
export_basis_labels	bool	Export basis labels.
flatten_symmetric	bool	When exporting square symmetric matrices, save memory by exporting the flattened lower triangle of the matrix. Default should be true.
concatenate_hdf5_files	bool	Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive.
descriptor_grid	Array[Doubles]	Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles.

export_density

Export density.

export_relaxed_mp2_density_correction

Export relaxed MP2 density correction.

export_fock

Export Fock matrix.

export_overlap

Export overlap matrix.

export_h_core

Export H core matrix.

export_expanded_density

Export expanded density.

export_expanded_gradient

Export expanded gradient.

export_molecular_orbital_coeffs

Export molecular orbital coefficients.

export_gradient

Export gradient.

export_mulliken_charges

Export Mulliken charges.

export_bond_orders

Export bond orders.

export_h_caps

Export H caps.

export_density_descriptors

Export density descriptors.

export_esp_descriptors

Export ESP descriptors.

export_basis_labels

Export basis labels.

flatten_symmetric

When exporting square symmetric matrices, save memory by exporting the flattened lower triangle of the matrix. Default should be true.
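A flattened lower triangle stores n(n+1)/2 values instead of n*n. The sketch below shows the idea together with the inverse operation needed when reading the data back. The row-by-row ordering used here is an assumption; check the exported files for the ordering EXESS actually uses.

```python
def flatten_lower(matrix):
    """Lower triangle (diagonal included) of a square matrix, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


def unflatten_lower(flat, n):
    """Rebuild the full symmetric n x n matrix from its flattened lower triangle."""
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = flat[k]
            k += 1
    return matrix


symmetric = [[1.0, 2.0, 4.0],
             [2.0, 3.0, 5.0],
             [4.0, 5.0, 6.0]]
flat = flatten_lower(symmetric)
print(flat)                      # 6 values instead of 9
print(unflatten_lower(flat, 3))  # the original matrix
```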

concatenate_hdf5_files

Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive.

descriptor_grid

Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles.