keywords
The keywords group contains the most important controls for the overall calculation. The keywords contained within this group are:
Keyword | Brief |
---|---|
scf | Controls SCF behaviour. See scf |
frag | Controls fragmentation behaviour. See frag |
guess | Controls guess behaviour. See guess |
optimization | Controls optimization behaviour. See optimization |
dynamics | Controls AIMD behaviour. See dynamics |
boundary | Controls periodic boundary condition behaviour. See boundary |
ff | Controls classical force fields. See ff |
log | Controls the logging of the program. See log |
export | Controls what quantities are exported to HDF5 files. See export |
rtat | Controls the runtime autotune of matrices. See rtat |
debug | Controls debug settings. See debug |
scf
Keyword | Type | Brief |
---|---|---|
max_iters | Double | Max number of SCF iterations (default: 30) |
max_diis_history_length | Int | Max size of DIIS window (default: 8) |
batch_size | Int | Number of integral shell pairs per batch (default: 2560) |
convergence_metric | String | Metric to determine convergence. Options: [“DIIS”, “Energy”, “Density”] |
convergence_threshold | Double | SCF convergence threshold (default: 1E-6) |
density_threshold | Double | Threshold at which integrals are screened (default: 1E-10) |
density_basis_set_projection_fallback_enabled | Bool | Fall back to basis set projection if SCF is unconverged (default: True) |
fock_build_type | String | Select type of Fock build algorithm. Options: [“HGP”, “UM09”, “RI”] |
compress_ri_b | Bool | Compress the B matrix for RI-HF (default: False) |
Example SCF keywords
"scf": {
  "max_iters": 40,
  "max_diis_history_length": 12,
  "convergence_threshold": 1e-6,
  "density_threshold": 1e-10,
  "density_basis_set_projection_fallback_enabled": false,
  "fock_build_type": "RI",
  "compress_ri_b": false,
  "convergence_metric": "DIIS"
}
max_iters
This keyword controls the maximum number of SCF iterations that the program will perform. The default is 30; it can be adjusted depending on the convergence_threshold chosen.
max_diis_history_length
Use this keyword to control the size of the DIIS extrapolation space, i.e. how many past iteration matrices will be used to extrapolate the Fock matrix. The default is set to 8. A larger number will result in slightly higher memory use. This can become a problem when dealing with very large systems without fragmentation.
batch_size
Controls the number of shell pair batches stored in the shell-pair batch bin container. See 10.1021/acs.jctc.0c00768, 10.1021/acs.jctc.1c00720, and 10.1080/00268976.2022.2112987 for a more detailed explanation of the batch bin container. The default is 2560. It is best to increase or decrease this quantity in multiples of 128, for example by reducing it to 1280 or increasing it to 5120. Do not go below 128, as smaller values may interfere with some default values in the CUDA/HIP code, such as block sizes of 128.
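The multiple-of-128 guideline can be checked before submitting a job. The following is a minimal sketch, not part of EXESS; the helper name `is_recommended_batch_size` is hypothetical and only encodes the rule of thumb stated above.

```python
def is_recommended_batch_size(batch_size: int, multiple: int = 128) -> bool:
    """Return True if batch_size is at least `multiple` and divisible by it.

    Hypothetical helper: it encodes only the documented rule of thumb
    (multiples of 128, never below 128), not any check done by the program.
    """
    return batch_size >= multiple and batch_size % multiple == 0


for candidate in (2560, 1280, 5120, 100, 1300):
    print(candidate, is_recommended_batch_size(candidate))
```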
convergence_metric
Choose from the following convergence metrics:
Energy
Density
DIIS
The SCF is considered converged when the change in energy or density between consecutive iterations, or the DIIS error, falls below the convergence threshold specified by the user. The default is DIIS. Using Energy as the convergence metric can lead to premature convergence, which can produce poor-quality orbitals for MP2 calculations.
convergence_threshold
Threshold at which the difference between SCF iterations is taken to be zero. The default value is 1E-6, which is a good threshold, especially if you are using Density or DIIS as the convergence metric. A “tight” convergence uses 1E-8 and a very tight one uses 1E-10. Anything beyond 1E-10 is highly dependent on the use case. The following general guidelines are given:
For non-fragmented RHF + RI-MP2 calculations, 1E-6 is enough for accurate results if using Density or DIIS as the convergence metric.
For non-fragmented RHF + RI-MP2 calculations, 1E-8 is recommended if using Energy as the convergence metric.
For dimer-level RHF + RI-MP2 calculations, 1E-6 is adequate to produce accurate results. The Energy convergence metric is not advised.
For trimer- and tetramer-level RHF + RI-MP2 calculations, 1E-8 is recommended for accurate results. Use DIIS as the convergence metric.
For large-scale tetramer-level fragmented calculations, 1E-10 is worth trying. Use DIIS as the convergence metric.
density_threshold
In addition to the Cauchy-Schwarz screening, the integrals are further screened against the density matrix inside each integral kernel. This threshold controls the value below which an integral is considered negligible. The default is 1E-10. Loosening this threshold (using a value larger than 1E-10) will lead to significantly faster SCF times at the possible cost of accuracy. Before settling on a looser value, PLEASE test for accuracy against the 1E-10 value. We are not responsible if you get wrong answers because you didn’t check.
Tightening it to 1E-11 or 1E-12 will lead to longer SCF times because more integrals will be evaluated. However, for methods such as tetramer-level MBE this can improve the accuracy of the program. It will also produce crisper orbitals for MP2 calculations.
density_basis_set_projection_fallback_enabled
Our longest keyword in the code. If a calculation does not converge with a given combination of basis set, convergence metric, and convergence threshold, the program will fall back to repeating the calculation in the STO-3G basis set and projecting the result up to the original basis set.
fock_build_type
Select the algorithm to be used for the Fock build portion of the calculation. The options are:
HGP (Head-Gordon-Pople): this algorithm is heavily optimized for dense chemical systems, i.e. those where screening does not play a fundamental role, such as RNA molecules. A counterexample, where screening is heavy, is a glycine chain.
UM09 (Ufimtsev-Martinez algorithm from 2008): this is a heavily GPU-oriented algorithm based on the McMurchie-Davidson integral evaluation method. It features a linear-scaling K (exchange) build and a heavily optimized J (Coulomb) build. This algorithm shines in large systems where screening plays a key role, such as long glycine chains.
RI (resolution of the identity): replaces the four-center two-electron integrals with a set of three- and two-center ones, greatly reducing the computational cost but dramatically increasing the memory requirements. See below for RI details.
Before using the fock_build_type of RI, please read this description fully. Using the RI approximation for the HF portion of the program has the following effects:
Integrals will be computed once and stored either on the host or the device
For small systems, RI-HF will be significantly faster than HF.
The memory use of the code will skyrocket, since the integrals are stored
You will need to specify an auxiliary basis set
Please also see the section on RI-HF performance and the article in LINK_HERE for more details on where RI-HF will be advantageous.
compress_ri_b
Set this to true if you want the B matrix to be compressed. This might misbehave; use with caution.
frag
Keyword | Type | Brief |
---|---|---|
level | Int | Fragmentation level. Options [1,2,3,4]. Default = 2 |
reference_fragment | Int | (Optional) Indicates reference fragment for interaction energy calculations |
included_fragments | Array[Int] | An array of fragment ids that are to be included in the calculation |
enable_speed | Bool | Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations |
cutoffs | {String,Double} | Subgroup to specify the distance cutoffs for each fragmentation level |
cutoff_type | {String} | Distance cutoff metric (MinimalDistance or Centroid) |
Example frag keywords
"frag": {
  "cutoff_type": "Centroid",
  "level": "Tetramer",
  "enable_speed": false,
  "cutoffs": {
    "dimer": 1000,
    "trimer": 20,
    "tetramer": 15
  },
  "included_fragments": [0, 1, 2, 3, 4]
}
level
Truncation level of the MBE. Options are:
Dimer n(n-1)/2 total dimers
Trimer n(n-1)(n-2)/6 total trimers
Tetramer n(n-1)(n-2)(n-3)/24 total tetramers
If no cutoffs are specified, ALL fragments will be calculated. Be cautious with how many fragments you’re calculating so you don’t unintentionally burn compute time. The higher the truncation level, the more accurate the energies you will get with fragmentation. See the cutoffs section for more details.
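To get a feel for how quickly the work grows with the truncation level, the counting formulas above can be evaluated directly. This is a minimal sketch using binomial coefficients; the function name `polymer_counts` is hypothetical, and the counts assume no cutoffs are applied.

```python
from math import comb


def polymer_counts(n_fragments: int, level: str) -> dict:
    """Count the dimers, trimers and tetramers for a given MBE truncation level.

    Uses C(n, k), which equals n(n-1)/2, n(n-1)(n-2)/6 and
    n(n-1)(n-2)(n-3)/24 for k = 2, 3, 4. Assumes no distance cutoffs.
    """
    max_order = {"Dimer": 2, "Trimer": 3, "Tetramer": 4}[level]
    names = {2: "dimers", 3: "trimers", 4: "tetramers"}
    return {names[k]: comb(n_fragments, k) for k in range(2, max_order + 1)}


# 20 fragments at tetramer level: 190 dimers, 1140 trimers, 4845 tetramers.
print(polymer_counts(20, "Tetramer"))
```

Even for a modest 20 fragments the tetramer count dominates, which is why distance cutoffs matter at higher levels.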
reference_fragment
To perform a lattice energy calculation or an interaction energy calculation, you can specify a reference fragment. The program will take that fragment and calculate all the polymers that contain it. The final energy will be the sum of the n-mer corrections of that reference fragment with respect to the entire system.
Such calculations can be used to determine the stability of a molecular complex, or the overall interaction of a molecule with the entire system. The general convention that a negative energy means binding and a positive one means repulsion holds here.
included_fragments
For non-covalently bonded fragmented calculations, this keyword can be used to limit the number of fragments that the calculation will evaluate. By default, all fragments in the topology will be evaluated. To specify a set of included fragments, use an array of integers corresponding to the fragment ids in the topology. For example:
"included_fragments": [0,1,2,3,4,5]
This will evaluate the first 6 fragments only and ignore the rest.
enable_speed
Enables experimental fragmentation optimization to speed up building the queue in AIMD simulations.
cutoffs
The cutoffs control the distance beyond which a polymer won’t be calculated. They are specified by:
"level": "Tetramer",
"cutoffs": {
  "dimer": 40,
  "trimer": 30,
  "tetramer": 15
}
All distances are in angstroms. The dimer cutoff should always be the largest. As a rule: dimer > trimer > tetramer.
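The ordering rule for the cutoffs is easy to verify before a run. The following is a minimal sketch; `cutoffs_are_ordered` is a hypothetical helper, and it is not known whether the program itself enforces this rule.

```python
def cutoffs_are_ordered(cutoffs: dict) -> bool:
    """Check the rule dimer > trimer > tetramer for the cutoffs that are present.

    Hypothetical pre-flight check; distances are assumed to be in angstroms.
    """
    order = ["dimer", "trimer", "tetramer"]
    values = [cutoffs[name] for name in order if name in cutoffs]
    # Every cutoff must be strictly larger than the one for the next level.
    return all(a > b for a, b in zip(values, values[1:]))


print(cutoffs_are_ordered({"dimer": 40, "trimer": 30, "tetramer": 15}))
print(cutoffs_are_ordered({"dimer": 20, "trimer": 30, "tetramer": 15}))
```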
cutoff_type
There are two options for measuring the distance between fragments in EXESS. The keywords to select them are:
Centroid
MinimalDistance
"frag": {
  "cutoff_type": "Centroid"
}
Centroid computes the centroid of each fragment and compares the distance between centroids against the established cutoff.
MinimalDistance computes the minimal distance between the atoms of each fragment and those of the other fragments in the polymer. This is more accurate and is preferred.
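The difference between the two metrics can be illustrated for a pair of fragments. This is a minimal sketch under stated assumptions: fragments are plain lists of Cartesian coordinates, the centroid is the unweighted geometric mean of the atom positions, and the function names are hypothetical rather than taken from EXESS.

```python
from itertools import product
from math import dist


def centroid(coords):
    """Unweighted geometric centre of a list of (x, y, z) tuples (an assumption)."""
    n = len(coords)
    return tuple(sum(atom[i] for atom in coords) / n for i in range(3))


def centroid_distance(frag_a, frag_b):
    """Distance between the centroids of two fragments."""
    return dist(centroid(frag_a), centroid(frag_b))


def minimal_distance(frag_a, frag_b):
    """Smallest atom-atom distance between two fragments."""
    return min(dist(a, b) for a, b in product(frag_a, frag_b))


# Two elongated fragments lying along the x axis (coordinates in angstroms).
frag_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frag_b = [(3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
print(centroid_distance(frag_a, frag_b))  # 4.0
print(minimal_distance(frag_a, frag_b))   # 1.0
```

For extended fragments the centroids can be far apart even when the closest atoms are nearly in contact, so a cutoff that keeps the pair under one metric can discard it under the other.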
guess
Keyword | Type | Brief |
---|---|---|
external_initial_density_path | String | Path to external initial density. |
bsp | bool | Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false. |
bsp_basis | string | The desired lower resolution basis set for bsp. |
bsp_scf_keywords | SCFKeywords | SCF keywords (as specified in keywords/scf) used for the bootstrapping calculation (works with bsp). |
hcore | bool | Use hcore initial guess. |
smd | bool | Superposition of monomer density optimization where n-mer density guesses will be formed from the converged densities of their constituent monomers. Default should be true. |
ssfd | bool | Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false. |
ssfd_target_size | int | Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30. |
ssfd_only_converge_in_bsp_basis | bool | When SSFD and basis set projection are applied on top of each other, should the subfragment densities be left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true. |
ssfd_scf_keywords | GuessSCFKeywords | SCF keywords (as specified in keywords/scf) used for each subfragment calculation (works with ssfd). |
external_initial_density_path
Path to external initial density.
bsp
Enable basis set projection to bootstrap SCF calculation by first computing in a lower resolution basis set and projecting into higher basis set. Default should be false.
bsp_basis
The desired lower resolution basis set for bsp.
bsp_scf_keywords
SCF keywords (as specified in keywords/scf) used for the bootstrapping calculation (works with bsp).
hcore
Use hcore initial guess.
smd
Superposition of monomer density optimization where n-mer density guesses will be formed from the converged densities of their constituent monomers. Default should be true.
ssfd
Initial monomer guess based on converged densities of auto-generated subfragments (experimental feature). Default should be false.
ssfd_target_size
Target number of atoms in subfragments produced under auto-fragmentation (works with ssfd). Default should be 30.
ssfd_only_converge_in_bsp_basis
When SSFD and basis set projection are applied on top of each other, should the subfragment densities be left unconverged in the primary basis and only projected from the bootstrap basis. Default should be true.
ssfd_scf_keywords
SCF keywords (as specified in keywords/scf) used for each subfragment calculation (works with ssfd).
optimization
Keyword | Type | Brief |
---|---|---|
max_iters | size_t | Max number of iterations. |
convergence_criteria | OptimizationConvergenceCriteria | Optimizer convergence criteria: metric, gradient threshold, delta energy threshold, step component threshold. |
optimizer_reset_interval | std::optional<size_t> | Number of iterations before the optimizer is reset. |
coordinate_system | CoordinateSystem | Cartesian, NaturalInternal and DelocalisedInternal (delocalised internal is the default and is strongly recommended) |
hessian_guess | HessianGuessType | Type of hessian guess. |
algorithm | OptimizationAlgorithmType | Type of optimization algorithm. |
trust_region_keywords | std::optional<TrustRegionKeywords> | Trust region keywords. |
"optimization": {
  "max_iters": 200,
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}
max_iters
Max number of iterations.
convergence_criteria
metric
“GradientOnly” considers only the gradient term. “Baker” requires the gradient component to be within its threshold and either the step component or the delta energy to be within its own threshold. The thresholds are set by the subfields below.
gradient_threshold
Numerical value at which the gradient is considered converged.
delta_energy_threshold
Numerical value at which the energy is considered converged.
step_component_threshold
Numerical value at which the step change between geometry optimization steps is considered converged.
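The logic of the two metrics can be written out explicitly. This is a minimal sketch, not the program's implementation: the function name `is_converged` is hypothetical, the default thresholds are copied from the example above, and "component" is assumed to mean the largest absolute component of the gradient or step vector.

```python
def is_converged(metric, gradient, step, delta_energy,
                 gradient_threshold=5.66918e-4,
                 delta_energy_threshold=1e-6,
                 step_component_threshold=1.2e-3):
    """Evaluate the GradientOnly or Baker criterion for one optimization step.

    Assumption: gradient and step are tested on their largest absolute component.
    """
    grad_ok = max(abs(g) for g in gradient) <= gradient_threshold
    if metric == "GradientOnly":
        return grad_ok
    if metric == "Baker":
        step_ok = max(abs(s) for s in step) <= step_component_threshold
        energy_ok = abs(delta_energy) <= delta_energy_threshold
        # Gradient must pass, plus at least one of step or energy change.
        return grad_ok and (step_ok or energy_ok)
    raise ValueError(f"unknown metric: {metric}")


# Small gradient and tiny energy change, but a step that is still large.
print(is_converged("Baker", [1e-4, -2e-4], [5e-3, 0.0], 1e-8))
```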
optimizer_reset_interval
(Expert feature) The coordinate system will be regenerated and the Hessian reset every n iterations. If left blank, the optimizer is never reset.
coordinate_system
The coordinate system that the geometry optimization algorithm uses to perform the calculations, the options are:
Cartesian, NaturalInternal, DelocalisedInternal
DelocalisedInternal is the default and is strongly recommended.
hessian_guess
The options are an identity guess (not recommended), a scaled identity guess (default and recommended), and the Schlegel and Lindh model Hessians (also not recommended).
algorithm
Either trust region augmented Hessian (not recommended) or eigenvector following (recommended and default):
"optimization": {
  "max_iters": 200,
  "algorithm": "TrustRegionAugmentedHessian",
  "convergence_criteria": {
    "metric": "Baker",
    "gradient_threshold": 5.66918e-4,
    "delta_energy_threshold": 1e-6,
    "step_component_threshold": 1.2e-3
  }
}
trust_region_keywords
Only applicable when the TRAH optimization algorithm is selected. Users are not advised to change these, as the defaults have been optimized.
initial_radius
Default value: 0.4
max_radius
Default value: 1e5
min_radius
Default value: 1e-5
increase_factor
Default value: 1.2
decrease_factor
Default value: 0.7
constrict_factor
Default value: 0.1
increase_threshold
Default value: 0.75
decrease_threshold
Default value: 0.25
rejection_threshold
Default value: 0.0
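The exact role of each trust region parameter is not described here. The following sketch shows how parameters with these names are used in a textbook trust-region radius update; it is an illustration under that assumption, not the EXESS implementation. The class `TrustRegion`, its `update` method, and the meaning given to `constrict_factor` and `rejection_threshold` are all hypothetical; only the default values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class TrustRegion:
    """Textbook trust-region radius controller using the documented defaults."""
    radius: float = 0.4              # initial_radius
    max_radius: float = 1e5
    min_radius: float = 1e-5
    increase_factor: float = 1.2
    decrease_factor: float = 0.7
    constrict_factor: float = 0.1
    increase_threshold: float = 0.75
    decrease_threshold: float = 0.25
    rejection_threshold: float = 0.0

    def update(self, rho: float) -> bool:
        """Update the radius from rho = actual / predicted energy change.

        Returns True if the step is accepted. Assumed semantics: reject and
        constrict when rho <= rejection_threshold, shrink for poor agreement,
        grow for good agreement, otherwise leave the radius unchanged.
        """
        accepted = rho > self.rejection_threshold
        if not accepted:
            factor = self.constrict_factor
        elif rho < self.decrease_threshold:
            factor = self.decrease_factor
        elif rho > self.increase_threshold:
            factor = self.increase_factor
        else:
            factor = 1.0
        # Keep the radius inside [min_radius, max_radius].
        self.radius = min(self.max_radius, max(self.min_radius, self.radius * factor))
        return accepted


tr = TrustRegion()
print(tr.update(0.9), tr.radius)
```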
dynamics
Keyword | Type | Brief |
---|---|---|
n_timesteps | int | Number of timesteps to perform. |
dt | double | Timestep size, default 1fs (0.001 ps). |
use_async_timesteps | bool | Do asynchronous timesteps, careful expert keyword! |
"dynamics": {
  "n_timesteps": 10,
  "use_async_timesteps": false,
  "dt": 0.002
}
n_timesteps
Number of timesteps to perform.
dt
Timestep size, default 1fs (0.001 ps).
use_async_timesteps
Do asynchronous timesteps, careful expert keyword!
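The total simulated time follows directly from n_timesteps and dt. This is a minimal sketch assuming that dt is given in picoseconds, as the default of 0.001 ps suggests; the function name is hypothetical.

```python
def total_simulated_time_ps(n_timesteps: int, dt_ps: float = 0.001) -> float:
    """Total simulated time in picoseconds, assuming dt is expressed in ps."""
    return n_timesteps * dt_ps


# The example above: 10 steps of 0.002 ps each, converted to femtoseconds.
total_ps = total_simulated_time_ps(10, 0.002)
print(total_ps * 1000.0, "fs")
```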
boundary
Keyword | Type | Brief |
---|---|---|
x | std::optional<BoundaryCondition> | Boundary condition for the x dimension. |
y | std::optional<BoundaryCondition> | Boundary condition for the y dimension. |
z | std::optional<BoundaryCondition> | Boundary condition for the z dimension. |
"boundary": {
  "x": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "y": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  },
  "z": {
    "kind": "Periodic",
    "range": {
      "lower": -2,
      "upper": 3
    }
  }
}
x
Boundary condition for the x dimension.
y
Boundary condition for the y dimension.
z
Boundary condition for the z dimension.
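For a Periodic boundary, a coordinate that leaves the range on one side re-enters on the other. The sketch below shows the usual wrapping arithmetic for one dimension; it assumes that `lower` and `upper` delimit the periodic cell and is not taken from the EXESS source. The function name `wrap_coordinate` is hypothetical.

```python
def wrap_coordinate(value: float, lower: float, upper: float) -> float:
    """Map a coordinate into the half-open interval [lower, upper).

    Assumption: lower and upper are the edges of the periodic cell.
    """
    length = upper - lower
    if length <= 0:
        raise ValueError("upper must be greater than lower")
    # Python's % returns a non-negative result for a positive divisor.
    return lower + (value - lower) % length


# With the range from the example above (lower = -2, upper = 3):
print(wrap_coordinate(3.5, -2, 3))   # -1.5
print(wrap_coordinate(-2.5, -2, 3))  # 2.5
```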
ff
Keyword | Type | Brief |
---|---|---|
ff_filename | std::string | Forcefield filename path. |
ff_filename
Forcefield filename path.
log
Keyword | Type | Brief |
---|---|---|
console | LogSettings | Log settings for console output. |
logfiles | std::vector<FileLogSettings> | Log settings for file output. |
Example log keywords
"log": {
  "console": {
    "level": "Verbose"
  },
  "logfiles": [
    {
      "level": "Verbose",
      "prefix_fmt": "[%Y-%m-%d %H:%M:%S.{us} r{rank} {level}] ",
      "directory": "/tmp/exess"
    }
  ]
}
console
Log settings for console output. The options for the level of printing are (in descending order of verbosity):
Debug
Verbose
LargeInfo
Info
Performance
Warning
logfiles
Log settings for file output.
rtat
RTAT is an open-source, free-to-use utility for performance autotuning of matrix algebra routines. It is available at https://github.com/csnowdon2/rtatblas. RTAT is interfaced with EXESS and, if enabled, it will find the optimal configuration to run BLAS operations on the GPU.
Keyword | Type | Brief |
---|---|---|
enabled | bool | Whether runtime autotuning of matrix multiplication is enabled. |
synchronous | bool | Use synchronous operations. |
json_file_dump_prefix | std::optional<std::string> | Prefix of file to dump RTAT JSON to. |
"rtat": {
  "enabled": true,
  "synchronous": true,
  "json_file_dump_prefix": "prefix"
}
enabled
Whether runtime autotuning of matrix multiplication is enabled.
synchronous
Use synchronous operations.
json_file_dump_prefix
Prefix of file to dump RTAT JSON to.
debug
"debug": {
  "dry_run": false,
  "print_subfragment_xyz": false,
  "max_fragments": 10000,
  "skip_calcs": false
}
dry_run
Enable check runs. No computation is performed, but the program goes through all the fragments and returns how many will be evaluated. Can be used to spot errors in the input file preparation. Default should be false.
print_subfragment_xyz
Print subfragment geometries as debug (for superposition_subfragment_densities). Default is false.
max_fragments
Number of fragments to compute. Can be used for systems without covalent bond breaking to calculate a subset of fragments. Default is the total number of fragments.
ignore_fragments
Ignore fragmentation. This is mostly used by developers to validate that fragmented calculations match full system calculations.
skip_calcs
All computations in a fragmentation routine will be skipped. Useful for debugging or for checking that the fragment queue does not take ages to run.
export
Keyword | Type | Brief |
---|---|---|
export_density | bool | Export density. |
export_relaxed_mp2_density_correction | bool | Export relaxed MP2 density correction. |
export_fock | bool | Export Fock matrix. |
export_overlap | bool | Export overlap matrix. |
export_h_core | bool | Export H core matrix. |
export_expanded_density | bool | Export expanded density. |
export_expanded_gradient | bool | Export expanded gradient. |
export_molecular_orbital_coeffs | bool | Export molecular orbital coefficients. |
export_gradient | bool | Export gradient. |
export_mulliken_charges | bool | Export Mulliken charges. |
export_bond_orders | bool | Export bond orders. |
export_h_caps | bool | Export H caps. |
export_density_descriptors | bool | Export density descriptors. |
export_esp_descriptors | bool | Export ESP descriptors. |
export_basis_labels | bool | Export basis labels. |
flatten_symmetric | bool | When exporting square symmetric matrices save memory by exporting the flattened lower triangle of the matrix. Default should be true. |
concatenate_hdf5_files | bool | Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive. |
descriptor_grid | Array[Doubles] | Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles. |
export_density
Export density.
export_relaxed_mp2_density_correction
Export relaxed MP2 density correction.
export_fock
Export Fock matrix.
export_overlap
Export overlap matrix.
export_h_core
Export H core matrix.
export_expanded_density
Export expanded density.
export_expanded_gradient
Export expanded gradient.
export_molecular_orbital_coeffs
Export molecular orbital coefficients.
export_gradient
Export gradient.
export_mulliken_charges
Export Mulliken charges.
export_bond_orders
Export bond orders.
export_h_caps
Export H caps.
export_density_descriptors
Export density descriptors.
export_esp_descriptors
Export ESP descriptors.
export_basis_labels
Export basis labels.
flatten_symmetric
When exporting square symmetric matrices save memory by exporting the flattened lower triangle of the matrix. Default should be true.
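A flattened lower triangle stores n(n+1)/2 values instead of n*n. The sketch below shows one way to flatten and rebuild a symmetric matrix; the row-by-row element ordering is an assumption and may differ from the layout the program writes to HDF5, and both function names are hypothetical.

```python
def flatten_lower_triangle(matrix):
    """Return the lower triangle (diagonal included), row by row, as a flat list."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


def unflatten_lower_triangle(flat, n):
    """Rebuild the full symmetric n x n matrix from its flattened lower triangle."""
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            # Mirror each element across the diagonal.
            matrix[i][j] = matrix[j][i] = flat[k]
            k += 1
    return matrix


symmetric = [[1, 2, 3],
             [2, 4, 5],
             [3, 5, 6]]
flat = flatten_lower_triangle(symmetric)
print(flat)  # [1, 2, 4, 3, 5, 6]
```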
concatenate_hdf5_files
Post-process exports into a single HDF5 output file. This is relevant for fragmented runs (particularly when configured for multinode). The concatenation of the HDF5 files may be expensive.
descriptor_grid
Descriptor grid, can be StandardGrid, GridParams, or a vector of doubles.