Optimal Grid Refinement tool to minimize the computational effort for free energy evaluation methods that require an overlap of simulated probability densities

SanderBorgmans/OGRe

OGRe, the "Optimal Grid Refinement tool", is a standalone tool to minimize the computational effort for free energy evaluation methods that require an overlap of simulated probability densities. It automatically generates a grid for umbrella sampling simulations and processes the simulations on that grid to refine the grid parameters. In this way, a converged free energy profile can be obtained.

The corresponding manuscript has been published in the Journal of Chemical Theory and Computation: https://doi.org/10.1021/acs.jctc.3c01028

INSTALLATION

OGRe requires the following packages to work:

  • cython, numpy, scipy, h5py, matplotlib, pyyaml
  • molmod
  • ThermoLIB

$ conda create -n ogre python=3.10
$ conda activate ogre
$ conda install pip
$ pip install Cython numpy
$ pip install molmod
$ pip install scipy pyyaml matplotlib h5py
$ pip install git+https://github.com/SanderBorgmans/OGRe.git

INPUT

You can generate the initial layer file for your system with the provided ogre_input.py script. By specifying the kappa value, the spacing between the initial grid points, and the boundaries for each of your CVs, the layer00.txt file is generated. This constitutes the first layer of your grid. It is copied to the run.txt file, which contains all information for your first iteration of simulations.

Aside from these CV parameters, additional hyperparameters have to be provided for the postprocessing. They determine which points to refine, how to refine them, and the maximum number of layers. These should also be passed as arguments to the input script, but can be tweaked later in the data.yml file that is generated by running the input script.

Simulation parameters

kappas [floats, comma separated list]
the umbrella constants for each CV in units of fes_unit/cv_units**2

spacings [floats, comma separated list]
the initial cv spacings of the first layer in cv_units

edges [floats, comma separated list]
the minimum and maximum values for the cvs in cv_units, i.e. --edges min_cv_1,max_cv_1,min_cv_2,max_cv_2 ...

cv_units [strings, comma separated list]
[default = 1] the units in which you specify your collective variables, with respect to the units from your MD code following the molmod unit conventions (defined wrt atomic units)
e.g. for an MD code that uses atomic units, you can simply use the name of the unit you want for your cv(s), --cv_units angstrom

fes_unit [string]
[default = kjmol] unit to express energies, with respect to the units from your MD and FES code following the molmod unit conventions (defined wrt atomic units)

runup [integer]
[default = 0] number of md steps to omit as equilibration time in the postprocessing
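
The units of kappas follow from the form of the umbrella bias. As a sketch, assuming the common harmonic convention with a factor 1/2 (this README does not state which convention your umbrella sampling code uses, and the symbols $q_i$ and $q_{i,0}$ for the value and umbrella centre of the i-th CV are illustrative):

$$
U_\mathrm{bias}(\mathbf{q}) = \sum_i \frac{\kappa_i}{2}\,\left(q_i - q_{i,0}\right)^2
$$

With each $q_i$ expressed in cv_units and $U_\mathrm{bias}$ in fes_unit, each $\kappa_i$ carries units of fes_unit/cv_units**2.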

Hyperparameters

CONFINEMENT_THR [float]
minimal percentage of the simulation that should be contained in the hypervolume defined by all the surrounding grid points to be considered as non-deviating

OVERLAP_THR [float]
minimal percentage for the overlap of the histograms of two neighbouring trajectories

KAPPA_GROWTH_FACTOR [float]
factor by which the kappa value is multiplied if the trajectory is deviating

MAX_LAYERS [integer]
[default = 1] maximum number of grid layers that will be generated by the program, this in turn defines the minimal step size for each CV

MAX_KAPPA [float]
[default = None] maximum value for kappa, if the protocol would attempt to increase the kappa value for a deviating simulation above this value, the free energy for this region is simply considered too high, and further refinement of this region is halted

HISTOGRAM_BIN_WIDTHS [float]
[default = spacings/MAX_LAYERS/2] the bin width used in calculating the overlap integral

Should the final refinement not be sufficient, the MAX_LAYERS can always be adapted at a later stage in the data.yml file. It is however important to keep the HISTOGRAM_BIN_WIDTHS constant to avoid inconsistent overlap analyses.
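
The interplay between KAPPA_GROWTH_FACTOR and MAX_KAPPA can be illustrated with a short sketch. The function name `grow_kappa` and the value 10.0 for MAX_KAPPA are hypothetical and chosen for illustration only; this is not the code OGRe runs internally, and it treats a single kappa value rather than one per CV.

```python
def grow_kappa(kappa, growth_factor, max_kappa=None):
    """Return the increased kappa, or None if refinement should halt.

    Hypothetical helper: mirrors the rule described above, where a
    deviating simulation gets kappa * KAPPA_GROWTH_FACTOR unless that
    would exceed MAX_KAPPA.
    """
    new_kappa = kappa * growth_factor
    if max_kappa is not None and new_kappa > max_kappa:
        return None  # free energy considered too high, stop refining here
    return new_kappa


# Repeatedly refine a deviating simulation until MAX_KAPPA would be exceeded
kappa = 1.0
history = [kappa]
while True:
    kappa = grow_kappa(kappa, growth_factor=2, max_kappa=10.0)
    if kappa is None:
        break
    history.append(kappa)

print(history)  # [1.0, 2.0, 4.0, 8.0]
```

With MAX_KAPPA left at its default of None, the check is skipped and kappa keeps growing for as long as the simulation is flagged as deviating.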

Example usage

ogre_input.py --kappas 1.0,5.0 --spacings 0.5,0.5 --edges -10.0,10.0,0.0,5.0 --CONFINEMENT_THR 0.25 --OVERLAP_THR 0.25 --KAPPA_GROWTH_FACTOR 2 --MAX_LAYERS 2
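
To get a feel for the size of the first layer that the command above requests, the following sketch builds a regular grid from the same edges, spacings, and kappas. It assumes that both edges are included as grid points and that every point starts with the same kappa values; the function names and the returned data layout are hypothetical and do not reflect the actual format of the generated layer file.

```python
import math
from itertools import product


def axis_points(lo, hi, spacing):
    """Evenly spaced points from lo up to hi (hi included when the spacing fits)."""
    # Small tolerance so that floating-point noise does not drop the last point
    n = int(math.floor((hi - lo) / spacing + 1e-9))
    return [lo + i * spacing for i in range(n + 1)]


def first_layer(edges, spacings, kappas):
    """Return (cv_position, kappas) tuples for a regular first-layer grid."""
    # edges is a flat list: min_cv_1, max_cv_1, min_cv_2, max_cv_2, ...
    bounds = list(zip(edges[0::2], edges[1::2]))
    axes = [axis_points(lo, hi, s) for (lo, hi), s in zip(bounds, spacings)]
    return [(point, tuple(kappas)) for point in product(*axes)]


grid = first_layer(
    edges=[-10.0, 10.0, 0.0, 5.0],
    spacings=[0.5, 0.5],
    kappas=[1.0, 5.0],
)
print(len(grid))  # 451 = 41 points along CV 1 times 11 points along CV 2
print(grid[0])    # ((-10.0, 0.0), (1.0, 5.0))
```

Under these assumptions the first iteration already requires several hundred simulations, which is why the initial spacings are worth choosing coarsely and leaving the refinement to later layers.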

SIMULATION

To perform the actual simulations, the user can choose their own preferred MD engine with their own umbrella sampling code. Should you choose to use Yaff, you can take a look at the main branch of this repository. The final MD data should be saved in a 'trajs' folder using the .h5 file format, with each simulation named after its identifier from the generated grid files. For instance, trajs/traj_0_0.h5 corresponds to the MD trajectory of grid 0, simulation 0, with the umbrella position and strength corresponding to the first simulation of grid00.txt, and so on for all other grid points. This .h5 file can contain all relevant data for your simulation, but should contain at least one dataset trajectory/cv_values corresponding to the values of your collective variables throughout the simulation.
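
If your MD engine does not write this file itself, a minimal trajectory file can be written with h5py along the following lines. The CV values here are random placeholders, and the array shape (number of MD steps, number of CVs) is an assumption; check it against what ogre_post.py expects for your setup.

```python
import os

import h5py
import numpy as np

# Placeholder CV time series: 1000 MD steps, 2 collective variables.
# In practice these values come from your own MD engine.
rng = np.random.default_rng(0)
cv_values = rng.normal(loc=[-10.0, 0.0], scale=0.1, size=(1000, 2))

# Grid 0, simulation 0 -> trajs/traj_0_0.h5
os.makedirs("trajs", exist_ok=True)
with h5py.File("trajs/traj_0_0.h5", "w") as f:
    # Intermediate groups ("trajectory") are created automatically
    f.create_dataset("trajectory/cv_values", data=cv_values)

# Read the dataset back to confirm the layout
with h5py.File("trajs/traj_0_0.h5", "r") as f:
    print(f["trajectory/cv_values"].shape)  # (1000, 2)
```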

POSTPROCESSING

To evaluate your collection of simulations, you can use the ogre_post.py script. It assesses the quality of each simulation by considering its deviation from the umbrella center and its overlap with the simulations of neighbouring grid points. After this evaluation, it will refine the kappa values of existing grid files for deviating simulations, and generate additional grid points in regions where the overlap was insufficient.

ogre_post.py --overlap
ogre_post.py --fes

You can add the --test flag when performing the overlap analysis to avoid creating, deleting, or replacing certain files or file entries. This can be convenient when tweaking the hyperparameters.
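
The two quality criteria can be illustrated for a single CV with a short NumPy sketch. It rests on several assumptions: the thresholds are treated as fractions between 0 and 1, the overlap is taken as the sum of bin-wise minima of two normalised histograms, and the confinement region is taken as one grid spacing on either side of the umbrella centre. The function names are hypothetical, and the actual analysis in ogre_post.py may differ in all of these respects.

```python
import numpy as np


def histogram_overlap(cv_a, cv_b, bin_width):
    """Overlap of two normalised 1D histograms: sum of bin-wise minima (0 to 1)."""
    lo = min(cv_a.min(), cv_b.min())
    hi = max(cv_a.max(), cv_b.max())
    # One extra bin guarantees that the largest sample falls inside the range
    n_bins = int(np.ceil((hi - lo) / bin_width)) + 1
    bins = lo + bin_width * np.arange(n_bins + 1)
    h_a, _ = np.histogram(cv_a, bins=bins)
    h_b, _ = np.histogram(cv_b, bins=bins)
    return np.minimum(h_a / h_a.sum(), h_b / h_b.sum()).sum()


def confined_fraction(cv, center, spacing):
    """Fraction of samples within one grid spacing of the umbrella centre."""
    return float(np.mean(np.abs(cv - center) <= spacing))


# Synthetic trajectories for two neighbouring grid points, 0.5 cv_units apart
rng = np.random.default_rng(0)
traj_a = rng.normal(0.0, 0.3, size=5000)
traj_b = rng.normal(0.5, 0.3, size=5000)

OVERLAP_THR = 0.25
CONFINEMENT_THR = 0.25

# Bin width 0.125 = spacing 0.5 / MAX_LAYERS 2 / 2, the documented default
overlap = histogram_overlap(traj_a, traj_b, bin_width=0.125)
confined = confined_fraction(traj_a, center=0.0, spacing=0.5)

print("sufficient overlap:", overlap >= OVERLAP_THR)
print("non-deviating:", confined >= CONFINEMENT_THR)
```

In this picture, a pair of neighbours failing the overlap test would trigger an additional grid point between them, while a trajectory failing the confinement test would have its kappa increased.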
