Welcome to the BLUES documentation!

Introduction

_images/blues.png

BLUES is a python package that takes advantage of non-equilibrium candidate Monte Carlo moves (NCMC) to help sample between different ligand binding modes.

Github

Github Check out our Github repository.

Issue If you have any problems or suggestions through our issue tracker.

Fork To contribute to our code, please fork our repository and open a Pull Request.

https://travis-ci.org/MobleyLab/blues.svg?branch=master Documentation Status https://codecov.io/gh/MobleyLab/blues/branch/master/graph/badge.svg https://anaconda.org/mobleylab/blues/badges/version.svg https://zenodo.org/badge/62096511.svg

Publication

_images/jp-2017-11820n_0015.gif

Binding Modes of Ligands Using Enhanced Sampling (BLUES): Rapid Decorrelation of Ligand Binding Modes via Nonequilibrium Candidate Monte Carlo

Samuel C. Gill, Nathan M. Lim, Patrick B. Grinaway, Ariën S. Rustenburg, Josh Fass, Gregory A. Ross, John D. Chodera , and David L. Mobley

Journal of Physical Chemistry B 2018 122 (21), 5579-5598

DOI: 10.1021/acs.jpcb.7b11820

Publication Date (Web): February 27, 2018

Abstract

Accurately predicting protein–ligand binding affinities and binding modes is a major goal in computational chemistry, but even the prediction of ligand binding modes in proteins poses major challenges. Here, we focus on solving the binding mode prediction problem for rigid fragments. That is, we focus on computing the dominant placement, conformation, and orientations of a relatively rigid, fragment-like ligand in a receptor, and the populations of the multiple binding modes which may be relevant. This problem is important in its own right, but is even more timely given the recent success of alchemical free energy calculations. Alchemical calculations are increasingly used to predict binding free energies of ligands to receptors. However, the accuracy of these calculations is dependent on proper sampling of the relevant ligand binding modes. Unfortunately, ligand binding modes may often be uncertain, hard to predict, and/or slow to interconvert on simulation time scales, so proper sampling with current techniques can require prohibitively long simulations. We need new methods which dramatically improve sampling of ligand binding modes. Here, we develop and apply a nonequilibrium candidate Monte Carlo (NCMC) method to improve sampling of ligand binding modes. In this technique, the ligand is rotated and subsequently allowed to relax in its new position through alchemical perturbation before accepting or rejecting the rotation and relaxation as a nonequilibrium Monte Carlo move. When applied to a T4 lysozyme model binding system, this NCMC method shows over 2 orders of magnitude improvement in binding mode sampling efficiency compared to a brute force molecular dynamics simulation. This is a first step toward applying this methodology to pharmaceutically relevant binding of fragments and, eventually, drug-like molecules. We are making this approach available via our new Binding modes of ligands using enhanced sampling (BLUES) package which is freely available on GitHub.

Citations

_images/ct-2018-01018f_0014.jpeg

Enhancing Side Chain Rotamer Sampling Using Nonequilibrium Candidate Monte Carlo

Kalistyn H. Burley, Samuel C. Gill, Nathan M. Lim, and David L. Mobley

Journal of Chemical Theory and Computation. 2019 15 (3), 1848-1862

DOI: 10.1021/acs.jctc.8b01018 <https://pubs.acs.org/doi/abs/10.1021/acs.jctc.8b01018>

Publication Date (Web): January 24, 2019

Abstract

Molecular simulations are a valuable tool for studying biomolecular motions and thermodynamics. However, such motions can be slow compared to simulation time scales, yet critical. Specifically, adequate sampling of side chain motions in protein binding pockets is crucial for obtaining accurate estimates of ligand binding free energies from molecular simulations. The time scale of side chain rotamer flips can range from a few ps to several hundred ns or longer, particularly in crowded environments like the interior of proteins. Here, we apply a mixed nonequilibrium candidate Monte Carlo (NCMC)/molecular dynamics (MD) method to enhance sampling of side chain rotamers. The NCMC portion of our method applies a switching protocol wherein the steric and electrostatic interactions between target side chain atoms and the surrounding environment are cycled off and then back on during the course of a move proposal. Between NCMC move proposals, simulation of the system continues via traditional molecular dynamics. Here, we first validate this approach on a simple, solvated valine-alanine dipeptide system and then apply it to a well-studied model ligand binding site in T4 lysozyme L99A. We compute the rate of rotamer transitions for a valine side chain using our approach and compare it to that of traditional molecular dynamics simulations. Here, we show that our NCMC/MD method substantially enhances side chain sampling, especially in systems where the torsional barrier to rotation is high (≥10 kcal/mol). These barriers can be intrinsic torsional barriers or steric barriers imposed by the environment. Overall, this may provide a promising strategy to selectively improve side chain sampling in molecular simulations.

Installation

BLUES is compatible with MacOSX/Linux with Python>=3.6 (blues<1.1 still works with Python 2.7)

This is a python tool kit with a few dependencies. We recommend installing miniconda. Then you can create an environment with the following commands:

conda create -n blues python=3.6
source activate blues

Stable Releases

The recommended way to install BLUES would be to install from conda.

conda install -c mobleylab blues

Development Builds

Alternatively, you can install the latest development build. Development builds contain the latest commits/PRs not yet issued in a point release.

conda install -c mobleylab/label/dev blues

In order to use the SideChainMove class you will need OpenEye Toolkits and some related tools.

conda install -c openeye/label/Orion -c omnia oeommtools packmol
conda install -c openeye openeye-toolkits

Source Installation

Although we do NOT recommend it, you can also install directly from the source code.

git clone https://github.com/MobleyLab/blues.git
conda install -c omnia -c conda-forge openmmtools openmm numpy cython
pip install -e .

To validate your BLUES installation run the tests.

pip instal -e .[tests]
pytest -v -s

Usage

This package takes advantage of non-equilibrium candidate Monte Carlo moves (NCMC) to help sample between different ligand binding modes using the OpenMM simulation package. One goal for this package is to allow for easy additions of other moves of interest, which will be covered below.

The integrator from BLUES contains the framework necessary for NCMC. Specifically, the integrator class calculates the work done during a NCMC move. It also controls the lambda scaling of parameters. The integrator that BLUES uses inherits from openmmtools.integrators.AlchemicalExternalLangevinIntegrator to keep track of the work done outside integration steps, allowing Monte Carlo (MC) moves to be incorporated together with the NCMC thermodynamic perturbation protocol. Currently, the openmmtools.alchemy package is used to generate the lambda parameters for the ligand, allowing alchemical modification of the sterics and electrostatics of the system.

The BLUESSampler class in ncmc.py serves as a wrapper for running NCMC+MD simulations. To run the hybrid simulation, the BLUESSampler class requires defining two moves for running the (1) MD simulation and (2) the NCMC protcol. These moves are defined in the ncmc.py module. A simple example is provided below.

Example

Using the BLUES framework requires the use of a ThermodynamicState and SamplerState from openmmtools which we import from openmmtools.states:

from openmmtools.states import ThermodynamicState, SamplerState
from openmmtools.testsystems import TolueneVacuum
from blues.ncmc import *
from simtk import unit

Create the states for a toluene molecule in vacuum.

tol = TolueneVacuum()
thermodynamic_state = ThermodynamicState(tol.system, temperature=300*unit.kelvin)
sampler_state = SamplerState(positions=tol.positions)

Define our langevin dynamics move for the MD simulation portion and then our NCMC move which performs a random rotation. Here, we use a customized openmmtools.mcmc.LangevinDynamicsMove which allows us to store information from the MD simulation portion.

dynamics_move = ReportLangevinDynamicsMove(n_steps=10)
ncmc_move = RandomLigandRotationMove(n_steps=10, atom_subset=list(range(15)))

Provide the BLUESSampler class with an openmm.Topology and these objects to run the NCMC+MD simulation.

sampler = BLUESSampler(thermodynamic_state=thermodynamic_state,
                      sampler_state=sampler_state,
                      dynamics_move=dynamics_move,
                      ncmc_move=ncmc_move,
                      topology=tol.topology)
sampler.run(n_iterations=1)

Let us know if you have any problems or suggestions through our Issue tracker.

Modules

Nonequilibrium Candidate Monte Carlo (NCMC)

Provides moves and classes for running the BLUES simulation.

ReportLangevinDynamicsMove

class blues.ncmc.ReportLangevinDynamicsMove(n_steps=1000, timestep=Quantity(value=2.0, unit=femtosecond), collision_rate=Quantity(value=1.0, unit=/picosecond), reassign_velocities=True, context_cache=None, reporters=[])[source]

Langevin dynamics segment as a (pseudo) Monte Carlo move.

This move class allows the attachment of a reporter for storing the data from running this segment of dynamics. This move assigns a velocity from the Maxwell-Boltzmann distribution and executes a number of Maxwell-Boltzmann steps to propagate dynamics. This is not a true Monte Carlo move, in that the generation of the correct distribution is only exact in the limit of infinitely small timestep; in other words, the discretization error is assumed to be negligible. Use HybridMonteCarloMove instead to ensure the exact distribution is generated.

Warning

No Metropolization is used to ensure the correct phase space distribution is sampled. This means that timestep-dependent errors will remain uncorrected, and are amplified with larger timesteps. Use this move at your own risk!

Parameters
  • n_steps (int, optional) – The number of integration timesteps to take each time the move is applied (default is 1000).

  • timestep (simtk.unit.Quantity, optional) – The timestep to use for Langevin integration (time units, default is 1*simtk.unit.femtosecond).

  • collision_rate (simtk.unit.Quantity, optional) – The collision rate with fictitious bath particles (1/time units, default is 10/simtk.unit.picoseconds).

  • reassign_velocities (bool, optional) – If True, the velocities will be reassigned from the Maxwell-Boltzmann distribution at the beginning of the move (default is False).

  • context_cache (openmmtools.cache.ContextCache, optional) – The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used (default is None).

  • reporters (list) – A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

n_steps

The number of integration timesteps to take each time the move is applied.

Type

int

timestep

The timestep to use for Langevin integration (time units).

Type

simtk.unit.Quantity

collision_rate

The collision rate with fictitious bath particles (1/time units).

Type

simtk.unit.Quantity

reassign_velocities

If True, the velocities will be reassigned from the Maxwell-Boltzmann distribution at the beginning of the move.

Type

bool

context_cache

The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used.

Type

openmmtools.cache.ContextCache

reporters

A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

Type

list

Examples

First we need to create the thermodynamic state and the sampler state to propagate. Here we create an alanine dipeptide system in vacuum.

>>> from simtk import unit
>>> from openmmtools import testsystems
>>> from openmmtools.states import SamplerState, ThermodynamicState
>>> test = testsystems.AlanineDipeptideVacuum()
>>> sampler_state = SamplerState(positions=test.positions)
>>> thermodynamic_state = ThermodynamicState(system=test.system, temperature=298*unit.kelvin)

Create reporters for storing our simulation data.

>>> from blues.storage import NetCDF4Storage, BLUESStateDataStorage
nc_storage = NetCDF4Storage('test-md.nc',
                            reportInterval=5,
                            crds=True, vels=True, frcs=True)
state_storage = BLUESStateDataStorage('test.log',
                                      reportInterval=5,
                                      step=True, time=True,
                                      potentialEnergy=True,
                                      kineticEnergy=True,
                                      totalEnergy=True,
                                      temperature=True,
                                      volume=True,
                                      density=True,
                                      progress=True,
                                      remainingTime=True,
                                      speed=True,
                                      elapsedTime=True,
                                      systemMass=True,
                                      totalSteps=10)

Create a Langevin move with default parameters

>>> move = ReportLangevinDynamicsMove()

or create a Langevin move with specified parameters.

>>> move = ReportLangevinDynamicsMove(timestep=0.5*unit.femtoseconds,
                                      collision_rate=20.0/unit.picoseconds, n_steps=10,
                                      reporters=[nc_storage, state_storage])

Perform one update of the sampler state. The sampler state is updated with the new state.

>>> move.apply(thermodynamic_state, sampler_state)
>>> np.allclose(sampler_state.positions, test.positions)
False

The same move can be applied to a different state, here an ideal gas.

>>> test = testsystems.IdealGas()
>>> sampler_state = SamplerState(positions=test.positions)
>>> thermodynamic_state = ThermodynamicState(system=test.system,
...                                          temperature=298*unit.kelvin)
>>> move.apply(thermodynamic_state, sampler_state)
>>> np.allclose(sampler_state.positions, test.positions)
False
apply(thermodynamic_state, sampler_state)[source]

Propagate the state through the integrator.

This updates the SamplerState after the integration.

Parameters
  • thermodynamic_state (openmmtools.states.ThermodynamicState) – The thermodynamic state to use to propagate dynamics.

  • sampler_state (openmmtools.states.SamplerState) – The sampler state to apply the move to. This is modified.

NCMCMove

class blues.ncmc.NCMCMove(n_steps=1000, timestep=Quantity(value=2.0, unit=femtosecond), atom_subset=None, context_cache=None, nprop=1, propLambda=0.3, reporters=[])[source]

A general NCMC move that applies an alchemical integrator.

This class is intended to be inherited by NCMCMoves that need to alchemically modify and perturb part of the system. The child class has to implement the _propose_positions method. Reporters can be attached to report data from the NCMC part of the simulation.

You can decide to override _before_integration() and _after_integration() to execute some code at specific points of the workflow, for example to read data from the Context before the it is destroyed.

Parameters
  • n_steps (int, optional) – The number of integration timesteps to take each time the move is applied (default is 1000).

  • timestep (simtk.unit.Quantity, optional) – The timestep to use for Langevin integration (time units, default is 1*simtk.unit.femtosecond).

  • atom_subset (slice or list of int, optional) – If specified, the move is applied only to those atoms specified by these indices. If None, the move is applied to all atoms (default is None).

  • context_cache (openmmtools.cache.ContextCache, optional) – The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used (default is None).

  • reporters (list) – A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

n_steps

The number of integration timesteps to take each time the move is applied.

Type

int

timestep

The timestep to use for Langevin integration (time units).

Type

simtk.unit.Quantity

atom_subset

If specified, the move is applied only to those atoms specified by these indices. If None, the move is applied to all atoms (default is None).

Type

slice or list of int, optional

context_cache

The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used.

Type

openmmtools.cache.ContextCache

reporters

A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

Type

list

property statistics

Statistics as a dictionary.

apply(thermodynamic_state, sampler_state)[source]

Apply a move to the sampler state.

Parameters
  • thermodynamic_state (openmmtools.states.ThermodynamicState) – The thermodynamic state to use to apply the move.

  • sampler_state (openmmtools.states.SamplerState) – The initial sampler state to apply the move to. This is modified.

RandomLigandRotationMove

class blues.ncmc.RandomLigandRotationMove(n_steps=1000, timestep=Quantity(value=2.0, unit=femtosecond), atom_subset=None, context_cache=None, nprop=1, propLambda=0.3, reporters=[])[source]

An NCMC move which proposes random rotations.

This class will propose a random rotation (as a rigid body) using the center of mass of the selected atoms. This class does not metropolize the proposed moves. Reporters can be attached to record the ncmc simulation data, mostly useful for debugging by storing coordinates of the proposed moves or monitoring the ncmc simulation progression by attaching a state reporter.

Parameters
  • n_steps (int, optional) – The number of integration timesteps to take each time the move is applied (default is 1000).

  • timestep (simtk.unit.Quantity, optional) – The timestep to use for Langevin integration (time units, default is 1*simtk.unit.femtosecond).

  • atom_subset (slice or list of int, optional) – If specified, the move is applied only to those atoms specified by these indices. If None, the move is applied to all atoms (default is None).

  • context_cache (openmmtools.cache.ContextCache, optional) – The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used (default is None).

  • reporters (list) – A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

n_steps

The number of integration timesteps to take each time the move is applied.

Type

int

timestep

The timestep to use for Langevin integration (time units).

Type

simtk.unit.Quantity

atom_subset

If specified, the move is applied only to those atoms specified by these indices. If None, the move is applied to all atoms (default is None).

Type

slice or list of int, optional

context_cache

The ContextCache to use for Context creation. If None, the global cache openmmtools.cache.global_context_cache is used.

Type

openmmtools.cache.ContextCache

reporters

A list of the storage classes inteded for reporting the simulation data. This can be either blues.storage.(NetCDF4Storage/BLUESStateDataStorage).

Type

list

Examples

First we need to create the thermodynamic state, alchemical thermodynamic state, and the sampler state to propagate. Here we create a toy system of a charged ethylene molecule in between two charged particles.

>>> from simtk import unit
>>> from openmmtools import testsystems, alchemy
>>> from openmmtools.states import SamplerState, ThermodynamicState
>>> from blues.systemfactories import generateAlchSystem
>>> from blues import utils
>>> structure_pdb = utils.get_data_filename('blues', 'tests/data/ethylene_structure.pdb')
>>> structure = parmed.load_file(structure_pdb)
>>> system_xml = utils.get_data_filename('blues', 'tests/data/ethylene_system.xml')
    with open(system_xml, 'r') as infile:
        xml = infile.read()
        system = openmm.XmlSerializer.deserialize(xml)
>>> thermodynamic_state = ThermodynamicState(system=system, temperature=200*unit.kelvin)
>>> sampler_state = SamplerState(positions=structure.positions.in_units_of(unit.nanometers))
>>> alchemical_atoms = [2, 3, 4, 5, 6, 7]
>>> alch_system = generateAlchSystem(thermodynamic_state.get_system(), alchemical_atoms)
>>> alch_state = alchemy.AlchemicalState.from_system(alch_system)
>>> alch_thermodynamic_state = ThermodynamicState(
        alch_system, thermodynamic_state.temperature)
>>> alch_thermodynamic_state = CompoundThermodynamicState(
        alch_thermodynamic_state, composable_states=[alch_state])

Create reporters for storing our ncmc simulation data.

>>> from blues.storage import NetCDF4Storage, BLUESStateDataStorage
nc_storage = NetCDF4Storage('test-ncmc.nc',
                            reportInterval=5,
                            crds=True, vels=True, frcs=True,
                            protocolWork=True, alchemicalLambda=True)
state_storage = BLUESStateDataStorage('test-ncmc.log',
                                      reportInterval=5,
                                      step=True, time=True,
                                      potentialEnergy=True,
                                      kineticEnergy=True,
                                      totalEnergy=True,
                                      temperature=True,
                                      volume=True,
                                      density=True,
                                      progress=True,
                                      remainingTime=True,
                                      speed=True,
                                      elapsedTime=True,
                                      systemMass=True,
                                      totalSteps=10,
                                      protocolWork=True,
                                      alchemicalLambda=True)

Create a RandomLigandRotationMove move

>>> rot_move = RandomLigandRotationMove(n_steps=5,
                                        timestep=1*unit.femtoseconds,
                                        atom_subset=alchemical_atoms,
                                        reporters=[nc_storage, state_storage])

Perform one update of the sampler state. The sampler state is updated with the new state.

>>> move.apply(thermodynamic_state, sampler_state)
>>> np.allclose(sampler_state.positions, structure.positions)
False

BLUESSampler

class blues.ncmc.BLUESSampler(thermodynamic_state=None, alch_thermodynamic_state=None, sampler_state=None, dynamics_move=None, ncmc_move=None, topology=None)[source]

BLUESSampler runs the NCMC+MD hybrid simulation.

This class ties together the two moves classes to execute the NCMC+MD hybrid simulation. One move class is intended to carry out traditional MD and the other is intended carry out the NCMC move proposals which performs the alchemical transformation to given atom subset. This class handles proper metropolization of the NCMC move proposals, while correcting for the switch in integrators.

equil(n_iterations=1)[source]

Equilibrate the system for N iterations.

run(n_iterations=1)[source]

Run the sampler for the specified number of iterations.

descriptive summary here

Parameters

niterations (int, optional, default=1) – Number of iterations to run the sampler for.

SystemFactory

SystemFactory contains methods to generate/modify the OpenMM System object.

blues.systemfactory.generateAlchSystem(system, atom_indices, softcore_alpha=0.5, softcore_a=1, softcore_b=1, softcore_c=6, softcore_beta=0.0, softcore_d=1, softcore_e=1, softcore_f=2, annihilate_electrostatics=True, annihilate_sterics=False, disable_alchemical_dispersion_correction=True, alchemical_pme_treatment='direct-space', suppress_warnings=True, **kwargs)[source]

Return the OpenMM System for alchemical perturbations.

This function calls openmmtools.alchemy.AbsoluteAlchemicalFactory and openmmtools.alchemy.AlchemicalRegion to generate the System for the NCMC simulation.

Parameters
  • system (openmm.System) – The OpenMM System object corresponding to the reference system.

  • atom_indices (list of int) – Atom indicies of the move or designated for which the nonbonded forces (both sterics and electrostatics components) have to be alchemically modified.

  • annihilate_electrostatics (bool, optional) – If True, electrostatics should be annihilated, rather than decoupled (default is True).

  • annihilate_sterics (bool, optional) – If True, sterics (Lennard-Jones or Halgren potential) will be annihilated, rather than decoupled (default is False).

  • softcore_alpha (float, optional) – Alchemical softcore parameter for Lennard-Jones (default is 0.5).

  • softcore_a, softcore_b, softcore_c (float, optional) – Parameters modifying softcore Lennard-Jones form. Introduced in Eq. 13 of Ref. [TTPham-JChemPhys135-2011] (default is 1).

  • softcore_beta (float, optional) – Alchemical softcore parameter for electrostatics. Set this to zero to recover standard electrostatic scaling (default is 0.0).

  • softcore_d, softcore_e, softcore_f (float, optional) – Parameters modifying softcore electrostatics form (default is 1).

  • disable_alchemical_dispersion_correction (bool, optional, default=True) – If True, the long-range dispersion correction will not be included for the alchemical region to avoid the need to recompute the correction (a CPU operation that takes ~ 0.5 s) every time ‘lambda_sterics’ is changed. If using nonequilibrium protocols, it is recommended that this be set to True since this can lead to enormous (100x) slowdowns if the correction must be recomputed every time step.

  • alchemical_pme_treatment (str, optional, default = ‘direct-space’) – Controls how alchemical region electrostatics are treated when PME is used. Options are ‘direct-space’, ‘coulomb’, ‘exact’. - ‘direct-space’ only models the direct space contribution - ‘coulomb’ includes switched Coulomb interaction - ‘exact’ includes also the reciprocal space contribution, but it’s only possible to annihilate the charges and the softcore parameters controlling the electrostatics are deactivated. Also, with this method, modifying the global variable lambda_electrostatics is not sufficient to control the charges. The recommended way to change them is through the AlchemicalState class.

Returns

alch_system (alchemical_system) – System to be used for the NCMC simulation.

References

TTPham-JChemPhys135-2011
    1. Pham and M. R. Shirts,

  1. Chem. Phys 135, 034114 (2011). http://dx.doi.org/10.1063/1.3607597

blues.systemfactory.zero_masses(system, atomList=None)[source]

Zeroes the masses of specified atoms to constrain certain degrees of freedom.

Parameters
  • system (openmm.System) – system to zero masses

  • atomList (list of ints) – atom indicies to zero masses

Returns

system (openmm.System) – The modified system with massless atoms.

blues.systemfactory.restrain_positions(structure, system, selection='(@CA, C, N)', weight=5.0, **kwargs)[source]

Apply positional restraints to atoms in the openmm.System by the given parmed selection [amber-syntax].

Parameters
  • system (openmm.System) – The OpenMM System object to be modified.

  • structure (parmed.Structure()) – Structure of the system, used for atom selection.

  • selection (str, Default = “(@CA,C,N)”) – AmberMask selection to apply positional restraints to

  • weight (float, Default = 5.0) – Restraint weight for xyz atom restraints in kcal/(mol A^2)

Returns

system (openmm.System) – Modified with positional restraints applied.

blues.systemfactory.freeze_atoms(structure, system, freeze_selection=':LIG', **kwargs)[source]

Zero the masses of atoms from the given parmed selection [amber-syntax].

Massless atoms will be ignored by the integrator and will not change positions.

Parameters
  • system (openmm.System) – The OpenMM System object to be modified.

  • structure (parmed.Structure()) – Structure of the system, used for atom selection.

  • freeze_selection (str, Default = “:LIG”) – AmberMask selection for the center in which to select atoms for zeroing their masses. Defaults to freezing protein backbone atoms.

Returns

system (openmm.System) – The modified system with the selected atoms

blues.systemfactory.freeze_radius(structure, system, freeze_distance=Quantity(value=5.0, unit=angstrom), freeze_center=':LIG', freeze_solvent=':HOH, NA, CL', **kwargs)[source]

Zero the masses of atoms outside the given raidus of the freeze_center parmed selection [amber-syntax].

Massless atoms will be ignored by the integrator and will not change positions. This is intended to freeze the solvent and protein atoms around the ligand binding site.

Parameters
  • system (openmm.System) – The OpenMM System object to be modified.

  • structure (parmed.Structure()) – Structure of the system, used for atom selection.

  • freeze_distance (float, Default = 5.0) – Distance (angstroms) to select atoms for retaining their masses. Atoms outside the set distance will have their masses set to 0.0.

  • freeze_center (str, Default = “:LIG”) – AmberMask selection for the center in which to select atoms for zeroing their masses. Default: LIG

  • freeze_solvent (str, Default = “:HOH,NA,CL”) – AmberMask selection in which to select solvent atoms for zeroing their masses.

Returns

system (openmm.System) – Modified system with masses outside the freeze center zeroed.

blues.systemfactory.addBarostat(system, temperature=Quantity(value=300, unit=kelvin), pressure=Quantity(value=1, unit=atmosphere), frequency=25, **kwargs)[source]

Add a MonteCarloBarostat to the MD system.

Parameters
  • system (openmm.System) – The OpenMM System object corresponding to the reference system.

  • temperature (float, default=300) – temperature (Kelvin) to be simulated at.

  • pressure (int, configional, default=None) – Pressure (atm) for Barostat for NPT simulations.

  • frequency (int, default=25) – Frequency at which Monte Carlo pressure changes should be attempted (in time steps)

Returns

system (openmm.System) – The OpenMM System with the MonteCarloBarostat attached.

Integrators

class blues.integrators.AlchemicalExternalLangevinIntegrator(alchemical_functions, splitting='R V O H O V R', temperature=Quantity(value=298.0, unit=kelvin), collision_rate=Quantity(value=1.0, unit=/picosecond), timestep=Quantity(value=1.0, unit=femtosecond), constraint_tolerance=1e-08, measure_shadow_work=False, measure_heat=True, nsteps_neq=100, nprop=1, propLambda=0.3, *args, **kwargs)[source]

Allows nonequilibrium switching based on force parameters specified in alchemical_functions. A variable named lambda is switched from 0 to 1 linearly throughout the nsteps of the protocol. The functions can use this to create more complex protocols for other global parameters.

As opposed to openmmtools.integrators.AlchemicalNonequilibriumLangevinIntegrator, which this inherits from, the AlchemicalExternalLangevinIntegrator integrator also takes into account work done outside the nonequilibrium switching portion(between integration steps). For example if a molecule is rotated between integration steps, this integrator would correctly account for the work caused by that rotation.

Propagator is based on Langevin splitting, as described below. One way to divide the Langevin system is into three parts which can each be solved “exactly:”

  • R: Linear “drift” / Constrained “drift”

    Deterministic update of positions, using current velocities x <- x + v dt

  • V: Linear “kick” / Constrained “kick”

    Deterministic update of velocities, using current forces v <- v + (f/m) dt; where f = force, m = mass

  • O: Ornstein-Uhlenbeck

    Stochastic update of velocities, simulating interaction with a heat bath v <- av + b sqrt(kT/m) R where:

    • a = e^(-gamma dt)

    • b = sqrt(1 - e^(-2gamma dt))

    • R is i.i.d. standard normal

We can then construct integrators by solving each part for a certain timestep in sequence. (We can further split up the V step by force group, evaluating cheap but fast-fluctuating forces more frequently than expensive but slow-fluctuating forces. Since forces are only evaluated in the V step, we represent this by including in our “alphabet” V0, V1, …) When the system contains holonomic constraints, these steps are confined to the constraint manifold.

Parameters
  • alchemical_functions (dict of strings) – key: value pairs such as “global_parameter” : function_of_lambda where function_of_lambda is a Lepton-compatible string that depends on the variable “lambda”

  • splitting (string, default: “H V R O V R H”) – Sequence of R, V, O (and optionally V{i}), and { }substeps to be executed each timestep. There is also an H option, which increments the global parameter lambda by 1/nsteps_neq for each step. Forces are only used in V-step. Handle multiple force groups by appending the force group index to V-steps, e.g. “V0” will only use forces from force group 0. “V” will perform a step using all forces.( will cause metropolization, and must be followed later by a ).

  • temperature (numpy.unit.Quantity compatible with kelvin, default: 298.0*simtk.unit.kelvin) – Fictitious “bath” temperature

  • collision_rate (numpy.unit.Quantity compatible with 1/picoseconds, default: 91.0/simtk.unit.picoseconds) – Collision rate

  • timestep (numpy.unit.Quantity compatible with femtoseconds, default: 1.0*simtk.unit.femtoseconds) – Integration timestep

  • constraint_tolerance (float, default: 1.0e-8) – Tolerance for constraint solver

  • measure_shadow_work (boolean, default: False) – Accumulate the shadow work performed by the symplectic substeps, in the global shadow_work

  • measure_heat (boolean, default: True) – Accumulate the heat exchanged with the bath in each step, in the global heat

  • nsteps_neq (int, default: 100) – Number of steps in nonequilibrium protocol. Default 100

  • prop_lambda (float (Default = 0.3)) – Defines the region in which to add extra propagation steps during the NCMC simulation from the midpoint 0.5. i.e. A value of 0.3 will add extra steps from lambda 0.2 to 0.8.

  • nprop (int (Default: 1)) – Controls the number of propagation steps to add in the lambda region defined by prop_lambda.

_kinetic_energy

This is 0.5*m*v*v by default, and is the expression used for the kinetic energy

Type

str

Examples

  • g-BAOAB:

    splitting=”R V O H O V R”

  • VVVR

    splitting=”O V R H R V O”

  • VV

    splitting=”V R H R V”

  • An NCMC algorithm with Metropolized integrator:

    splitting=”O { V R H R V } O”

References

[Nilmeier, et al. 2011] Nonequilibrium candidate Monte Carlo is an efficient tool for equilibrium simulation

[Leimkuhler and Matthews, 2015] Molecular dynamics: with deterministic and stochastic numerical methods, Chapter 7

reset()[source]

Manually reset protocol work and other statistics.

Utilities

Provides a host of utility functions for the BLUES engine.

Authors: Samuel C. Gill Contributors: Nathan M. Lim, David L. Mobley

blues.utils.logger = <Logger blues.utils (WARNING)>
blues.utils.amber_selection_to_atomidx(structure, selection)[source]

Converts AmberMask selection [amber-syntax] to list of atom indices.

Parameters
  • structure (parmed.Structure()) – Structure of the system, used for atom selection.

  • selection (str) – AmberMask selection that gets converted to a list of atom indices.

Returns

mask_idx (list of int) – List of atom indices.

References

amber-syntax(1,2,3,4)
  1. Swails, ParmEd Documentation (2015). http://parmed.github.io/ParmEd/html/amber.html#amber-mask-syntax

blues.utils.check_amber_selection(structure, selection)[source]

Given a AmberMask selection (str) for selecting atoms to freeze or restrain, check if it will actually select atoms. If the selection produces None, suggest valid residues or atoms.

Parameters
  • structure (parmed.Structure) – The structure of the simulated system

  • selection (str) – The selection string uses Amber selection syntax to select atoms to be restrained/frozen during simulation.

  • logger (logging.Logger) – Records information or streams to terminal.

blues.utils.atomidx_to_atomlist(structure, mask_idx)[source]

Goes through the structure and matches the previously selected atom indices to the atom type.

Parameters
  • structure (parmed.Structure()) – Structure of the system, used for atom selection.

  • mask_idx (list of int) – List of atom indices.

Returns

atom_list (list of atoms) – The atoms that were previously selected in mask_idx.

blues.utils.parse_unit_quantity(unit_quantity_str)[source]

Utility for parsing parameters from the YAML file that require units.

Parameters

unit_quantity_str (str) – A string specifying a quantity and it’s units. i.e. ‘3.024 * daltons’

Returns

unit_quantity (simtk.unit.Quantity) – i.e unit.Quantity(3.024, unit=dalton)

blues.utils.atomIndexfromTop(resname, topology)[source]

Get atom indices of a ligand from OpenMM Topology.

Parameters
  • resname (str) – resname that you want to get the atom indicies for (ex. ‘LIG’)

  • topology (str, optional, default=None) – path of topology file. Include if the topology is not included in the coord_file

Returns

lig_atoms (list of ints) – list of atoms in the coordinate file matching lig_resname

blues.utils.getMasses(atom_subset, topology)[source]

Returns a list of masses of the specified ligand atoms.

Parameters

topology (parmed.Topology) – ParmEd topology object containing atoms of the system.

Returns

  • masses (1xn numpy.array * simtk.unit.dalton) – array of masses of len(self.atom_indices), denoting the masses of the atoms in self.atom_indices

  • totalmass (float * simtk.unit.dalton) – The sum of the mass found in masses

blues.utils.getCenterOfMass(positions, masses)[source]

Returns the calculated center of mass of the ligand as a numpy.array

Parameters
  • positions (nx3 numpy array * simtk.unit compatible with simtk.unit.nanometers) – ParmEd positions of the atoms to be moved.

  • masses (numpy.array) – numpy.array of particle masses

Returns

center_of_mass (numpy array * simtk.unit compatible with simtk.unit.nanometers) – 1x3 numpy.array of the center of mass of the given positions

blues.utils.saveContextFrame(context, topology, outfname)[source]

Extracts a ParmEd structure and writes the frame given an OpenMM Simulation object.

Parameters
  • simulation (openmm.Simulation) – The OpenMM Simulation to write a frame from.

  • outfname (str) – The output file name to save the simulation frame from. Supported extensions:

    • PDB (.pdb, pdb)

    • PDBx/mmCIF (.cif, cif)

    • PQR (.pqr, pqr)

    • Amber topology file (.prmtop/.parm7, amber)

    • CHARMM PSF file (.psf, psf)

    • CHARMM coordinate file (.crd, charmmcrd)

    • Gromacs topology file (.top, gromacs)

    • Gromacs GRO file (.gro, gro)

    • Mol2 file (.mol2, mol2)

    • Mol3 file (.mol3, mol3)

    • Amber ASCII restart (.rst7/.inpcrd/.restrt, rst7)

    • Amber NetCDF restart (.ncrst, ncrst)

blues.utils.print_host_info(context)[source]

Prints hardware related information for the openmm.Simulation

Parameters

simulation (openmm.Simulation) – The OpenMM Simulation to write a frame from.

blues.utils.get_data_filename(package_root, relative_path)[source]

Get the full path to one of the reference files in testsystems. In the source distribution, these files are in blues/data/, but on installation, they’re moved to somewhere in the user’s python site-packages directory. Adapted from: https://github.com/open-forcefield-group/smarty/blob/master/smarty/utils.py

Parameters
  • package_root (str) – Name of the included/installed python package

  • relative_path (str) – Path to the file within the python package

Returns

fn (str) – Full path to file

Storage

blues.storage.setup_logging(filename=None, yml_path='logging.yml', default_level=20, env_key='LOG_CFG')[source]

Setup logging configuration

blues.storage.addLoggingLevel(levelName, levelNum, methodName=None)[source]

Comprehensively adds a new logging level to the logging module and the currently configured logging class.

levelName becomes an attribute of the logging module with the value levelNum. methodName becomes a convenience method for both logging itself and the class returned by logging.getLoggerClass() (usually just logging.Logger). If methodName is not specified, levelName.lower() is used.

To avoid accidental clobberings of existing attributes, this method will raise an AttributeError if the level name is already an attribute of the logging module or if the method name is already present

Parameters
  • levelName (str) – The new level name to be added to the logging module.

  • levelNum (int) – The level number indicated for the logging module.

  • methodName (str, default=None) – The method to call on the logging module for the new level name. For example if provided ‘trace’, you would call logging.trace().

Example

>>> addLoggingLevel('TRACE', logging.DEBUG - 5)
>>> logging.getLogger(__name__).setLevel("TRACE")
>>> logging.getLogger(__name__).trace('that worked')
>>> logging.trace('so did this')
>>> logging.TRACE
5
blues.storage.init_logger(logger, level=20, stream=True, outfname='blues-20190618-164116')[source]

Initialize the Logger module with the given logger_level and outfname.

Parameters
  • logger (logging.getLogger()) – The root logger object if it has been created already.

  • level (logging.<LEVEL>) – Valid options for <LEVEL> would be DEBUG, INFO, warningING, ERROR, CRITICAL.

  • stream (bool, default = True) – If True, the logger will also stream information to sys.stdout as well as the output file.

  • outfname (str, default = time.strftime(“blues-%Y%m%d-%H%M%S”)) – The output file path prefix to store the logged data. This will always write to a file with the extension .log.

Returns

logger (logging.getLogger()) – The logging object with additional Handlers added.

class blues.storage.NetCDF4Storage(file, reportInterval=1, frame_indices=[], crds=True, vels=False, frcs=False, protocolWork=False, alchemicalLambda=False)[source]

Class to read or write NetCDF trajectory files Inherited from parmed.openmm.reporters.NetCDFReporter

Parameters
  • file (str) – Name of the file to write the trajectory to

  • reportInterval (int) – How frequently to write a frame to the trajectory

  • frame_indices (list, frame numbers for writing the trajectory) – If this reporter is used for the NCMC simulation, 0.5 will report at the moveStep and -1 will record at the last frame.

  • crds (bool=True) – Should we write coordinates to this trajectory? (Default True)

  • vels (bool=False) – Should we write velocities to this trajectory? (Default False)

  • frcs (bool=False) – Should we write forces to this trajectory? (Default False)

  • protocolWork (bool=False,) – Write the protocolWork for the alchemical process in the NCMC simulation

  • alchemicalLambda (bool=False,) – Write the alchemicalLambda step for the alchemical process in the NCMC simulation.

describeNextReport(context_state)[source]

Get information about the next report this object will generate.

Parameters

context_state (openmm.State) – The current state of the context

Returns

nsteps, pos, vel, frc, ene (int, bool, bool, bool, bool) – nsteps is the number of steps until the next report pos, vel, frc, and ene are flags indicating whether positions, velocities, forces, and/or energies are needed from the Context

report(context_state, integrator)[source]

Generate a report.

Parameters
  • context_state (openmm.State) – The current state of the context

  • integrator (openmm.Integrator) – The integrator belonging to the given context

class blues.storage.BLUESStateDataStorage(file=None, reportInterval=1, frame_indices=[], title='', step=False, time=False, potentialEnergy=False, kineticEnergy=False, totalEnergy=False, temperature=False, volume=False, density=False, progress=False, remainingTime=False, speed=False, elapsedTime=False, separator='t', systemMass=None, totalSteps=None, protocolWork=False, alchemicalLambda=False, currentIter=False)[source]

StateDataReporter outputs information about a simulation, such as energy and temperature, to a file. To use it, create a StateDataReporter, then add it to the Simulation’s list of reporters. The set of data to write is configurable using boolean flags passed to the constructor. By default the data is written in comma-separated-value (CSV) format, but you can specify a different separator to use. Inherited from openmm.app.StateDataReporter

Parameters
  • file (string or file) – The file to write to, specified as a file name or file-like object (Logger)

  • reportInterval (int) – The interval (in time steps) at which to write frames

  • frame_indices (list, frame numbers for writing the trajectory)

  • title (str,) – Text prefix for each line of the report. Used to distinguish between the NCMC and MD simulation reports.

  • step (bool=False) – Whether to write the current step index to the file

  • time (bool=False) – Whether to write the current time to the file

  • potentialEnergy (bool=False) – Whether to write the potential energy to the file

  • kineticEnergy (bool=False) – Whether to write the kinetic energy to the file

  • totalEnergy (bool=False) – Whether to write the total energy to the file

  • temperature (bool=False) – Whether to write the instantaneous temperature to the file

  • volume (bool=False) – Whether to write the periodic box volume to the file

  • density (bool=False) – Whether to write the system density to the file

  • progress (bool=False) – Whether to write current progress (percent completion) to the file. If this is True, you must also specify totalSteps.

  • remainingTime (bool=False) – Whether to write an estimate of the remaining clock time until completion to the file. If this is True, you must also specify totalSteps.

  • speed (bool=False) – Whether to write an estimate of the simulation speed in ns/day to the file

  • elapsedTime (bool=False) – Whether to write the elapsed time of the simulation in seconds to the file.

  • separator (string=’,’) – The separator to use between columns in the file

  • systemMass (mass=None) – The total mass to use for the system when reporting density. If this is None (the default), the system mass is computed by summing the masses of all particles. This parameter is useful when the particle masses do not reflect their actual physical mass, such as when some particles have had their masses set to 0 to immobilize them.

  • totalSteps (int=None) – The total number of steps that will be included in the simulation. This is required if either progress or remainingTime is set to True, and defines how many steps will indicate 100% completion.

  • protocolWork (bool=False,) – Write the protocolWork for the alchemical process in the NCMC simulation

  • alchemicalLambda (bool=False,) – Write the alchemicalLambda step for the alchemical process in the NCMC simulation.

describeNextReport(context_state)[source]

Get information about the next report this object will generate.

Parameters

context_state (openmm.State) – The current state of the context

Returns

nsteps, pos, vel, frc, ene (int, bool, bool, bool, bool) – nsteps is the number of steps until the next report pos, vel, frc, and ene are flags indicating whether positions, velocities, forces, and/or energies are needed from the Context

report(context_state, integrator)[source]

Generate a report.

Parameters
  • context_state (openmm.State) – The current state of the context

  • integrator (openmm.Integrator) – The integrator belonging to the given context

Formats

class blues.formats.LoggerFormatter[source]

Formats the output of the logger.Logger object. Allows customization for customized logging levels. This will add a custom level ‘REPORT’ to all custom BLUES reporters from the blues.reporters module.

Examples

Below we add a custom level ‘REPORT’ and have the logger module stream the message to sys.stdout without any additional information to our custom reporters from the blues.reporters module

>>> from blues import reporters
>>> from blues.formats import LoggerFormatter
>>> import logging, sys
>>> logger = logging.getLogger(__name__)
>>> reporters.addLoggingLevel('REPORT', logging.WARNING - 5)
>>> fmt = LoggerFormatter(fmt="%(message)s")
>>> stdout_handler = logging.StreamHandler(stream=sys.stdout)
>>> stdout_handler.setFormatter(fmt)
>>> logger.addHandler(stdout_handler)
>>> logger.report('This is a REPORT call')
    This is a REPORT call
>>> logger.info('This is an INFO call')
    INFO: This is an INFO call
format(record)[source]

Format the specified record as text.

The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime(), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.

class blues.formats.NetCDF4Traj(fname, mode='r')[source]

Extension of parmed.amber.netcdffiles.NetCDFTraj to allow proper file flushing. Requires the netcdf4 library (not scipy), install with conda install -c conda-forge netcdf4 .

Parameters
  • fname (str) – File name for the trajectory file

  • mode (str, default=’r’) – The mode to open the file in.

flush()[source]

Flush buffered data to disc.

classmethod open_new(fname, natom, box, crds=True, vels=False, frcs=False, remd=None, remd_dimension=None, title='', protocolWork=False, alchemicalLambda=False)[source]

Opens a new NetCDF file and sets the attributes

Parameters
  • fname (str) – Name of the new file to open (overwritten)

  • natom (int) – Number of atoms in the restart

  • box (bool) – Indicates if cell lengths and angles are written to the NetCDF file

  • crds (bool, default=True) – Indicates if coordinates are written to the NetCDF file

  • vels (bool, default=False) – Indicates if velocities are written to the NetCDF file

  • frcs (bool, default=False) – Indicates if forces are written to the NetCDF file

  • remd (str, default=None) – ‘T[emperature]’ if replica temperature is written ‘M[ulti]’ if Multi-D REMD information is written None if no REMD information is written

  • remd_dimension (int, default=None) – If remd above is ‘M[ulti]’, this is how many REMD dimensions exist

  • title (str, default=’‘) – The title of the NetCDF trajectory file

  • protocolWork (bool, default=False) – Indicates if protocolWork from the NCMC simulation should be written to the NetCDF file

  • alchemicalLambda (bool, default=False) – Indicates if alchemicalLambda from the NCMC simulation should be written to the NetCDF file

property protocolWork

Store the accumulated protocolWork from the NCMC simulation as property.

add_protocolWork(stuff)[source]

Adds the time to the current frame of the NetCDF file

Parameters

stuff (float or time-dimension Quantity) – The time to add to the current frame

property alchemicalLambda

Store the current alchemicalLambda (0->1.0) from the NCMC simulation as property.

add_alchemicalLambda(stuff)[source]

Adds the time to the current frame of the NetCDF file

Parameters

stuff (float or time-dimension Quantity) – The time to add to the current frame

Developer Guide

UML Diagram

_images/uml.png

OpenMMTools Objects

Highlighted in red are 3 objects that we use from the openmmtools library. They are the ThermodynamicState, CompoundThermodynamicState, and SamplerState objects. For more details of each class, please see the official openmmtools documentation.

Briefly, the ThermodynamicState class represents the portion of the state of an openmm.Context that does not change with integration (i.e. particles, temperature, or pressure). The CompoundThermodynamicState class is essentially the same as the ThermodynamicState class except in this package, it is used for the handling the openmmtools.alchemy.AlchemicalState object. Thus, in order to create the CompoundThermodynamicState, one needs to first create the plain ThermodynamicState object first. If a CompoundThermodynamicState object is not provided to the blues.ncmc.BLUESSampler class, one is created using the default parameters from the given ThermodynamicState. Lastly, the SamplerState class represents the state of an openmm.Context which does change with integration (i.e positions, velocities, and box_vectors). Within the context of this package, the SamplerState is used to sync information between the MD and NCMC simulations.

Integrators and Moves

Integrators Integrators are the lowest level openmm objects this package interacts with, where each intergrator is tied to an openmm.Context that it advances. Each integrator is generated by using the embedded function _get_integrator() function within each move class. The integrators will control whether we are carrying out the Non-equilibirum Candidate Monte Carlo (NCMC) or Molecular Dynamics (MD) simulation.

Every move class has 3 hidden methods: _get_integrator() for generating the integrator of each move class, _before_integration() for performing any necessary setup before integration, and _after_integration() for performing any cleanup or data collection after integration. Every move class also contains the apply() method which carries out calls to the 3 hidden methods and stepping with the integrator.

In this package, we provide the move class blues.ncmc.ReportLangevinDynamicsMove to execute the MD simulation. As the name suggests, this will carry forward the MD simulation using Langevin dynamics, by generating an openmm.LangevinIntegrator. This class is essentially the same as the openmmtools.LangevinDynamicsMove but with modifications to the apply() method which allows storing simulation data for the MD simulation.

For running the NCMC simulation, we provide a custom integrator blues.integrator.AlchemicalExternalLangevinIntegrator. This integrator is generated in every move which inherits from the base class blues.ncmc.NCMCMove. Every class which inherits from the base move class must override the _propose_positions() method. If necessary, one can override the _before_integration() and _after_integration() methods for any necessary setup and cleanup. Again, these hidden methods will be called when a call is made to the apply() method from the move class.

Moves In order to implement custom NCMC moves, inherit from the base class and override the _propose_positions() method. This method is expected to take in a positions array of the atoms to be modified and returns the proposed positions. In pseudo-code, it would look something like:

from blues.ncmc import NCMCMove
class CustomNCMCMove(NCMCMove):
    def _propose_positions(positions):
        """Add 1 nanometer displacement vector."""
        positions_unit = positions.unit
        unitless_displacement = 1.0 / positions_unit
        displacement_vector = unit.Quantity(np.random.randn(3) * unitless_displacement_sigma, positions_unit)
        proposed_positions = positions + displacement_vector
        return proposed_positions

In this package, we provide the blues.ncmc.RandomLigandRotationMove in order to propose a random ligand rotation about the center of mass. This class overrides the _before_integration() method for obtaining the masses of the ligand and overrides the _propose_positions() function for generating the rotated coordinates. Updating the context with the rotated coordinates is handled when the apply() method is called in the move class. Code snippet of the class is shown below:

from blues.ncmc import RandomLigandRotationMove
class RandomLigandRotationMove(NCMCMove):
   def _before_integration(self, context, thermodynamic_state):
      """Obtain the masses of the ligand before integration."""
      super(RandomLigandRotationMove, self)._before_integration(context, thermodynamic_state)
      masses, totalmass = utils.getMasses(self.atom_subset, thermodynamic_state.topology)
      self.masses = masses
   def _propose_positions(self, positions):
       # Calculate the center of mass
       center_of_mass = utils.getCenterOfMass(positions, self.masses)
       reduced_pos = positions - center_of_mass
       # Define random rotational move on the ligand
       rand_quat = mdtraj.utils.uniform_quaternion(size=None)
       rand_rotation_matrix = mdtraj.utils.rotation_matrix_from_quaternion(rand_quat)
       # multiply lig coordinates by rot matrix and add back COM translation from origin
       proposed_positions = numpy.dot(reduced_pos, rand_rotation_matrix) * positions.unit + center_of_mass

       return proposed_positions

Since BLUES (v0.2.5) the API has been re-written to be more compatible with the openmmtools API. This means one can turn a regular Markov Chain Monte Carlo (MCMC) move from the openmmtools library into an NCMC move to be used in this package. In this case, one simply needs to make use of dual inheritance, using the blues.ncmc.NCMCMove that we provide and override the _get_integrator() method to generate the NCMC integrator we provide, i.e. blues.integrator.AlchemicalExternalLangevinIntegrator. When using dual inheritance, it is important that you first inherit the desired MCMC move and then the blues.ncmc.NCMCMove class. For example, if we wanted to take the openmmtools.mcmc.MCDisplacementMove class and turn it into an NCMC move, it would look like:

from blues.ncmc import NCMCMove
from openmmtools.mcmc import MCDisplacementMove
class NCMCDisplacementMove(MCDisplacementMove, NCMCMove):
    def _get_integrator(self, thermodynamic_state):
        return NCMCMove._get_integrator(self,thermodynamic_state)

BLUESSampler

The blues.ncmc.BLUESSampler object ties together all the previously mentioned state objects and the two move classes for running the NCMC+MD simulation. Details of the parameters for this class are listed in the Modules documentation. For a more detailed example of it’s usage see the Usage documentation.

To be explicit, the input parameters refer to the objects below:

  • thermodynamic_state : openmmtools.states.ThermodynamicState

  • alch_thermodynamic_state : openmmtools.states.CompoundThermodynamicState

  • sampler_state : openmmtools.states.SamplerState

  • dynamics_move : blues.ncmc.ReportLangevinDynamicsMove

  • ncmc_move : blues.ncmc.RandomLigandRotationMove

  • topology : openmm.Topology

When the run() method in the blues.ncmc.BLUESSampler is called the following takes place:

  • Initialization:
    • _print_host_info() : Information print out of host

    • _printSimulationTiming() : Calculation of total number of steps

    • equil() : Equilibration

  • BLUES iterations:
    • ncmc_move.apply() : NCMC simulation

    • _acceptRejectMove() : Metropolization

    • dynamics_move.apply() : MD Simulation

A code snippet of the run() method is shown below:

def run(self, n_iterations=1):
    context, integrator = cache.global_context_cache.get_context(self.thermodynamic_state)
    utils.print_host_info(context)
    self._printSimulationTiming(n_iterations)
    if self.iteration == 0:
        self.equil(1)

    self.iteration = 0
    for iteration in range(n_iterations):
        self.ncmc_move.apply(self.alch_thermodynamic_state, self.sampler_state)

        self._acceptRejectMove()

        self.dynamics_move.apply(self.thermodynamic_state, self.sampler_state)

        self.iteration += 1

Initialization

The first thing that occurs when run() is called is the initialization stage. During this stage, a call is made to utils.print_host_info() and the _printSimulationTiming() method which will print out some information about the host machine and the total number of force evaluations and simulation time. The output will look something like below:

In the blues.ncmc.BLUESSampler class, there is an equil() method which lets you run iterations of just the MD simulation in order to equilibrate your system before running the NCMC+MD hybrid simulation. An equilibration iteration, in this case is controlled by the given attribute n_steps from the dynamics_move class. For example, if I create a blues.ncmc.ReportLangevinDynamicsMove class with n_steps=20 and call the blues.ncmc.BLUESSampler.equil(n_iterations=100), this will run (n_steps x n_iterations) or 2000 steps of MD or 2 picoseconds of MD simulation time. When the run() method is called without a prior call to the equil() method, the class will always run 1 iteration of equilibration in order to set the initial conditions in the MD simulation. This is required prior to running the NCMC simulation.

BLUES Iterations

NCMC Simulation

After at least 1 iteration of equilibration, the blues.ncmc.BLUESSampler class will then proceed forward with running iterations of the NCMC+MD hybrid simulation. It will first run the NCMC simulation by calling the apply() method on the ncmc_move class or, for sake of this example, the blues.ncmc.RandomLigandRotationMove class. The apply() method for the ncmc_move will take in the alch_thermodynamic_state parameter or specifically the openmmtools.states.CompoundThermodynamicState object.

A code snippet of the ncmc_move.apply() method is shown below:

def apply(self, thermodynamic_state, sampler_state):
    if self.context_cache is None:
        context_cache = cache.global_context_cache
    else:
        context_cache = self.context_cache
    integrator = self._get_integrator(thermodynamic_state)
    context, integrator = context_cache.get_context(thermodynamic_state, integrator)
    sampler_state.apply_to_context(context, ignore_velocities=False)
    self._before_integration(context, thermodynamic_state)
    try:
        endStep = self.currentStep + self.n_steps
        while self.currentStep < endStep:
            alch_lambda = integrator.getGlobalVariableByName('lambda')
            if alch_lambda == 0.5:
                sampler_state.update_from_context(context)
                proposed_positions = self._propose_positions(sampler_state.positions[self.atom_subset])
                sampler_state.positions[self.atom_subset] = proposed_positions
                sampler_state.apply_to_context(context, ignore_velocities=True)

            nextSteps = endStep - self.currentStep
            stepsToGo = nextSteps
            while stepsToGo > 10:
                integrator.step(10)
                stepsToGo -= 10
            integrator.step(stepsToGo)
            self.currentStep += nextSteps
    except Exception as e:
        print(e)
    else:
        context_state = context.getState(
            getPositions=True,
            getVelocities=True,
            getEnergy=True,
            enforcePeriodicBox=thermodynamic_state.is_periodic)

        self._after_integration(context, thermodynamic_state)
        sampler_state.update_from_context(
            context_state, ignore_positions=False, ignore_velocities=False, ignore_collective_variables=True)
        sampler_state.update_from_context(
            context, ignore_positions=True, ignore_velocities=True, ignore_collective_variables=False)

When the apply() method on ncmc_move is called, it will first generate the blues.integrators.AlchemicalExternalLangevinIntegrator by calling the _get_integrator() method inherent to the move class. Then, it will create (or fetch from the context_cache) a corresponding openmm.Context given the alch_thermodynamic_state. Next, the sampler_state which contains the last state of the MD simulation is synced to the newly created context from the corresponding alch_thermodynamic_state. Particularly, the context will be updated with the box_vectors, positions, and velocities from the last state of the MD simulation.

Just prior to integration, a call is made to the _before_integration() method in order to store the initial energies, positions, box_vectors and the masses of the ligand to be rotated. Then, we actually step with the integrator where we perform the ligand rotation when lambda has reached the half-way point or lambda=0.5, continuing integration until we have completed the n_steps. After the integration steps have been completed, a call is made to the _after_integration() method to store the final energies, positions, and box_vectors. Lastly, the sampler_state is updated from the final state of the context.

Metropolization

After advancing the NCMC simulation, a call is made to the _acceptRejectMove() method embedded in the blues.ncmc.BLUESSampler class for metropolization of the proposed move.

A code snippet of the _acceptRejectMove() is shown below:

def _acceptRejectMove(self):
    integrator = self.dynamics_move._get_integrator(self.thermodynamic_state)
    context, integrator = cache.global_context_cache.get_context(self.thermodynamic_state, integrator)
    self.sampler_state.apply_to_context(context, ignore_velocities=True)
    alch_energy = self.thermodynamic_state.reduced_potential(context)

    correction_factor = (self.ncmc_move.initial_energy - self.dynamics_move.final_energy + alch_energy - self.ncmc_move.final_energy)
    logp_accept = self.ncmc_move.logp_accept
    randnum = numpy.log(numpy.random.random())

    logp_accept = logp_accept + correction_factor
    if (not numpy.isnan(logp_accept) and logp_accept > randnum):
        self.n_accepted += 1
    else:
        self.accept = False
        self.sampler_state.positions = self.ncmc_move.initial_positions
        self.sampler_state.box_vectors = self.ncmc_move.initial_box_vectors

Here, is we compute a correction term for switching between the MD and NCMC integrators and factor this in with natural log of the acceptance probability (logp_accept). Then, a random number is generated in which: the move is accepted if the random number is less than the logp_accept or rejected if greater. When the move is rejected, we set the positions and box_vectors on the sampler_state to the initial positions and box_vectors from the NCMC simulation. If the move is accepted, nothing on the sampler_state is changed so that the following MD simulation will contain the final state of the NCMC simulation.

MD Simulation

After metropolization of the previously proposed move, a call is made to the apply() method on the given dynamics_move object. In this example, this would refer to the blues.ncmc.ReportLangevinDynamicsMove class to run the MD simulation.

A code snippet of the dynamics_move.apply() method is shown below:

def apply(self, thermodynamic_state, sampler_state):
    if self.context_cache is None:
        context_cache = cache.global_context_cache
    else:
        context_cache = self.context_cache

    integrator = self._get_integrator(thermodynamic_state)
    context, integrator = context_cache.get_context(thermodynamic_state, integrator)
    thermodynamic_state.apply_to_context(context)

    sampler_state.apply_to_context(context, ignore_velocities=self.reassign_velocities)
    if self.reassign_velocities:
        context.setVelocitiesToTemperature(thermodynamic_state.temperature)

    self._before_integration(context, thermodynamic_state)
    try:
        endStep = self.currentStep + self.n_steps
        while self.currentStep < endStep:
            nextSteps = endStep - self.currentStep
            stepsToGo = nextSteps
            while stepsToGo > 10:
                integrator.step(10)
                stepsToGo -= 10
            integrator.step(stepsToGo)
            self.currentStep += nextSteps

    except Exception as e:
        print(e)

    else:
        context_state = context.getState(
            getPositions=True,
            getVelocities=True,
            getEnergy=True,
            enforcePeriodicBox=thermodynamic_state.is_periodic)
        self._after_integration(context, thermodynamic_state)
        sampler_state.update_from_context(
            context_state, ignore_positions=False, ignore_velocities=False, ignore_collective_variables=True)
        sampler_state.update_from_context(
            context, ignore_positions=True, ignore_velocities=True, ignore_collective_variables=False)

When the apply() method is called, a very similar procedure to the NCMC simulation occurs. The first thing that happens is to generate the integrator through a call to _get_integrator(), where in this given class, it will generate an openmm.LangevinIntegrator given the thermodynamic_state parameter. Then, it will create (or fetch from the context_cache) a corresponding openmm.Context given the thermodynamic_state. Next, the sampler_state, which contains the last state of the NCMC simulation if the previous move was accepted or the initial state of the NCMC simulation if the move was rejected, is used to update box_vectors and positions in the newly created openmm.Context. In this case, we reassign the velocities in the MD simulation in order to preserve detailed balance.

Following, a call is made to _before_integration() to store the intial positions, box_vectors and energies and then we carry forward with the integration for n_steps. After the integration steps have been completed, a call is made to the _after_integration() method to store the final energies and positions. Lastly, the sampler_state object is updated from the final state of the MD simulation context.

This completes 1 iteration of the BLUES cycle. Here, the sampler_state is then used to sync the final state of the MD simulation (i.e. box_vectors, positions, and velocities) from the previous iteration to the NCMC simulation of the next iteration. Then, we repeat the cycle of NCMC -> Metropolization -> MD for the given number of iterations.

Indices and tables