Difference between revisions of "Simulating States of Interest"

From AlchemistryWiki
Jump to navigation Jump to search
m
Line 13: Line 13:
 
It is actually rather difficult for your simulation to ''start'' in equilibrium, especially for explicit solvent systems where the solvent is automatically generated. Many simulation software packages actually run or recommend you run an energy minimization step prior to beginning to correct for possible overlapping molecules. These alone are not enough to sufficiently bring the system into equilibrium. The best way to do this is to actually run the simulation for a period of time and then simply exclude those samples from analysis. This equilibration '''must be done at every intermediate''' too.
 
It is actually rather difficult for your simulation to ''start'' in equilibrium, especially for explicit solvent systems where the solvent is automatically generated. Many simulation software packages actually run or recommend you run an energy minimization step prior to beginning to correct for possible overlapping molecules. These alone are not enough to sufficiently bring the system into equilibrium. The best way to do this is to actually run the simulation for a period of time and then simply exclude those samples from analysis. This equilibration '''must be done at every intermediate''' too.
  
One efficient scheme to get close to equilibration at each lambda is to run short (10-100 ps) simulations at each state, then restart/begin collecting data with the configuration at the end of this time window. This will not guarantee a full relaxation of the system, but it does significantly help reduce potential instabilities in your simulation; full relaxation can take 100s to 1000s of ps or longer in some cases.{{cite|Fujitani2005|Fujitani, H., Tanida, Y., Ito, M., Shirts, M. R., Jayachandran, G., Snow, C. D., Sorin, E. J., and Pande, V. S. (2005) Direct calculation of the binding free energies of FKBP ligands. ''J. Chem. Phys.'' 123, 084108.|http://www.citeulike.org/group/14929/article/9052204}} As <math>\lambda</math> changes, the volume should be allowed to change as well so the solvent density of the system does not change as a result of state. Liquids are incompressible, which means that any small change in volume can have stark changes in the thermodynamic properties. Even when running at [[Theoretic_Principals#Nomenclature and Variables|NVT]] conditions, the average volume from an [[Theoretic_Principals#Nomenclature and Variables|NPT]] equilibration should be used as box fluctuations can cause free energy differences of 0.1-0.3 kcal/mol for typical small molecule solvation.
+
One efficient scheme to get close to equilibration at each lambda is to run short (10-100 ps) simulations at each state, then restart/begin collecting data with the configuration at the end of this time window. This will not guarantee a full relaxation of the system, but it does significantly help reduce potential instabilities in your simulation; full relaxation can take 100s to 1000s of ps or longer in some cases.{{cite|Fujitani2005}} As <math>\lambda</math> changes, the volume should be allowed to change as well so the solvent density of the system does not change as a result of state. Liquids are incompressible, which means that any small change in volume can have stark changes in the thermodynamic properties. Even when running at [[Theoretic_Principals#Nomenclature and Variables|NVT]] conditions, the average volume from an [[Theoretic_Principals#Nomenclature and Variables|NPT]] equilibration should be used as box fluctuations can cause free energy differences of 0.1-0.3 kcal/mol for typical small molecule solvation.
  
Solvating small molecules may take 100-500ps, but this is a best case scenario. Large protein-ligand systems started out of equilibrium have very lengthy [[Simulation Information Gathering#Correlation| correlation times]] and may take hundreds of ''nano''seconds to equilibrate. Things to watch for when checking equilibration are the system energy (<math>U</math>), {{#tag:math|\left\langle \mathrm{d} U / \mathrm{d} \lambda \right\rangle}}, and the number of hydrogen bonds (for small molecules) for convergence; the hydrogen bonds may be slow to converge.{{cite|Smith2002|Smith, L. J., Daura, X., and van Gunsteren, W. F. (2002) Assessing equilibration and convergence in biomolecular simulations. ''Proteins: Struct., Funct., Bioinf.'' 48, 487–496.}}
+
Solvating small molecules may take 100-500ps, but this is a best case scenario. Large protein-ligand systems started out of equilibrium have very lengthy [[Simulation Information Gathering#Correlation| correlation times]] and may take hundreds of ''nano''seconds to equilibrate. Things to watch for when checking equilibration are the system energy (<math>U</math>), {{#tag:math|\left\langle \mathrm{d} U / \mathrm{d} \lambda \right\rangle}}, and the number of hydrogen bonds (for small molecules) for convergence; the hydrogen bonds may be slow to converge.{{cite|Smith2002}}
  
 
==Running Nonequilibrium Simulations==
 
==Running Nonequilibrium Simulations==
Although all the methods discussed here require systems to be at equilibrium, there are free energy methods where some thermodynamic variables change over time, and some amount of work, ''W'', is required to make this change. In accordance with basic thermodynamic principals, if the change is done infinitely slowly, the process is considered reversible and the work will be equal to the free energy difference. Since we can't run for an infinite amount of time, ''W'' will not equal the free energy change and the process will not be reversible. Despite this, it is still possible to write out a free energy change in a formalism developed by Jarzynski{{cite|Jarzynski1997|Jarzynski, C. (1997) Nonequilibrium equality for free energy differences. ''Phys. Rev. Lett'' 78, 2690–2693.|http://www.citeulike.org/group/14929/article/9052290}} where the average over the nonequilibrium trajectories which ''started from an equilibrium ensemble'' can give you the free energy; The equation is,
+
Although all the methods discussed here require systems to be at equilibrium, there are free energy methods where some thermodynamic variables change over time, and some amount of work, ''W'', is required to make this change. In accordance with basic thermodynamic principals, if the change is done infinitely slowly, the process is considered reversible and the work will be equal to the free energy difference. Since we can't run for an infinite amount of time, ''W'' will not equal the free energy change and the process will not be reversible. Despite this, it is still possible to write out a free energy change in a formalism developed by Jarzynski{{cite|Jarzynski1997}} where the average over the nonequilibrium trajectories which ''started from an equilibrium ensemble'' can give you the free energy; The equation is,
  
 
:<math>\Delta G = -\beta^{-1}\ln\left\langle\exp\left(-\beta W\right)\right\rangle_0</math>.
 
:<math>\Delta G = -\beta^{-1}\ln\left\langle\exp\left(-\beta W\right)\right\rangle_0</math>.
  
If the switch is instantaneous, then this equation is reduced to [[Exponential Averaging|EXP]] as ''W'' reduces to <math>\Delta U_{ij}</math>. Although this equation is rooted in EXP, it is possible to construct a variant of [[Bennett Acceptance Ratio|BAR]] to do nonequilibrium work, but not MBAR.{{cite|Crooks2000|Crooks, G. E. (2000) Path-ensemble averages in systems driven far from equilibrium. ''Phys. Rev. E'' 61, 2361–2366.|http://www.citeulike.org/group/14929/article/9052148}}{{cite|Shirts2003|Shirts, M. R., Bair, E., Hooker, G., and Pande, V. S. (2003) Equilibrium free energies from nonequilibrium measurements using maximum-likelihood methods. ''Phys. Rev. Lett 91'', 140601.|http://www.citeulike.org/group/14929/article/9052565}} There are a number of studies available for nonequalibrium methods, and it looks as though starting from an equilibrium ensemble is just as if not slightly more efficient than starting from nonequilibrium ensembles.{{cite|Oostenbrink2006|Oostenbrink, C., and van Gunsteren,W. F. (2006) Calculating zeros: Non-equilibrium free energy calculations. ''Chem. Phys.'' 323, 102–108.|http://www.citeulike.org/group/14929/article/9052473}}{{cite|Oberhofer2005|Oberhofer, H., Dellago, C., and Geissler, P. L. (2005) Biased Sampling of Nonequilibrium Trajectories: Can Fast Switching Simulations Outperform Conventional Free Energy Calculation Methods? ''J. Phys. Chem. B'' 109, 6902–6915.|http://www.citeulike.org/group/14929/article/9052461}}
+
If the switch is instantaneous, then this equation is reduced to [[Exponential Averaging|EXP]] as ''W'' reduces to <math>\Delta U_{ij}</math>. Although this equation is rooted in EXP, it is possible to construct a variant of [[Bennett Acceptance Ratio|BAR]] to do nonequilibrium work, but not MBAR.{{cite|Crooks2000}}{{cite|Shirts2003}} There are a number of studies available for nonequalibrium methods, and it looks as though starting from an equilibrium ensemble is just as if not slightly more efficient than starting from nonequilibrium ensembles.{{cite|Oostenbrink2006}}{{cite|Oberhofer2005}}
  
In general, it is not recommended for beginners to work with nonequilibrium simulations as there are a number of intricacies involved with setting them up. Even so, there is substantial work in developing this field as it may help improve the efficiency of free energy calculations in general.{{cite|Pohorille2010|Pohorille, A., Jarzynski, C., and Chipot, C. (2010) Good Practices in Free-Energy Calculations. ''J. Phys. Chem. B'' 114, 10235–10253.|http://www.citeulike.org/group/14929/article/7540599}}
+
In general, it is not recommended for beginners to work with nonequilibrium simulations as there are a number of intricacies involved with setting them up. Even so, there is substantial work in developing this field as it may help improve the efficiency of free energy calculations in general.{{cite|Pohorille2010}}
  
 
=Collecting from States of Interest=
 
=Collecting from States of Interest=
Line 37: Line 37:
  
 
=References=
 
=References=
<references />
+
<references>
 +
{{cite|Fujitani2005|Fujitani, H., Tanida, Y., Ito, M., Shirts, M. R., Jayachandran, G., Snow, C. D., Sorin, E. J., and Pande, V. S. (2005) Direct calculation of the binding free energies of FKBP ligands. ''J. Chem. Phys.'' 123, 084108.|http://www.citeulike.org/group/14929/article/9052204}}
 +
 
 +
{{cite|Smith2002|Smith, L. J., Daura, X., and van Gunsteren, W. F. (2002) Assessing equilibration and convergence in biomolecular simulations. ''Proteins: Struct., Funct., Bioinf.'' 48, 487–496.}}
 +
 
 +
{{cite|Jarzynski1997|Jarzynski, C. (1997) Nonequilibrium equality for free energy differences. ''Phys. Rev. Lett'' 78, 2690–2693.|http://www.citeulike.org/group/14929/article/9052290}}
 +
 
 +
{{cite|Crooks2000|Crooks, G. E. (2000) Path-ensemble averages in systems driven far from equilibrium. ''Phys. Rev. E'' 61, 2361–2366.|http://www.citeulike.org/group/14929/article/9052148}}
 +
 
 +
{{cite|Shirts2003|Shirts, M. R., Bair, E., Hooker, G., and Pande, V. S. (2003) Equilibrium free energies from nonequilibrium measurements using maximum-likelihood methods. ''Phys. Rev. Lett 91'', 140601.|http://www.citeulike.org/group/14929/article/9052565}}
 +
 
 +
{{cite|Oostenbrink2006|Oostenbrink, C., and van Gunsteren,W. F. (2006) Calculating zeros: Non-equilibrium free energy calculations. ''Chem. Phys.'' 323, 102–108.|http://www.citeulike.org/group/14929/article/9052473}}
 +
 
 +
{{cite|Oberhofer2005|Oberhofer, H., Dellago, C., and Geissler, P. L. (2005) Biased Sampling of Nonequilibrium Trajectories: Can Fast Switching Simulations Outperform Conventional Free Energy Calculation Methods? ''J. Phys. Chem. B'' 109, 6902–6915.|http://www.citeulike.org/group/14929/article/9052461}}
 +
 
 +
{{cite|Pohorille2010|Pohorille, A., Jarzynski, C., and Chipot, C. (2010) Good Practices in Free-Energy Calculations. ''J. Phys. Chem. B'' 114, 10235–10253.|http://www.citeulike.org/group/14929/article/7540599}}
 +
</references>

Revision as of 18:24, 18 June 2013



This section should serve as a guide for the preparation of the simulation as a whole and data collection, not as an answer to the question of "which simulation software should I use?" So long as the software can handle free energy calculations (or you can make it), and collects the information the analysis method that you choose needs, then any software is acceptable. Some software may be more suited to certain methods than others, but analysis of this is beyond the scope of this page.

There are three core things you need when running simulations (although more will be discussed below); those items are:

  1. Simulations must be run at equilibrium. Special rules apply when running nonequilibrium simulations.
  2. All samples must be collected from states of interest.
  3. Samples used in the analysis must be uncorrelated and independent.

Running Simulations at Equilibrium

All simulations used for analysis with any of these methods should be done at equilibrium. This is true not only for the end points, but also for every intermediate state you have. Because rare events are often given such a large importance in most simulation packages, running nonequilibrium simulations will give you the wrong results. There are some methods which require nonequilibrium conditions, but even those must start at equilibrium before perturbing the system. There are no hard and fast rules for determining exactly how long one needs to run before equilibration time, so you must use your expertise to ensure that the system is correctly equilibrated.

It is actually rather difficult for your simulation to start in equilibrium, especially for explicit solvent systems where the solvent is automatically generated. Many simulation software packages actually run or recommend you run an energy minimization step prior to beginning to correct for possible overlapping molecules. These alone are not enough to sufficiently bring the system into equilibrium. The best way to do this is to actually run the simulation for a period of time and then simply exclude those samples from analysis. This equilibration must be done at every intermediate too.

One efficient scheme to get close to equilibration at each lambda is to run short (10-100 ps) simulations at each state, then restart/begin collecting data with the configuration at the end of this time window. This will not guarantee a full relaxation of the system, but it does significantly help reduce potential instabilities in your simulation; full relaxation can take 100s to 1000s of ps or longer in some cases.[1] As [math]\displaystyle{ \lambda }[/math] changes, the volume should be allowed to change as well so the solvent density of the system does not change as a result of state. Liquids are incompressible, which means that any small change in volume can have stark changes in the thermodynamic properties. Even when running at NVT conditions, the average volume from an NPT equilibration should be used as box fluctuations can cause free energy differences of 0.1-0.3 kcal/mol for typical small molecule solvation.

Solvating small molecules may take 100-500ps, but this is a best case scenario. Large protein-ligand systems started out of equilibrium have very lengthy correlation times and may take hundreds of nanoseconds to equilibrate. Things to watch for when checking equilibration are the system energy ([math]\displaystyle{ U }[/math]), [math]\displaystyle{ \left\langle \mathrm{d} U / \mathrm{d} \lambda \right\rangle }[/math], and the number of hydrogen bonds (for small molecules) for convergence; the hydrogen bonds may be slow to converge.[2]

Running Nonequilibrium Simulations

Although all the methods discussed here require systems to be at equilibrium, there are free energy methods where some thermodynamic variables change over time, and some amount of work, W, is required to make this change. In accordance with basic thermodynamic principals, if the change is done infinitely slowly, the process is considered reversible and the work will be equal to the free energy difference. Since we can't run for an infinite amount of time, W will not equal the free energy change and the process will not be reversible. Despite this, it is still possible to write out a free energy change in a formalism developed by Jarzynski[3] where the average over the nonequilibrium trajectories which started from an equilibrium ensemble can give you the free energy; The equation is,

[math]\displaystyle{ \Delta G = -\beta^{-1}\ln\left\langle\exp\left(-\beta W\right)\right\rangle_0 }[/math].

If the switch is instantaneous, then this equation is reduced to EXP as W reduces to [math]\displaystyle{ \Delta U_{ij} }[/math]. Although this equation is rooted in EXP, it is possible to construct a variant of BAR to do nonequilibrium work, but not MBAR.[4][5] There are a number of studies available for nonequalibrium methods, and it looks as though starting from an equilibrium ensemble is just as if not slightly more efficient than starting from nonequilibrium ensembles.[6][7]

In general, it is not recommended for beginners to work with nonequilibrium simulations as there are a number of intricacies involved with setting them up. Even so, there is substantial work in developing this field as it may help improve the efficiency of free energy calculations in general.[8]

Collecting from States of Interest

This may seem like a unnecessary point, but it is worth mentioning. The main purpose of stating this is to draw attention to the sensitivity of parameters you choose to define your "end state" at. If a given set of parameters is [math]\displaystyle{ \lambda }[/math] dependent, then the total change in free energy could well be very large. As such, one needs to ensure that the changes they are inducing to a molecule are as few as possible, or at the very least as efficient as possible.

Take for instance treating your simulation's long range electrostatics with particle mesh Ewald (PME). Under non-alchemical transformations, there are several "standard" choices for the parameters which are perfectly acceptable for most simulations. However, if you use this "standard" set when modifying partial charges, you can get significant energy differences up to 4 kcal/mol on some molecules. This just shows one example of how parameters that are considered less important in regular simulations are now important in free energy simulations. The general rule to keep in mind for this is: if the potential energy is changed by the change of a parameter, then dependence of free energy on this parameter should be checked.

Independent and Uncorrelated Samples

The samples must be independent, meaning they are uncorrelated in time; this is a basic assumption of all the theories presented here in the fundamentals section. However, for all but the simplest of systems, completely independent samples can be very difficult to generate. For protein-ligand binding affinities, the time scale for some motions may be 100s of ns, meaning truly uncorrelated samples may be impossible to generate in a reasonable amount of time with today's simulation technology. In this case, free energy calculations might provide some useful information, but will only be approximations to the correct free energy for that model and cannot be considered reliable.

There are methods for subsampling simulated states and finding correlation times to work around the independent sampling problem, but those have been relegated to their own section.

References

  1. Fujitani, H., Tanida, Y., Ito, M., Shirts, M. R., Jayachandran, G., Snow, C. D., Sorin, E. J., and Pande, V. S. (2005) Direct calculation of the binding free energies of FKBP ligands. J. Chem. Phys. 123, 084108. - Find at Cite-U-Like
  2. Smith, L. J., Daura, X., and van Gunsteren, W. F. (2002) Assessing equilibration and convergence in biomolecular simulations. Proteins: Struct., Funct., Bioinf. 48, 487–496.
  3. Jarzynski, C. (1997) Nonequilibrium equality for free energy differences. Phys. Rev. Lett 78, 2690–2693. - Find at Cite-U-Like
  4. Crooks, G. E. (2000) Path-ensemble averages in systems driven far from equilibrium. Phys. Rev. E 61, 2361–2366. - Find at Cite-U-Like
  5. Shirts, M. R., Bair, E., Hooker, G., and Pande, V. S. (2003) Equilibrium free energies from nonequilibrium measurements using maximum-likelihood methods. Phys. Rev. Lett 91, 140601. - Find at Cite-U-Like
  6. Oostenbrink, C., and van Gunsteren,W. F. (2006) Calculating zeros: Non-equilibrium free energy calculations. Chem. Phys. 323, 102–108. - Find at Cite-U-Like
  7. Oberhofer, H., Dellago, C., and Geissler, P. L. (2005) Biased Sampling of Nonequilibrium Trajectories: Can Fast Switching Simulations Outperform Conventional Free Energy Calculation Methods? J. Phys. Chem. B 109, 6902–6915. - Find at Cite-U-Like
  8. Pohorille, A., Jarzynski, C., and Chipot, C. (2010) Good Practices in Free-Energy Calculations. J. Phys. Chem. B 114, 10235–10253. - Find at Cite-U-Like