Difference between revisions of "Weighted Histogram Analysis Method"

From AlchemistryWiki
Jump to navigation Jump to search
(Created page with "paceholder!")
 
Line 1: Line 1:
paceholder!
+
{{Fundamentals | cTopic=Theory}}
 +
 
 +
The Weighted Histogram Analysis Method (WHAM) is one of the earliest methods that take into account information from all [[Intermediate States]]. By analyzing all the information at once, we can reduce the number of cycles and loops me must run through, improving efficiency. The precursor to WHAM and first version of multiple histogram relighting techniques came from Ferrenberg and Swendsen;{{cite|Ferrenberg1989|Ferrenberg, A. M., and Swendsen, R. H. (1989) Optimized Monte Carlo Data Analysis. ''Phys. Rev. Lett'' 63, 1195–1198.|http://www.citeulike.org/user/dgront/article/774372}} WHAM was developed later for alchemical simulations{{cite|Kumar1992|Kumar, S., Bouzida, D., Swendsen, R. H., Kollman, P. A., and Rosenberg, J. M. (1992) The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. ''J. Comput. Chem.'' 13, 1011–1021.|http://www.citeulike.org/user/dgront/article/774373}}.
 +
 
 +
WHAM works on the principal that if you have a discrete number of states, you can create a histogram with discrete bins that provide you a relative probability of observing the states of interest, assuming you create the bins along whatever reaction path you have selected. In our cases, the reaction path is the alchemical variable. From these probabilities, you can calculate free energies and other observables.
 +
 
 +
=Derivation=
 +
WHAM's derivation is best done by starting from visualizing the problem. If you are given <math>K</math> number of states, a number of run simulations, <math>S</math>, <math>N_i</math> samples from each simulation, and then <math>K</math> free energies to calculate: <math>A_1,\,A_2,\,\cdots A_K</math>; we now wish to assign the reweighed potentials into bins, <math>B</math>. Consider the [[Theoretic Principals#Core Free Energy Equation|core free energy difference]] equation of
 +
 
 +
:<math>\displaystyle \Delta A_{ij} = -k_B T \ln \frac{Q_j}{Q_i}</math>
 +
 
 +
Instead of solving this directly, we will look at the the probability density, which is the ratio between the two partition functions (i.e. ignore the <math-k_B T \ln</math> part). Recall that the parition function is often written as
 +
 
 +
:<math>\displaystyle Q_j = \int_{V_i} \exp\left(-\beta U \right)</math>,
 +
Q=\int_{V_i}\exp\left(-\beta U \right)\,\mathrm{d}\vec{q}
 +
 
 +
however, this can be misleading as we are not explicitly stating the density of states as the prefactor, <math>\Omega</math>. Accounting for discrete states, adding in the binning values <math>B</math>, we can write the density of states out for a given state, <math>i</math> as:
 +
 
 +
:<math>\displaystyle \Omega_i(\vec{q},\lambda) = B_i(\vec{q},\lambda)\exp\left[\left(\sum_{j=0}^S \beta_i \lambda_{j,i} \vec{q}_j\right)-\beta A_i\right]</math>
 +
 
 +
and the best estimate for the density of states is then
 +
 
 +
:<math>\displaystyle \sum_{i=1}^K \omega_i(\vec{q})\Omega_i(\vec{q},\lambda)</math>
 +
:<math>\displaystyle \sum_{i=1}^K \omega_i(\vec{q})=1</math>
 +
 
 +
The best estimators for <math>\omega_i</math> are the ones that minimize the statistical noise. It turns out then, that the best estimator is the one that minimize the error of <math>B_i</math> for all <math>i</math>. The error of any individual <math>B_i</math> is then
 +
 
 +
:<math>\delta^2 B_i = g_i\left\langle B_i \right\rangle</math>
 +
 
 +
where <math>g_i = 1+2\tau_i</math> and <math>\tau_i</math> is the correlation time of a given simulation. The best estimator for the expectation of the bin value is then
 +
 
 +
:<math>\displaystyle \hat{\left\langle B_i \right\rangle} = N_i\Omega\exp\left(\beta A_i - \beta\sum_{j=0}^S \lambda_{j,i} \vec{q}_j\right) </math>.
 +
 
 +
Please note that <math>\Omega</math> does ''not have a subscript'' in the previous equation.
 +
 
 +
From here you can just substitute back in all the way to <math>Q_j/Q_i</math> to get the final WHAM equation of
 +
 
 +
:<math>\displaystyle \frac{Q_j}{Q_i} = \frac{\sum\limits_{x=1}^K g_x^{-1} B_x \exp\left[-\beta\lambda_0\vec{q}_0 - \beta\sum\limits_{l=1}^S \lambda_l\hat{V}_l(\vec{q}) \right]}{\sum\limits_{y=1}^K g_y^{-1} N_y \exp\left[\beta A_y - \beta\lambda_0\vec{q}_0 - \beta\sum\limits_{m=1}^S \lambda_{m,y}\hat{V}_m(\vec{q}) \right]}</math>
 +
 
 +
where the knot subscript indicates the conditions at the unmodified state. It is then trivial to get a free energy once this ratio is found.
 +
 
 +
==Zero Width Bins==
 +
It is possible to have a bin width of zero with WHAM. Although this is a limiting case, it does clean up the equation a bit and looks like
 +
 
 +
:<math>\displaystyle \hat{A_i} = - \beta^{-1}\ln \sum_{k=1}^K \sum_{n=1}^{N_k} \frac{ \exp[-\beta U_i(\vec{q}_{kn})]}{\sum\limits_{k'=1}^K N_{k'} \, \exp[\beta A_{k'} - \beta U_{k'}(\vec{q}_{kn})]}</math>
 +
 
 +
which turns out to be the exact equation for [[Multistate Bennett Acceptance Ratio|MBAR]].{{cite|Shrits2008|Shirts, M. R., and Chodera, J. D. (2008) Statistically optimal analysis of samples from multiple equilibrium states. ''J. Chem. Phys.'' 129, 129105.|http://www.citeulike.org/user/jamesr/article/3360542}}
 +
 
 +
=Downloading WHAM=
 +
Because WHAM requires solving sets of non-linear equations, it is not recommended for beginners to write their own. Fortunately, many simulation packages, such as GROMACS and CHARMM already include WHAM driven free energy calculations.{{cite|Hub2010|Hub, J. S., de Groot, B. L., and van der Spoel, D. (2010) g_wham–A Free Weighted Histogram Analysis Implementation Including Robust Error and Autocorrelation Estimates. ''J. Chem. Theo. Comput.'' 6, 3713–3720.|http://www.citeulike.org/user/agrossfield/article/8443854}}{{cite|Souaille2001|Souaille, M., and Roux, B. (2001) Extension to the weighted histogram analysis method:
 +
combining umbrella sampling with free energy calculations. ''Comput. Phys. Commun.'' 135, 40–57.|http://www.citeulike.org/user/girayenkavi/article/2910539}}{{cite|Wang2006|Wang, J., Deng, Y., and Roux, B. (2006) Absolute Binding Free Energy Calculations Using Molecular Dynamics Simulations with Restraining Potentials. ''Biophys. J.'' 91, 2798–2814.|http://www.citeulike.org/user/kupopo/article/867499}} There are also other standalone packages available and so there should not be a need to write your own.
 +
 
 +
=References=
 +
<references />

Revision as of 14:19, 25 September 2012



The Weighted Histogram Analysis Method (WHAM) is one of the earliest methods that take into account information from all Intermediate States. By analyzing all the information at once, we can reduce the number of cycles and loops me must run through, improving efficiency. The precursor to WHAM and first version of multiple histogram relighting techniques came from Ferrenberg and Swendsen;[1] WHAM was developed later for alchemical simulations[2].

WHAM works on the principal that if you have a discrete number of states, you can create a histogram with discrete bins that provide you a relative probability of observing the states of interest, assuming you create the bins along whatever reaction path you have selected. In our cases, the reaction path is the alchemical variable. From these probabilities, you can calculate free energies and other observables.

Derivation

WHAM's derivation is best done by starting from visualizing the problem. If you are given [math]\displaystyle{ K }[/math] number of states, a number of run simulations, [math]\displaystyle{ S }[/math], [math]\displaystyle{ N_i }[/math] samples from each simulation, and then [math]\displaystyle{ K }[/math] free energies to calculate: [math]\displaystyle{ A_1,\,A_2,\,\cdots A_K }[/math]; we now wish to assign the reweighed potentials into bins, [math]\displaystyle{ B }[/math]. Consider the core free energy difference equation of

[math]\displaystyle{ \displaystyle \Delta A_{ij} = -k_B T \ln \frac{Q_j}{Q_i} }[/math]

Instead of solving this directly, we will look at the the probability density, which is the ratio between the two partition functions (i.e. ignore the <math-k_B T \ln</math> part). Recall that the parition function is often written as

[math]\displaystyle{ \displaystyle Q_j = \int_{V_i} \exp\left(-\beta U \right) }[/math],

Q=\int_{V_i}\exp\left(-\beta U \right)\,\mathrm{d}\vec{q}

however, this can be misleading as we are not explicitly stating the density of states as the prefactor, [math]\displaystyle{ \Omega }[/math]. Accounting for discrete states, adding in the binning values [math]\displaystyle{ B }[/math], we can write the density of states out for a given state, [math]\displaystyle{ i }[/math] as:

[math]\displaystyle{ \displaystyle \Omega_i(\vec{q},\lambda) = B_i(\vec{q},\lambda)\exp\left[\left(\sum_{j=0}^S \beta_i \lambda_{j,i} \vec{q}_j\right)-\beta A_i\right] }[/math]

and the best estimate for the density of states is then

[math]\displaystyle{ \displaystyle \sum_{i=1}^K \omega_i(\vec{q})\Omega_i(\vec{q},\lambda) }[/math]
[math]\displaystyle{ \displaystyle \sum_{i=1}^K \omega_i(\vec{q})=1 }[/math]

The best estimators for [math]\displaystyle{ \omega_i }[/math] are the ones that minimize the statistical noise. It turns out then, that the best estimator is the one that minimize the error of [math]\displaystyle{ B_i }[/math] for all [math]\displaystyle{ i }[/math]. The error of any individual [math]\displaystyle{ B_i }[/math] is then

[math]\displaystyle{ \delta^2 B_i = g_i\left\langle B_i \right\rangle }[/math]

where [math]\displaystyle{ g_i = 1+2\tau_i }[/math] and [math]\displaystyle{ \tau_i }[/math] is the correlation time of a given simulation. The best estimator for the expectation of the bin value is then

[math]\displaystyle{ \displaystyle \hat{\left\langle B_i \right\rangle} = N_i\Omega\exp\left(\beta A_i - \beta\sum_{j=0}^S \lambda_{j,i} \vec{q}_j\right) }[/math].

Please note that [math]\displaystyle{ \Omega }[/math] does not have a subscript in the previous equation.

From here you can just substitute back in all the way to [math]\displaystyle{ Q_j/Q_i }[/math] to get the final WHAM equation of

[math]\displaystyle{ \displaystyle \frac{Q_j}{Q_i} = \frac{\sum\limits_{x=1}^K g_x^{-1} B_x \exp\left[-\beta\lambda_0\vec{q}_0 - \beta\sum\limits_{l=1}^S \lambda_l\hat{V}_l(\vec{q}) \right]}{\sum\limits_{y=1}^K g_y^{-1} N_y \exp\left[\beta A_y - \beta\lambda_0\vec{q}_0 - \beta\sum\limits_{m=1}^S \lambda_{m,y}\hat{V}_m(\vec{q}) \right]} }[/math]

where the knot subscript indicates the conditions at the unmodified state. It is then trivial to get a free energy once this ratio is found.

Zero Width Bins

It is possible to have a bin width of zero with WHAM. Although this is a limiting case, it does clean up the equation a bit and looks like

[math]\displaystyle{ \displaystyle \hat{A_i} = - \beta^{-1}\ln \sum_{k=1}^K \sum_{n=1}^{N_k} \frac{ \exp[-\beta U_i(\vec{q}_{kn})]}{\sum\limits_{k'=1}^K N_{k'} \, \exp[\beta A_{k'} - \beta U_{k'}(\vec{q}_{kn})]} }[/math]

which turns out to be the exact equation for MBAR.[3]

Downloading WHAM

Because WHAM requires solving sets of non-linear equations, it is not recommended for beginners to write their own. Fortunately, many simulation packages, such as GROMACS and CHARMM already include WHAM driven free energy calculations.[4][5][6] There are also other standalone packages available and so there should not be a need to write your own.

References

  1. Ferrenberg, A. M., and Swendsen, R. H. (1989) Optimized Monte Carlo Data Analysis. Phys. Rev. Lett 63, 1195–1198. - Find at Cite-U-Like
  2. Kumar, S., Bouzida, D., Swendsen, R. H., Kollman, P. A., and Rosenberg, J. M. (1992) The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 13, 1011–1021. - Find at Cite-U-Like
  3. Shirts, M. R., and Chodera, J. D. (2008) Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129, 129105. - Find at Cite-U-Like
  4. Hub, J. S., de Groot, B. L., and van der Spoel, D. (2010) g_wham–A Free Weighted Histogram Analysis Implementation Including Robust Error and Autocorrelation Estimates. J. Chem. Theo. Comput. 6, 3713–3720. - Find at Cite-U-Like
  5. Souaille, M., and Roux, B. (2001) Extension to the weighted histogram analysis method: combining umbrella sampling with free energy calculations. Comput. Phys. Commun. 135, 40–57. - Find at Cite-U-Like
  6. Wang, J., Deng, Y., and Roux, B. (2006) Absolute Binding Free Energy Calculations Using Molecular Dynamics Simulations with Restraining Potentials. Biophys. J. 91, 2798–2814. - Find at Cite-U-Like