Multistate Bennett Acceptance Ratio

From AlchemistryWiki
Jump to: navigation, search



The Multistate Bennett Acceptance Ratio (MBAR)[1] is a direct extension to BAR as it allows for assessing data from all states, and predicting the free energy at an unsampled state. MBAR reduces to BAR in the limit that only two states are sampled. This equation of free energy calculations can also be seen as a zero-width bin WHAM.

Much like WHAM, the free energies provided by this method are only a statistical estimator, however, MBAR has been shown to have the lowest variance estimator to date.

Contents

Derivation

MBAR is derived from a set of $ K \times K $ weighting functions, $ \alpha_{i,j}(\vec{q}) $, that minimized the variance during the reweighting across the board. Starting from our core free energy equation, we have

$ \displaystyle\Delta A_{ij} = -\beta^{-1} \ln\frac{Q_j}{Q_i} $

We can also manipulate the same identity Bennett used when starting his derivation[2] and then a one extra bit to come up with the relation

$ Q_i\left\langle\alpha_{ij}\exp\left(-\beta U_j\right)\right\rangle_i = Q_j\left\langle\alpha_{ij}\exp\left(-\beta U_i\right)\right\rangle_j $

Although this may not seem like much, it does allow us to write out the following:

$ \displaystyle \sum\limits_{j=1}^K \frac{\hat{Q_i}}{N_i} \sum\limits_{n=1}^{N_i} \alpha_{ij}\exp\left(-\beta U_j(\vec{q}_{i,n})\right) = \sum\limits_{j=1}^K \frac{\hat{Q_j}}{N_j} \sum\limits_{n=1}^{N_j} \alpha_{ij}\exp\left(-\beta U_i(\vec{q}_{j,n})\right) $

assuming we use the empirical estimator for the expectation values of $ \left\langle g \right\rangle_i = N_i^{-1}\sum_{n=1}^{N_i}g(\vec{q}_{i,n}) $

Choosing the optimal $ \alpha_{ij} $ can be done by looking through the literature at extended bridge sampling.[3] We then get an $ \alpha_{ij} $ of:

$ \displaystyle \alpha_{ij} = \frac{N_j \hat{c_j}^{-1}}{\sum\limits_{k=1}^K N_k \hat{c_k}^{-1} \exp\left(-\beta U_k \right)} $.

After making all the necessary substitutions, we can get an expression for an estimated free energy of:

$ \displaystyle \hat{A_i} = -\beta^{-1} \ln \sum\limits_{j=1}^K \sum\limits_{n=1}^{N_j} \frac{\exp\left[-\beta U_i\right]}{\sum_{k=1}^K N_k \exp\left[\beta\hat{A_k} - \beta U_k\right]} $.

One of the first things you should notice is that we have a single free energy, not a difference. This is not a typo, but the free energies for a given set of states is only uniquely determined up to an additive constant. Because of this, one free energy must be taken in reference and thus we are once again calculating free energy differences.

Reduced potential

It is important to note that the $ U $ that appears in the MBAR derivation and equation is not only valid for potential energies, but any generalized/reduced potential as a function of pressure, volume, chemical potential, and number of particles. For example, in a general form, we can take some subset of the additive terms in the following to define the reduced potential $ u_i(x) $ for thermodynamic state $ i $:

$ \displaystyle u_i(x) \equiv \beta_i [ U_i(x) + p_i V(x) + {\bf \mu}_i^\mathrm{T} {\bf N}(x) ] $

The reduced potential function is uniquely defined by some combination of thermodynamic parameters $ \beta_i $ denoting the inverse temperature, $ U_i(x) $ denoting the potential energy function, $ p_i $ denoting the pressure, and $ {\bf \mu}_i $ denoting the chemical potential of one or more components of the system. These latter two thermodynamic variables are conjugate to the box volume $ V(x) $ and particle numbers $ {\bf N}(x) $. Use of the reduced potential simplifies the MBAR equations and generalizes them to the computation of arbitrary reduced free energy differences $ \Delta f_{ij} \equiv \beta_j A_j - \beta_i A_i $ among states.

$ \displaystyle \hat{f_i} = -\ln \sum\limits_{j=1}^K \sum\limits_{n=1}^{N_j} \frac{\exp\left[-u_i(x_{nj})\right]}{\sum_{k=1}^K N_k \exp\left[\hat{f_k} - u_k(x_{nj})\right]} $.

In the sum above $ x_{nj} $ indicates the $ n $th sample from the $ j $th state. Interestingly, in this formula, it actually doesn't matter which sample comes from which state, so it can be rewritten as a sum over all $ N = \sum_{j=1}^K N_j $, so that we have: $ \displaystyle \hat{f_i} = -\ln \sum\limits_{n=1}^{N} \frac{\exp\left[-u_i(x_{n})\right]}{\sum_{k=1}^K N_k \exp\left[\hat{f_k} - u_k(x_{n})\right]} $

Finally, one can rewrite this as: $ \displaystyle \hat{f_i} = -\ln \left\langle \frac{\exp\left[-u_i(x_{n})\right]}{\sum_{k=1}^K \frac{N_k}{N} \exp\left[\hat{f_k} - u_k(x_{n})\right]}\right\rangle $

Where the ensemble averaged over is the mixture ensemble that consist of samples taken from each state $ i $ $ \frac{N_i}{N} $th of the time.

Remember always the free energy will change depending on which reduced potential you use, so please take this into account when working with MBAR.

Estimating Free Energies with MBAR

MBAR provides the direct equation to find free energies as show above assuming you have the energies. This can be rather difficult to implement for beginners since this does require iterative solutions like BAR (note: see the free, Python implementation below). Despite this, MBAR still has the lowest variance of all the other methods listed under the theory section of the fundamentals. Further, MBAR has a direct way to calculate errors (see paper[1] for derivation).

You will need a complete set of $ \Delta U_{k,j} $ for all $ K $ states for MBAR to work, just like you do in WHAM. Fortunately, you can do most of this in post processing if need be. Once again, you do not need to calculate $ \Delta U_{k,k} $, but it is still a good self check to ensure the reweighting method is working correctly; all the remaining checks for WHAM hold as well and should be followed when analyzing data.

Download MBAR

MBAR may seem like a daunting set of equations to program yourself, so the authors have provided a Python implementation of MBAR for anyone to use, free of charge at http://github.org/choderalab/pymbar, with a number of examples for a variety of situations at http://github.org/choderalab/pymbar-examplesThe software comes with examples and uses cases. Also bundled with it are the tools to compute expectation values and an implementation of BAR.

References

  1. 1.0 1.1 Shirts, M. R., and Chodera, J. D. (2008) Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129, 129105. - Find at Cite-U-Like
  2. Bennett, C. H. (1976) Efficient Estimation of Free Energy differences from Monte Carlo Data. J. Comput. Phys. 22, 245–268. - Find at Cite-U-Like
  3. Tan, Z. (2004). On a Likelihood Approach for Monte Carlo Integration. J. Am. Stat. Assoc., 99(468), 1027–1036.