2014 Workshop Tuesday Lunchtime Discussion


Tuesday Lunch Discussion: Making free energy calculations robust.

Can we develop post-simulation health reports? Did things work the way we thought? Are we getting the right answers for the right reasons?

Discussion Leaders: John Chodera and Robert Abel

  • We should get the same answers if we run the same calculation in different programs
  • Deviation from crystal structures should be examined
    • There is a database of bound ligands, but crystal structure quality (especially ligand density) needs to be checked first
    • Although this gets ahead of the more basic question: did we check that the simulation worked right in the first place? (not validated)
  • e.g. if you ran with very few counterions (just enough to neutralize the system), you would probably undersample ion positions
    • Solution: don't run with only the minimal number of ions; use more (see the arithmetic sketched after this list)
      Do we worry that the protein is at a much higher effective concentration in simulation than in solution experiments?
  • This is related to box-size dependence
  • The ligand will also be affected, since we are really not at infinite dilution
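As a rough illustration of the counterion and concentration points above, here is a minimal Python sketch of the underlying arithmetic (the function names and the 8 nm box are illustrative, not from the discussion): how many salt ion pairs a cubic box needs to reach a target concentration, and the effective protein concentration implied by one copy per periodic box.

```python
AVOGADRO = 6.02214076e23  # 1/mol

def ion_pairs_for_concentration(box_edge_nm: float, conc_molar: float) -> int:
    """Number of salt ion pairs needed to reach conc_molar in a cubic box."""
    volume_liters = (box_edge_nm * 1e-8) ** 3  # 1 nm = 1e-8 dm, so dm^3 = L
    return round(conc_molar * AVOGADRO * volume_liters)

def protein_concentration_molar(box_edge_nm: float, n_copies: int = 1) -> float:
    """Effective protein concentration implied by one copy per periodic box."""
    volume_liters = (box_edge_nm * 1e-8) ** 3
    return n_copies / (AVOGADRO * volume_liters)

# Example: an 8 nm box at 150 mM NaCl
print(ion_pairs_for_concentration(8.0, 0.150))  # ~46 ion pairs
print(protein_concentration_molar(8.0))         # ~3.2 mM -- far above experiment
```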

We have a draft standards page on Alchemistry.org

  • This was a brain dump and we encourage people to contribute
  • We should script this process and run our checks
  • We may think about generating a pretty report, similar to a medical health report (a sketch follows below)
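As a purely hypothetical sketch of what such a scripted health report might look like (none of these names correspond to an existing tool), a small registry of named checks rendered as pass/fail lines:

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str

# Registry of health checks; each takes parsed simulation data (here a
# plain dict) and returns a CheckResult. The checks are placeholders.
CHECKS = []

def register(check):
    CHECKS.append(check)
    return check

@register
def energy_drift(sim):
    drift = sim["final_energy"] - sim["initial_energy"]
    return CheckResult("energy drift", abs(drift) < sim["drift_tolerance"],
                       f"drift = {drift:.2f} kJ/mol")

@register
def rmsd_plateau(sim):
    final = sim["rmsd"][-1]  # 0.3 nm threshold below is a placeholder
    return CheckResult("protein RMSD", final < 0.3,
                       f"final RMSD = {final:.2f} nm")

def health_report(sim) -> str:
    lines = ["=== Simulation health report ==="]
    for check in CHECKS:
        r = check(sim)
        lines.append(f"[{'PASS' if r.passed else 'FAIL'}] {r.name}: {r.message}")
    return "\n".join(lines)
```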

Before you can do good FEP, you need to do good simulations

At one company, a health check was built in which people eyeballed the results

Human judgment improved after the start of the project

How can we use previous results to get a sense of what is wrong?

What parts of the active site move for ligands in a congeneric series?

If a part of the protein we had not seen move before suddenly changes conformation, this is a red flag

We would like to have simulations where we see events happen a number of times, or never (not just once or twice)

  • If we see something occur only infrequently (i.e., it is slow), such as a loop opening, this should raise a red flag
  • Look at the number of rotamer transitions: many is good, one or a few is bad (see the sketch after this list)
  • We might still have events that are NOT coupled to the binding site, so are they important or can they be ignored?
  • We could use a Markov state model to detect the slow degrees of freedom
  • Could also look at the correlation of slow degrees of freedom with ∂U/∂λ
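A minimal numpy sketch of two of these checks, assuming a sidechain torsion timeseries and a ∂U/∂λ timeseries have already been extracted from the trajectory (all names here are illustrative):

```python
import numpy as np

def count_rotamer_transitions(chi_deg: np.ndarray) -> int:
    """Count transitions between three 120-degree torsion wells.
    A real implementation would add hysteresis so that boundary
    noise is not counted as transitions."""
    wells = (np.mod(chi_deg, 360.0) // 120).astype(int)
    return int(np.count_nonzero(np.diff(wells)))

def slow_dof_dudl_correlation(slow_dof: np.ndarray, dudl: np.ndarray) -> float:
    """Pearson correlation between a slow degree of freedom and dU/dlambda."""
    return float(np.corrcoef(slow_dof, dudl)[0, 1])

# Few transitions => the torsion is effectively unsampled: red flag.
# Strong correlation with dU/dlambda => the slow motion is coupled to
# the alchemical transformation and cannot be ignored.
```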

Detecting ergodicity problems with simulations vs. gross errors vs. setup errors

A pipeline could take a PDB, spin up short or single-frame simulations in multiple packages, and compare the results. This would catch about 80% of errors (a skeleton is sketched below)
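A hypothetical skeleton of such a cross-package consistency check; the package wrappers are stubs here, and real ones would call each package's own API or command line:

```python
def compare_single_point_energies(energy_fns: dict, pdb_path: str,
                                  tol_kj: float = 1.0) -> dict:
    """Compute one single-point energy per package wrapper and flag
    pairs that disagree by more than tol_kj (kJ/mol)."""
    energies = {name: fn(pdb_path) for name, fn in energy_fns.items()}
    names = sorted(energies)
    flags = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            flags[(a, b)] = abs(energies[a] - energies[b]) > tol_kj
    return {"energies": energies, "disagreements": flags}

# Usage: pass {"package_a": run_package_a, "package_b": run_package_b},
# where each wrapper sets up the same PDB and returns a potential energy.
```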

People can look for consistency in thermodynamic pathways (e.g., cycle closure, sketched below)
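A sketch of the cycle-closure check: around a closed thermodynamic cycle A → B → C → A, the relative free energies should sum to zero within error. The numbers below are invented for illustration.

```python
# Relative binding free energies (kcal/mol) around a closed cycle
# A -> B -> C -> A; placeholder values.
edges = {("A", "B"): 1.2, ("B", "C"): -0.4, ("C", "A"): -0.6}

closure_error = sum(edges.values())
print(f"cycle closure error = {closure_error:+.2f} kcal/mol")
# A closure error much larger than the combined error bars of the
# edges flags an inconsistency in one or more legs of the cycle.
```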

  • Building redundancy into workflows to add another layer of error checking
  • Replicating the same simulation and varying what should not matter (e.g., different initial velocities). We should get the same result, but we often don't
    • Initial binding modes, water placement
    • Danny used slightly different preps of the same protein
  • Could run multiple replicates of the simulation (3-5 times) to get error bars (see the sketch after this list)
    • Use different starting points
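A sketch of the replicate-based error bar, using placeholder ΔG values from five hypothetical repeats that differ only in initial velocities or starting structure:

```python
import numpy as np

# Hypothetical DeltaG estimates (kcal/mol) from 5 repeats of the same
# transformation; the numbers are invented for illustration.
dg_replicates = np.array([-6.8, -7.4, -6.5, -7.9, -7.1])

mean_dg = dg_replicates.mean()
# Standard error of the mean across independent replicates.
sem_dg = dg_replicates.std(ddof=1) / np.sqrt(len(dg_replicates))

print(f"DeltaG = {mean_dg:.2f} +/- {sem_dg:.2f} kcal/mol (n={len(dg_replicates)})")
# Scatter far beyond this error bar means something we varied that
# "should not matter" actually does.
```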

How much does this actually block/interrupt people?

  • Small typos can cause half of the simulations to fail
  • There are an infinite number of ways things can go wrong
  • These problems are clear to people who are at it for the 10th time, but harder for 1st-time users who are still learning
  • Also problematic for people who, much later on, get bored and try to tweak something
  • ΔF = 0 for the free energy of a solution-to-solution transformation, even from a bad starting point
  • Having students fail actually teaches them, so it may be useful
  • But this may dissuade broader community who don't want to deal with it
  • Even with checks in the software, what happens when a user sets a parameter to a physically wrong value that the software nonetheless accepts?
    • There is a very subtle line between how much checking to do and when the user should simply have a feel for it
    • Maybe setting an "Expert" flag?
  • Do we want the broader community to fiddle with fine parameters like barostats?
  • Having the idea of default settings, as in docking (or TurboTax/TT)
    • TT: you feed in a number, and it tells you what most users with this number do.
    • A similar prompting system may be helpful for people
      • Some kind of default progression path
      • Maybe tiers of interfaces, like Phenix for crystallography
  • Tooltips will be useful in the setup

Equilibration and correlated samples

  • Running longer: you may be biased toward stopping if you see the estimate you expect (see the sketch below)
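One way to reduce that bias is to let the data pick the stopping point instead of the analyst: estimate the statistical inefficiency (integrated autocorrelation) and report the number of effectively independent samples. A simplified numpy sketch of the standard estimate (tools such as pymbar automate this more carefully):

```python
import numpy as np

def statistical_inefficiency(x) -> float:
    """g = 1 + 2 * sum of the normalized autocorrelation function,
    truncated at its first non-positive value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    g = 1.0
    for t in range(1, n // 2):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return g

# n_eff = n / g is the number of effectively independent samples.
# If n_eff is small, an estimate that "looks converged" is an illusion.
```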

Things that annoy us when people don't do them

  • Not varying initial conditions
    • Bound vs unbound starting structure
  • Different binding mode
  • Reverse convergence of ΔG with time (see the sketch after this list)
  • Energy/volume drift, large energy fluctuations
    • Using the right method (e.g., you can't use the Zwanzig formula if the work distribution is too broad)
  • RMSD of protein and ligand
    • Is some RMSD value bad? Is reaching a plateau okay? We don't know.
  • Were motions we expected to sample actually sampled (e.g. loops)
  • Replica exchange convergence
    • If individual replicas (each of which should visit all the states) converge to the same answer as all the data pooled together, the calculation has converged
  • Time autocorrelation of the property we are interested in.
    • Without it, we have no idea of our statistical hygiene
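A sketch combining two items from this list, reverse convergence of ΔG over time and the width check on the work distribution for the Zwanzig formula, with invented work values in units of kT:

```python
import numpy as np

def zwanzig_dg(work_kt) -> float:
    """Zwanzig / exponential-averaging estimate of DeltaG in kT:
    DeltaG = -ln <exp(-W)>, computed via log-sum-exp for stability."""
    w = np.asarray(work_kt, dtype=float)
    return -(np.logaddexp.reduce(-w) - np.log(len(w)))

def reverse_convergence(work_kt, n_blocks: int = 10) -> np.ndarray:
    """DeltaG from growing fractions of the data, starting from the END
    of the run. A flat curve suggests the estimate has converged."""
    w = np.asarray(work_kt, dtype=float)
    n = len(w)
    return np.array([zwanzig_dg(w[n - (k * n) // n_blocks:])
                     for k in range(1, n_blocks + 1)])

work = np.random.normal(3.0, 1.0, size=2000)  # placeholder work values (kT)
if work.std() > 2.0:  # rule of thumb: broad distributions break Zwanzig
    print("warning: work distribution too broad for exponential averaging")
print(reverse_convergence(work))
```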

We need a common place to converge on these checks

We can also check our convergence against experiment if we have the resources for it. We need best practices that we all check, and things we need to check for our users.

Secondary Notes

Are we getting the right answers for the right reasons?

  • do we get the same answer from different programs? different people? different protocols?
  • deviation from X-ray structures may indicate problems? how to quantify?
  • evaluate crystal quality beforehand, especially ligand density; automated?
  • insufficient number of counterions; poor counterion sampling/decorrelation
  • what parts of active site move for ligands in a congeneric series? if a part of the protein we hadn’t seen move before changes conformation, that is potentially a red flag
  • infrequent events are bad for convergence; how can we detect them? number of rotamer transitions? MSMs? correlations with ∂U/∂λ?
  • detecting ergodicity problems with simulations we’ve run vs. gross errors vs. setup errors
  • compare different community simulation pipelines for same input?
  • consistency in thermodynamic pathways / cycle closure / redundancy
  • replication of the same simulation, varying things we think don’t matter (e.g. velocities, initial binding modes, water placement, slightly different preps of same protein, bound vs unbound structure); check consistency
  • error bar estimates from multiple replicates of same simulation
  • “TurboTax” style warnings and sensible default recommendations
  • “Phenix”-style choices for modeling, tailoring options to user
  • tooltips are useful for users in setup
  • reverse convergence of ΔG with time
  • energy drift or large fluctuations; check energy conservation
  • using right method (e.g. can’t use Zwanzig formula if work distribution too large)
  • RMSD of protein and ligand
  • were motions we expected to sample actually sampled? (e.g. loops)
  • time autocorrelation of property you are interested in (simple statistical hygiene)
  • need at least N independent data points to have some idea of how things are behaving