Re: rigidBonds water and langevinHydrogen off (no problem!)

From: Aron Broom (broomsday_at_gmail.com)
Date: Thu Mar 01 2012 - 13:27:59 CST

Dear JC,

Thanks for the reply!

Actually, I did only count the oxygens when calculating the KE for the
water, this is what I meant by "determining the temperature of the
non-rigid parts of the system" in my original message, but I understand
that it's not worded in an extremely clear manner. Same thing when looking
at the rest of the system whenever I was using SHAKE, I ignored all rigid
hydrogens. But you are absolutely correct that counting those without
accounting for the reduced degrees of freedom would be bad. So that part
isn't an issue.

Also, I didn't derive a conversion factor from nothing, I converted the
speeds which are in angstroms/ps to m/s, and the mass from amu to kg and
then used a boltzmann constant of 1.380e-23 J/K in the fitting, so there
was nothing fishy about that, I wasn't just making it work. Perhaps the
units have not changed, I only suggested that for others because in the
tutorial provided the values you would get from following it line-by-line
don't work with the boltzmann constant mentioned in the tutorial.

In terms of the statistical relevance, I completely agree that my 30,000
water oxygens are going to make a good fit to the boltzmann distribution,
whereas 1000-2000 (depending on SHAKE) protein atoms are not going to be as
great, and looking at the fits confirmed this. I did, however measure
those temperatures at 6 separate time points across 60 ns of simulation,
and the standard deviation on the protein was only +/- 5 K, so its being
~20K hotter is statistically meaningful (the standard error in this case is
less than 5K). I should have put the actual error in my original message
to make this more clear.

In your test how large was the system, that is, how many water molecules
and how large of a solute? Also, how long did you simulate for? I failed
to mention in my original message that the system had been pre-equilibrated
with for 20 ns before the 60 ns run, so I imagine it might take some time
to see these effects creep up (that is, the first timepoint that I measured
at was 30 ns into the run effectively), and it might be dependent partially
on the size of the water box. Also, I was using a Langevin-Nose barostat,
although I suspect that wouldn't be a contributing factor, it's worth
pointing out.

I agree completely that one shouldn't state thing prematurely. I feel that
I did a fair amount of testing in this case, and confirmed that everything
I was doing was sensible (and certainly never derived random things to make
the answer fit), but it could easily still be the case that I've missed
something critical, or I had some other setting incorrectly that caused
this. And as I only did 4 ns with the different settings to confirm a
restoration of the proper distribution, that part is not strictly speaking
unassailable statistically.

~Aron

On Thu, Mar 1, 2012 at 1:49 PM, JC Gumbart <gumbart_at_ks.uiuc.edu> wrote:

> Dear Aron,
>
> The statistical mechanics of rigid waters is different from that of
> flexible waters. The energy in the Maxwell-Boltzmann distribution for a
> given water molecule is no longer sum(1/2*mi*vi^2) for all three atoms, but
> instead 1/2*m_total*v_oxygen^2 (similar for other rigid hydrogens and their
> parent atoms). The velocities reported for the hydrogens by NAMD, I
> believe, are meaningless. When you use this formulation, you recover the
> proper distribution and temperature for the system and its components. I
> did find a slight difference in the temperature of the solute and water
> (the solute was about 5% lower), but I don't believe it is significant.
> The water has a lot more particles, and therefore generates more converged
> statistical averages (the tutorial even states that agreement only within
> +/- 10 K is expected).
>
> Also note that the units have not changed in the velocity output by NAMD
> at any time. I suspect you only obtained the proper temperature for the
> full system because you empirically derived a conversion factor that made
> the data fit what you expected.
>
> You might also look at the code that NAMD itself uses to calculate
> temperature, based on the equipartition theorem, noting in particular that
> it removes constrained degrees of freedom.
>
> It is certainly worthwhile to verify for yourself the underpinnings of an
> MD code, but be careful about stating there is something wrong prematurely!
>
> JC
>
> On Feb 28, 2012, at 9:58 PM, Aron Broom wrote:
>
> Dear NAMD users,
>
> I've looked around on the mailing list and couldn't find a clear answer to
> as to whether or not it is supposed to be ok to use rigidBonds water (say
> with TIP3P) but then leave the solute (say a protein) flexible with a 1 fs
> timestep and langevinHydrogen off in order to save on calculating
> collisions with the large number of rigid hydrogens in the sample. It
> might seem like one should not use langevinHydrogen off as long as there
> are any flexible hydrogens, but several sources, including an NAMD tutorial
> (
> http://www.ks.uiuc.edu/Training/Tutorials/science/channel/channel-tutorial-files/ABF/win1/win1A.conf)
> have exactly this setup.
>
> Well, I believe I have confirmed that you should NOT do this. In looking
> at a 60ns simulation of ~2000 protein atoms in an ~100,000 water atom box,
> and determining the temperature of the non-rigid parts of the system by
> calculating their kinetic energies (KE=0.5mv^2), binning them, and then
> fitting to the boltzmann distribution ( y = (2/sqrt(Pi * (KbT)^3 )) *
> sqrt(x) * exp(-x/KbT) ), I find that the temperature of the whole system is
> fine (set to 298 K, and comes out on average at 297 K), but the temperature
> of the protein alone is ~20-25 K higher than the surrounding solvent
> throughout the simulation.
>
> This is a big problem, especially for studying ligand binding, and I
> imagine for anything having a 20-25 K difference across the solvent-solute
> interface is bad, not to mention just being incorrect.
>
> I tested running an extra 4ns from the end of my 60 ns, with two changes
> in NAMD parameters:
>
> 1) just turn langevinHydrogen on, which comes with a 5% performance
> penalty, and
>
> 2) leave langevinHydrogen off, but change rigidBonds water to rigidBonds
> all, and change from a 1 fs timestep to a 2 fs timestep, in which you gain
> ~30% performance, but some people are sceptical of the results.
>
> Both of these changes alone are capable of bringing the system back to a
> reasonable configuration in which the protein and solvent have
> approximately the same temperatures (the protein was still slightly hotter
> (~5 K, but perhaps a gamma of 1/ps is not fast enough to see complete
> equilibrium in 4ns).
>
> Has anyone else seen this and can confirm that the setup of using
> ridigBonds water and langevinHydrogen off are not to be used together when
> you have a solute that has hydrogens, despite this being somewhat common
> practice? If anyone wants to test the temperatures of various parts of
> their systems from existing *.vel files, I just followed the instructions
> on this page (
> http://www.ks.uiuc.edu/Training/Tutorials/namd/namd-tutorial-unix-html/node13.html),
> but note that the units in the .vel files must changed since this was
> written, so you'll have to convert amu to kg, and angstrom/ps to m/s, and
> then use the correct boltzmann constant for those units.
>
> ~Aron
>
>
> --
> Aron Broom M.Sc
> PhD Student
> Department of Chemistry
> University of Waterloo
>
>
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:21:16 CST