From: Aron Broom (broomsday_at_gmail.com)
Date: Thu Jun 14 2012 - 13:26:03 CDT
I'm running multiple-walker metadynamics, and a few of the replicas crash
after a seemingly random period of time with the following error:
terminate called after throwing an instance of 'std::ios_base::failure'
what(): basic_filebuf::underflow error reading the file
/var/spool/torque/mom_priv/jobs/4378.mon240.monk.sharcnet.SC: line 3: 30588 Aborted (core dumped)
../../../../NAMD/NAMD_2.9_Linux-x86_64-multicore-CUDA/namd2 +p4 +idlepoll +mergegrids Galactose_Meta_Run.namd
I suspect the last two lines are not very informative, but I have included
them for completeness. I'm not certain, but I think the crash occurs when
replica A tries to read the hills from replica B while replica B is writing
new hills, or alternatively when two replicas try to read hills from
another replica at the same time. If that is the case, then I suppose that
increasing the time between updates, at the cost of some synchronization
between the replicas, might help. I would ideally like to avoid that,
though. Could this instead be a hardware- or operating-system-specific
problem?
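
For reference, here is a minimal sketch of where the time between updates
would be set, assuming the multiple-replica metadynamics keywords of the
collective variables module shipped with NAMD 2.9. The bias name, colvar
name, replica ID, registry path, and all numeric values are hypothetical
placeholders and are not taken from the actual run:

```
# Sketch only: names, path, and values are assumed placeholders.
metadynamics {
    name                    meta_galactose       # hypothetical bias name
    colvars                 galactose_dist       # hypothetical colvar, defined elsewhere
    hillWeight              0.1                  # placeholder value
    hillWidth               1.0                  # placeholder value
    newHillFrequency        1000                 # steps between new hills

    multipleReplicas        on
    replicaID               walker_A             # must differ for each replica
    replicasRegistry        /path/to/shared/replicas.registry.txt
    replicaUpdateFrequency  10000                # steps between reads of other replicas' hills
}
```

Raising replicaUpdateFrequency relative to newHillFrequency makes each
replica read the others' hills less often. If the crash really is caused by
a read colliding with a write, this should only make it rarer, not rule it
out.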
Thanks,
~Aron
--
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:21:39 CST