From: Leandro Martínez
Date: Fri Aug 25 2006 - 07:42:28 CDT

Hi all,
I'm running a simulation with NAMD_2.6b2_Linux-amd64-TCP on
a cluster of nine Athlon64 nodes (each processor is dual-core,
so there are effectively 18 processors). I'm having some
strange problems with simulations that I have already run on
several other machines, and I have not been able to find a
solution. The simulation starts normally, but eventually it either
stops without printing any error message or apparently continues
running on only one processor. The only message that differs
from our previous runs is this one:

Info: Adjusted background load on 11 nodes.

It is printed the first time load balancing is performed. The
failure does not necessarily occur right after that message,
but it may be part of the problem, since the simulation was
set to run on 18 processors (9 nodes).

The only time I got an error message, it was the one below. Note
that it was printed after quite a long simulation time.
The error is not easily reproducible: it always happens eventually,
but not at the same point of the simulation each time.
Any help or ideas would be appreciated.

ENERGY: 644800 804.7671 2363.3700 1332.0255
-201929.9812 17508.6136 0.0000 0.0000 32575.8361
-147213.3846 297.3932 -147116.7637 -147117.4476 296.8970

Stack Traceback:
  [0] /lib64/ [0x360b32f7c0]
  [1] _ZN17ComputeHomeTuplesI8BondElem4bond9BondValueE10loadTuplesEv+0x4cc
  [2] _ZN17ComputeHomeTuplesI8BondElem4bond9BondValueE6doWorkEv+0x5c4
  [3] _ZN11WorkDistrib12enqueueBondsEP12LocalWorkMsg+0x16 [0x727b16]
  [5] CkDeliverMessageFree+0x21 [0x785aab]
  [6] _Z15_processHandlerPvP11CkCoreState+0x455 [0x7850b5]
  [7] CsdScheduleForever+0xa2 [0x7f1752]
  [8] CsdScheduler+0x1c [0x7f1350]
  [9] _Z10slave_initiPPc+0x10 [0x4bb034]
  [10] _ZN7BackEnd4initEiPPc+0x28f [0x4bb019]
  [11] main+0x47 [0x4b697f]
  [12] __libc_start_main+0xf4 [0x360b31d084]
  [13] _ZNSt8ios_base4InitD1Ev+0x42 [0x4b2c9a]

