From: Axel Kohlmeyer
Date: Thu Aug 18 2011 - 10:39:18 CDT
On Thu, Aug 18, 2011 at 2:25 AM, Norman Geist wrote:
> Hi experts,
> yesterday I observed some, let’s say, unfavorable behavior of namd cuda
> job spawning. While testing multinode gpu runs I found out that namd
> reads the +devices parameter from the beginning on every node, not per
> process: on each node, namd starts reading the devices string from the
> start, which makes it impossible to work with nodes that differ.

the fact that nodes on a cluster are identical is a common and
very valid assumption made by many parallel applications. support
for inhomogeneous machines would make things _much_ more
complicated for very little gain.

> Even if I have a nodelist like:
> host c35
> host c35
> host c35
> host c35
> host c33
> host c33
> host c33
> host c33
> and type a device string like +devices 1,1,1,1,0,0,0,0, it will try to use
> gpu:1 on all processes. Is there any way to influence this without
> hacking the namd source? There has to be a possibility for such things.
> Maybe I have two nodes, one with a quadro and a tesla and the other
> with only one tesla, and I only want to use the teslas. Or I just don’t want
> to use all gpus, because I want to run multiple jobs on one machine. I already
> have a script that gives the right gpu id for every node and generates such a
> device string. But for that to work, the PEs must determine which gpu to
> bind to by their real PE-ID. The current behavior works fine for one node,
> but not with multiple nodes. Using the PE-ID would be better and no big
> change to the current function.

you can try using nvidia-smi to set the GPU that you don't want to use
for namd to "compute disable" mode. never tried it with namd myself
(or needed to do it).
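a minimal sketch of the nvidia-smi suggestion, assuming that "compute disable" corresponds to nvidia-smi's PROHIBITED compute mode, set with the -i (device index) and -c (compute mode) options. the helper name compute_mode_cmd is made up for illustration, the sketch only builds the command lines and does not run them, and changing the compute mode typically needs administrator rights. whether namd then skips such a device is untested here.

```python
from typing import List


def compute_mode_cmd(gpu_index: int, mode: str = "PROHIBITED") -> List[str]:
    """Build (but do not run) an nvidia-smi call that sets one GPU's compute mode.

    compute_mode_cmd is a hypothetical helper, not part of namd or nvidia-smi.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    if mode not in ("DEFAULT", "PROHIBITED"):
        raise ValueError("this sketch only handles DEFAULT and PROHIBITED")
    # -i selects the device, -c sets its compute mode
    return ["nvidia-smi", "-i", str(gpu_index), "-c", mode]


if __name__ == "__main__":
    # hide GPU 0 from compute jobs, then the command that would undo it
    print(" ".join(compute_mode_cmd(0)))
    print(" ".join(compute_mode_cmd(0, "DEFAULT")))
```

the printed commands would have to be run on each node where a GPU should be hidden, and reverted with the DEFAULT variant afterwards.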
> Please tell me your view.
> Thanks
> Norman Geist
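the difference between the behavior described in the quoted text and the requested one can be sketched as follows. this is a toy model under stated assumptions, not namd source code: the function names are made up, and wrapping around the device list with a modulo is an assumption. assign_per_node restarts at the beginning of the device list on every host, while assign_per_pe indexes the list by the global PE number.

```python
from typing import Dict, List, Tuple


def build_device_string(nodelist: List[str], gpu_for_host: Dict[str, int]) -> str:
    """One device id per PE, in nodelist order (hypothetical helper)."""
    return ",".join(str(gpu_for_host[host]) for host in nodelist)


def assign_per_node(nodelist: List[str], devices: List[int]) -> List[Tuple[str, int]]:
    """Model of the described behavior: each host starts at devices[0] again."""
    local_rank: Dict[str, int] = {}
    result = []
    for host in nodelist:
        rank = local_rank.get(host, 0)
        result.append((host, devices[rank % len(devices)]))
        local_rank[host] = rank + 1
    return result


def assign_per_pe(nodelist: List[str], devices: List[int]) -> List[Tuple[str, int]]:
    """Model of the requested behavior: index the list by the global PE id."""
    return [(host, devices[pe % len(devices)]) for pe, host in enumerate(nodelist)]


if __name__ == "__main__":
    nodelist = ["c35"] * 4 + ["c33"] * 4
    device_string = build_device_string(nodelist, {"c35": 1, "c33": 0})
    devices = [int(d) for d in device_string.split(",")]
    print("+devices", device_string)
    print("per node:", assign_per_node(nodelist, devices))
    print("per PE:  ", assign_per_pe(nodelist, devices))
```

with the nodelist and device string from the quoted message, the per-node model puts every process on gpu 1, while the per-PE model gives gpu 1 to the c35 processes and gpu 0 to the c33 processes.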
--
Dr. Axel Kohlmeyer
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.