
Linux Clusters with InfiniBand or Other High-Performance Networks

Charm++ provides a special verbs network layer that uses InfiniBand networks directly through the OpenFabrics OFED ibverbs library. This avoids efficiency and portability issues associated with MPI. This newer Charm++ verbs network layer replaces the earlier ibverbs layer, providing equivalent performance while also supporting multi-copy algorithms (replicas).
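
Outside of a batch system, a verbs binary is launched with charmrun and a list of hosts. A minimal sketch, assuming placeholder hostnames in a nodelist file (nodelist files are described in the following section):

  # nodelist file contents (hostnames are placeholders):
  #   group main
  #   host node001
  #   host node002
  charmrun +p16 ++nodelist nodelist namd3 <configfile>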

Intel Omni-Path networks are incompatible with the pre-built verbs NAMD binaries. Charm++ for verbs can be built with --with-qlogic to support Omni-Path, but in this case the Charm++ MPI network layer performs better than the verbs layer. Hangs have been observed with Intel MPI but not with OpenMPI, so OpenMPI is preferred. See ``Compiling NAMD'' below for MPI build instructions. NAMD MPI binaries may be launched directly with mpiexec rather than via the provided charmrun script.
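
For example, an MPI binary is launched with the system's own tools; a minimal sketch, with an illustrative process count:

  mpiexec -n 64 namd3 <configfile>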

Writing batch job scripts to run charmrun in a queueing system can be challenging. Since most clusters provide directions for using mpiexec to launch MPI jobs, charmrun provides a ++mpiexec option that uses mpiexec to launch non-MPI binaries. If ``mpiexec -n <procs> ...'' is not sufficient to launch jobs on your cluster, you will need to write an executable mympiexec script like the following from TACC:

  #!/bin/csh
  # Discard the "-n <procs>" arguments that charmrun passes for
  # mpiexec, then hand the remaining command line to TACC's ibrun.
  shift; shift; exec ibrun $*

The job is then launched (with full paths where needed) as:

  charmrun +p<procs> ++mpiexec ++remote-shell mympiexec namd3 <configfile>
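
Putting these pieces together, a batch script might look like the following. This is only a sketch for a Slurm cluster; the directives, node and core counts, and paths are assumptions to adapt to your site:

  #!/bin/bash
  #SBATCH --nodes=4
  #SBATCH --ntasks-per-node=16
  #SBATCH --time=01:00:00
  # 4 nodes x 16 tasks per node = 64 processes
  charmrun +p64 ++mpiexec ++remote-shell mympiexec /path/to/namd3 <configfile>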

Charm++ now provides the option ++mpiexec-no-n for the common case where mpiexec does not accept ``-n <procs>'' and instead derives the number of processes to launch directly from the queueing system:

  charmrun +p<procs> ++mpiexec-no-n ++remote-shell ibrun namd3 <configfile>

For workstation clusters and other massively parallel machines with special high-performance networking, NAMD uses the system-provided MPI library (with a few exceptions), and standard system tools such as mpirun are used to launch jobs. Since MPI libraries are very often incompatible between versions, you will likely need to recompile NAMD and its underlying Charm++ libraries to run in parallel on these machines (the provided non-MPI binaries should still work for serial runs). The provided charmrun program for these platforms is only a script that attempts to translate charmrun options into mpirun options, but due to the diversity of MPI libraries it often fails to work.
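
As a sketch of the typical recompilation sequence (the architecture and compiler names here are assumptions; see ``Compiling NAMD'' for the authoritative steps):

  # Build Charm++ on its MPI network layer, then build NAMD against it:
  cd charm-<version>
  ./build charm++ mpi-linux-x86_64 --with-production
  cd ..
  ./config Linux-x86_64-g++ --charm-arch mpi-linux-x86_64
  cd Linux-x86_64-g++
  make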

