Using Your Warewulf Cluster
This exercise should be done while logged in as a normal user, not
as root.  The CentOS installation already helped you create one, but you
can also create a normal user account with the command "useradd
username" and then set the password with
"passwd username".
Part 1: Run NAMD
NAMD is a parallel molecular dynamics application developed in our
group.  It is the main application run on our clusters.
  -  Copy the files NAMD_2.6b1_Linux-i686.tar.gz (NAMD binary)
       and apoa1.tar.gz (sample NAMD simulation)
       from the workshop CD and untar them in your home directory with:
tar xzf apoa1.tar.gz
tar xzf NAMD_2.6b1_Linux-i686.tar.gz
 
 
-  cd NAMD_2.6b1_Linux-i686 
-  Use a text editor to create the file nodelist containing:
group main
  host node0000
  host node0001
  host node0002
 
 The nodelist file tells NAMD what nodes to run on.  When we run
      under the queueing system below we'll use a script to create this
      file.
-  Start NAMD on all three machines with:
./charmrun ++remote-shell /usr/bin/rsh ++nodelist nodelist +p3 ./namd2 ~/apoa1/apoa1.namd
 
 If you have problems, or want to see what's going on in the launch
          process, add ++verbose to the charmrun command line.
-  When NAMD reaches the line that says "TIMING 20 ..." kill it with
       Control-C and jot down the wallclock s/step number. 
-  Run NAMD again on two processors and on one processor (change
       +p3 above to +p2 or +p1) for 20 steps each and compare the
       performance.  Do three processors run three times as fast as
       one?  How close to three times? 
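The comparison in the last step comes down to two ratios: speedup (old time divided by new time) and efficiency (how much of the ideal speedup was achieved). The sketch below is one way to compute them from the wallclock s/step numbers you jotted down. The function names speedup and efficiency and the timings in the example are made up for illustration; they are not NAMD output.

```shell
#!/bin/sh
# speedup T_BASE T_PAR
#   Ratio of the baseline s/step to the parallel s/step.
speedup() {
    awk -v base="$1" -v par="$2" 'BEGIN { printf "%.2f\n", base / par }'
}

# efficiency T_BASE N_BASE T_PAR N_PAR
#   Percentage of the ideal speedup achieved when going from
#   N_BASE processors to N_PAR processors.
efficiency() {
    awk -v tb="$1" -v nb="$2" -v tp="$3" -v np="$4" \
        'BEGIN { printf "%.0f%%\n", 100 * (tb * nb) / (tp * np) }'
}

# Hypothetical timings: 1.5 s/step on 2 processors, 1.1 s/step on 3.
speedup 1.5 1.1
efficiency 1.5 2 1.1 3
```

An efficiency near 100% means the extra processor is being used almost perfectly; lower values show the cost of communication between nodes.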
 Note: rsh is disabled on the master node by default for
       security reasons, otherwise we could use it as a fourth
       processor.  Tachyon (used next) works with all four simply
       because it does not depend on the use of rsh for
       communication.
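The nodelist file created by hand in this part has a fixed shape: a "group main" line followed by one "host" line per node. As a minimal sketch, a small shell function can print that shape for any list of hosts. The name make_nodelist is hypothetical; it is not part of NAMD or charmrun.

```shell
#!/bin/sh
# make_nodelist HOST...
#   Print a charmrun nodelist (as shown in this exercise) for the
#   given hosts: a "group main" header, then one "host" line each.
make_nodelist() {
    echo "group main"
    for host in "$@"; do
        echo "  host $host"
    done
}

# Print the same three-node list used in this exercise.
make_nodelist node0000 node0001 node0002
```

Redirecting the output (make_nodelist node0000 node0001 > nodelist) gives a file suitable for a two-node run.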
 Part 2: Compile and Run Tachyon
Tachyon is a parallel ray tracer developed by John Stone for his
master's thesis.  It is an example of a typical MPI application.
  -  Copy the file tachyon-0.97.tar.gz (Tachyon source and examples)
       from the workshop CD and untar it in your home directory with:
tar xzf tachyon-0.97.tar.gz
 
 
-  cd tachyon/unix
-  Use a text editor to open the file Make-arch 
-  Search for the config options for "linux-lam:" 
-  Copy this set of options to a new entry. 
-  Change (in the new entry) linux-lam to linux-mpich 
-  Change "CC = hcc" to "CC = gcc" 
-  Change -I$(LAMHOME)/h to -I/opt/mpich/include 
-  Change -L$(LAMHOME)/lib to -L/opt/mpich/lib 
-  Change -lmpi to -lmpich 
-  Save, quit the editor and run "make linux-mpich"
       to build tachyon.  If this doesn't work you probably missed
       one of the edits above, or applied them in the wrong place.
       The tachyon binary will end up in compile/linux-mpich/. 
-  cd (back to your home directory) 
-  Use a text editor to create the file machines containing (replace
       hostname with the name of the master node):
hostname
node0000
node0001
node0002
 
 
-  Run Tachyon on all four machines (the master plus the three
       compute nodes) with:
/opt/mpich/bin/mpirun -v -np 4 -machinefile machines \
  tachyon/compile/linux-mpich/tachyon +V tachyon/scenes/balls.dat
 
 
-  Look at the timing output, which is broken into different
       stages of the calculation.  Run on one, two, and three processors
       (change -np 4 to the number of processors) and calculate
       speedups for the different stages as well as the total time.
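The Make-arch edits listed in the steps above are plain text substitutions, so they can also be written as a sed filter. The sketch below is only an illustration: lam_to_mpich and sample_stanza are hypothetical names, and the sample stanza is made up to exercise the substitutions. It is not the real linux-lam entry from Make-arch, so check the result against your own file.

```shell
#!/bin/sh
# lam_to_mpich
#   Read a copied linux-lam stanza on stdin and apply the five edits
#   from this exercise. Assumes the stanza has not been edited yet
#   (running it twice would turn -lmpich into -lmpichch).
lam_to_mpich() {
    sed -e 's/linux-lam/linux-mpich/g' \
        -e 's/CC = hcc/CC = gcc/' \
        -e 's|-I\$(LAMHOME)/h|-I/opt/mpich/include|' \
        -e 's|-L\$(LAMHOME)/lib|-L/opt/mpich/lib|' \
        -e 's/-lmpi/-lmpich/'
}

# Made-up stanza for demonstration only; the real entry will differ.
sample_stanza() {
    cat <<'EOF'
linux-lam:
    "ARCH = linux-lam"
    "CC = hcc"
    "CFLAGS = -I$(LAMHOME)/h"
    "LIBS = -L$(LAMHOME)/lib -lmpi -lm"
EOF
}

sample_stanza | lam_to_mpich
```

Editing by hand as described above is just as good; the filter is mainly useful for checking that no substitution was missed.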
       
Part 3: Run Under Grid Engine
Sun Grid Engine (SGE) is a free, open source, general purpose,
cross platform queueing system.  In the genealogy of queueing systems,
it is a descendant of the free DQS package, which was commercialized
by a German company that was recently bought by Sun.
  -  Run "qstat -f" to see the queues that were automatically
       created.  There should be one queue for each compute node.
       The states column at far right is used for error flags. 
-  Use a text editor to create the file tachyon.job containing:
#$ -cwd
#$ -j y
/opt/mpich/bin/mpirun -v -np $NSLOTS -machinefile $TMPDIR/machines \
  tachyon/compile/linux-mpich/tachyon +V tachyon/scenes/balls.dat
 
 Notice the similarity to the command for running Tachyon
       manually.  SGE will create a temporary working directory
       containing a machines file (list of nodes to run on) and set the
       NSLOTS and TMPDIR environment variables automatically.  The
       options preceded by #$ are parsed by SGE as if they were
       specified on the command line.  -cwd causes the job to execute in
       the current working directory.  -j y merges standard error and
       output into a single file.
-  Submit the job to run on three processors under the mpich
       parallel environment with the command "qsub -pe mpich 3
       tachyon.job".  Note that there is no queue for the master
       node, so we can't use 4 nodes. 
-  Use "qstat -f" to check on the job until it is scheduled,
       then look for output files named tachyon.job.oX and
       tachyon.job.poX, where X is the job number output by qsub.  View
       these files to see the output. 
-  Submit several jobs requesting 1, 2, and 3 processors in random
       order so that a backlog develops.  (You can use the same
       tachyon.job file for all of them; just use the up arrow, possibly
       edit the processor request, and hit return to submit jobs
       quickly.)  Use qstat to monitor how the jobs are executed.  The
       default scheduling policy is to take the earliest-submitted job
       that can be run, i.e., for which enough processors are available,
       and the scheduler runs at regular intervals. 
-  Use a text editor to create the file namd.job containing:
#$ -cwd
#$ -j y
nodefile=$TMPDIR/namd2.nodelist
echo group main > $nodefile
awk '{ for (i=0;i<$2;++i) {print "host",$1} }' $PE_HOSTFILE >> $nodefile
dir=$HOME/NAMD_2.6b1_Linux-i686
$dir/charmrun ++remote-shell /usr/bin/rsh ++nodelist $nodefile +p$NSLOTS $dir/namd2 ~/apoa1/apoa1.namd
 Since NAMD does not use MPICH, we need a small shell script
and awk program to translate the SGE hostfile to charmrun format.
The second column of the hostfile is the number of processors available,
which is always one for these clusters, but this script will handle more.
-  Submit the job with the command "qsub -pe make 3 namd.job".
       Note that we are pretending to use the make parallel
       environment, but we do not use any of the special files it sets
       up.
-  Use qstat to monitor the job until it starts running, then use
       "tail -f namd.job.oX" (X is the job number) to watch the
       job output. 
-  When you get tired of this, Control-C out of tail and use
       "qdel X" (X is the job number) to kill the job.  Use qstat
       to monitor the job until it is killed. 
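The awk one-liner in namd.job expands each line of the SGE hostfile into one "host" entry per slot. To see what it produces without submitting a job, the same logic can be run on a hand-made hostfile, as in the sketch below. The function name hostfile_to_nodelist is hypothetical, and the sample hostfile is invented: only its first two columns (host name and slot count) matter here, and the other columns are placeholders.

```shell
#!/bin/sh
# hostfile_to_nodelist FILE
#   Print a charmrun nodelist built from an SGE-style hostfile:
#   column 1 is the host name, column 2 the number of slots.
hostfile_to_nodelist() {
    echo "group main"
    awk '{ for (i = 0; i < $2; ++i) print "host", $1 }' "$1"
}

# Invented hostfile: two slots on node0000, one on node0001.
demo_hostfile=$(mktemp)
printf '%s\n' \
    'node0000 2 placeholder placeholder' \
    'node0001 1 placeholder placeholder' > "$demo_hostfile"

hostfile_to_nodelist "$demo_hostfile"
rm -f "$demo_hostfile"
```

A host with two slots appears twice in the output, which is how the script would handle nodes with more than one processor.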
Part 4: There Is No Part 4
 Compiling a program and running it under a queueing system is likely
    all you will ever do on your cluster.  We've done a typical
    application (Tachyon) and a not-so-typical one (NAMD).  At this
    point you might want to rsh to a compute node to see what that
    environment is like, to try the pretty graphical cluster tools, or
    go see how the Clustermatic folks are doing.  If you're really
    ambitious, download your own code and see if it compiles and runs.
    
See Also
 Warewulf web site (http://www.warewulf.org/) 
 Grid Engine web site (http://gridengine.sunsource.net/) 
 NAMD web site (http://www.ks.uiuc.edu/Research/namd/) 
 Tachyon web site (http://jedi.ks.uiuc.edu/~johns/raytracer/)