Deprecated: please see new documentation site.

The default number of processes per node is 8 (ppn=8) on Queenbee and 4 (ppn=4) on the other LONI Dell clusters. A user can still run with ppn=N, where N is an integer between 1 and the cluster default, by generating a new machinefile with N entries per node.

The default machinefile $PBS_NODEFILE lists each allocated node once per default process slot, so every node appears 8 times on Queenbee and 4 times on the other clusters. A short script can extract the unique node names from this file, write each name N times to a new file, and pass that new machinefile to mpirun.
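
The approach described above can be sketched as a small shell function. This is a minimal sketch, assuming a POSIX shell with sort and awk available on the compute node; the function name make_machinefile and its two arguments are hypothetical and are not part of PBS or of any LONI tool.

```shell
# Hypothetical helper (not a PBS or LONI command):
# print each unique host from a nodefile N times.
# Usage: make_machinefile NODEFILE N > newmachinefile
make_machinefile() {
    nodefile=$1
    n=$2
    # sort -u collapses the repeated entries to one line per node;
    # awk then prints each node name n times, keeping a node's
    # entries next to each other.
    sort -u "$nodefile" | awk -v n="$n" '{ for (i = 0; i < n; i++) print }'
}
```

Inside a job script, it might be called as make_machinefile "$PBS_NODEFILE" 2 > newmachinefile, which should give the same result as the temporary-file steps in the example that follows.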

The following is a simple example to achieve ppn=2 on Queenbee. Please keep in mind that you will be charged for ppn=8 since only entire nodes are allocated to jobs.

 # Do the usual queue and allocation stuff.
 #PBS -q checkpt
 #PBS -A loni_gridadmin1
 # Specify the desired node count together with the cluster default ppn.
 # Do NOT request a reduced value such as ppn=1 here; only the cluster
 # default ppn value is accepted.
 #PBS -l nodes=16:ppn=8
 # Set the desired CPU time and wall-clock time (the two should be the same).
 #PBS -l cput=2:00:00
 #PBS -l walltime=2:00:00
 # Set the path of the standard output file, and merge stderr into it.
 #PBS -o /work/ou/mpi_openmp/run/output/myoutput2
 #PBS -j oe

 # Start the process of creating a new machinefile by making sure any
 # temporary files from a previous run are not present.

 rm -f tempfile tempfile2 newmachinefile

 # Generate a file with one entry per unique node from the file provided
 # by PBS, which has nodes*ppn entries.

 sort -u "$PBS_NODEFILE" > tempfile

 # Duplicate the content of tempfile N times - here just twice.

 cat tempfile >> tempfile2
 cat tempfile >> tempfile2

 # Sort the content of tempfile2 so that each node's entries are adjacent,
 # and store the result in the new machinefile.

 sort tempfile2 > newmachinefile

 # Set the number of processes to the number of entries in the new machinefile.

 export NPROCS=$(wc -l < newmachinefile)

 # Launch mpirun with the new machinefile and process count.

 mpirun -np $NPROCS -machinefile newmachinefile your_program
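
Before launching mpirun, it can be useful to confirm that the new machinefile really has N entries for every node. The following is a minimal sketch of such a check, assuming a POSIX shell with sort, uniq and awk; the function name check_machinefile is hypothetical and is not provided by PBS or MPI.

```shell
# Hypothetical check (not a PBS or MPI command): succeed only if the
# machinefile is non-empty and every host in it appears exactly n times.
# Usage: check_machinefile MACHINEFILE N
check_machinefile() {
    machinefile=$1
    n=$2
    # uniq -c prefixes each distinct host with its number of occurrences;
    # awk flags any count that differs from n and sets the exit status.
    sort "$machinefile" | uniq -c |
        awk -v n="$n" '$1 != n { bad = 1 } END { exit (bad || NR == 0) }'
}
```

In the example above, this could be used as check_machinefile newmachinefile 2 || exit 1 just before the mpirun line, so the job stops early if the file was not built as intended.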

Users may direct questions to
