0
这是我用于重新提交mpi作业的作业脚本。我有最初为tcsh shell编写的脚本。我试图重写它为bash shell,我得到错误。请帮我纠正脚本。mpi在bash shell中重新提交脚本错误
##============================================================================
#!/bin/bash
#PBS -l mem=10GB
#PBS -l walltime=12:00:00
#PBS -l nodes=2:ppn=6
#PBS -v NJOBS,NJOB
if [ X$NJOBS == X ]; then
$ECHO "NJOBS (total number of jobs in sequence) is not set - defaulting to 1"
export NJOBS=1
fi
if [ X$NJOB == X ]; then
$ECHO "NJOB (current job number in sequence) is not set - defaulting to 1"
export NJOB=1
fi
#
# Quick termination of job sequence - look for a specific file
#
if [ -f STOP_SEQUENCE ] ; then
$ECHO "Terminating sequence at job number $NJOB"
exit 0
fi
#
# Pre-job file manipulation goes here ...
# =============================================================================
# INSERT CODE
# =============================================================================
module load openmpi/1.4.3
startnum= 0
x=1
i= $(($NJOB + $startnum - $x))
j= $(($i + $x))
$ECHO "This is job $i"
#$ECHO floobuks.$i.blah
#$ECHO flogwhilp.$j.txt
#===========================================================================
# actual execution code
#===========================================================================
# this is just a sample
echo "job $i is followed by $j"
#===========================================================================
RUN COMPLETE
#===========================================================================
#
# Check the exit status
#
errstat=$?
if [ $errstat -ne 0 ]; then
# A brief nap so PBS kills us in normal termination
# If execution line above exceeded some limit we want PBS
# to kill us hard
sleep 5
$ECHO "Job number $NJOB returned an error status $errstat - stopping job sequence."
exit $errstat
fi
#
# Are we in an incomplete job sequence - more jobs to run ?
#
if [ $NJOB -lt $NJOBS ]; then
#
# Now increment counter and submit the next job
#
NJOB=$(($NJOB+1))
$ECHO "Submitting job number $NJOB in sequence of $NJOBS jobs"
qsub recur2.bash
else
$ECHO "Finished last job in sequence of $NJOBS jobs"
fi
#==============================================================================
我收到以下错误,当我运行
qsub -v NJOBS=4 recur2.bash
ModuleCmd_Load.c(200):ERROR:105: Unable to locate a modulefile for 'openmpi/1.4.3'
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 115: 0: command not found
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 117: 0: command not found
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 118: 1: command not found
/home/nsubramanian/bin/gromacs_3.3.3/bin/grompp_mpi: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such\
file or directory
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 128: mpirun: command not found
我能找出错误的了openmpi,但其余的我不能。我不知道如何使它工作。
注意:请忽略行号,它与原始文件不同。
非常感谢。我在分配时看到了这个空间 – user1492449 2012-07-22 01:17:14