2012-07-21 66 views

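Errors in an MPI job resubmission script under the bash shell

This is the job script I use to resubmit MPI jobs. The script was originally written for the tcsh shell. I am trying to rewrite it for bash, and I get errors. Please help me correct the script.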

##============================================================================ 

#!/bin/bash                                                
#PBS -l mem=10GB                                 
#PBS -l walltime=12:00:00                              
#PBS -l nodes=2:ppn=6                                                            
#PBS -v NJOBS,NJOB 

if [ X$NJOBS == X ]; then 
    $ECHO "NJOBS (total number of jobs in sequence) is not set - defaulting to 1" 
    export NJOBS=1 
fi 

if [ X$NJOB == X ]; then 
    $ECHO "NJOB (current job number in sequence) is not set - defaulting to 1" 
    export NJOB=1 
fi 

#                                    
# Quick termination of job sequence - look for a specific file                     
#                                    
if [ -f STOP_SEQUENCE ] ; then 
    $ECHO "Terminating sequence at job number $NJOB" 
    exit 0 
fi 

#                                    
# Pre-job file manipulation goes here ...                          
# =============================================================================                                    
# INSERT CODE    
# ============================================================================= 

module load openmpi/1.4.3 

startnum= 0 
x=1 
i= $(($NJOB + $startnum - $x)) 
j= $(($i + $x)) 

$ECHO "This is job $i" 
#$ECHO floobuks.$i.blah                               
#$ECHO flogwhilp.$j.txt                               


#=========================================================================== 
# actual execution code                              
#===========================================================================     

# this is just a sample 
echo "job $i is followed by $j" 

#=========================================================================== 
RUN COMPLETE 
#=========================================================================== 

# 
# Check the exit status 
# 
errstat=$? 
if [ $errstat -ne 0 ]; then 
    # A brief nap so PBS kills us in normal termination 
    # If execution line above exceeded some limit we want PBS 
    # to kill us hard 
    sleep 5 
    $ECHO "Job number $NJOB returned an error status $errstat - stopping job sequence." 
    exit $errstat 
fi 

# 
# Are we in an incomplete job sequence - more jobs to run ? 
# 
if [ $NJOB -lt $NJOBS ]; then 


# 
# Now increment counter and submit the next job 
# 
    NJOB=$(($NJOB+1)) 
    $ECHO "Submitting job number $NJOB in sequence of $NJOBS jobs" 
    qsub recur2.bash 
else 
    $ECHO "Finished last job in sequence of $NJOBS jobs" 
fi 

#============================================================================== 

I get the following errors when I run

qsub -v NJOBS=4 recur2.bash 



ModuleCmd_Load.c(200):ERROR:105: Unable to locate a modulefile for 'openmpi/1.4.3' 
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 115: 0: command not found 
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 117: 0: command not found 
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 118: 1: command not found 
/home/nsubramanian/bin/gromacs_3.3.3/bin/grompp_mpi: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such\ 
file or directory 
/var/spool/PBS/mom_priv/jobs/1833549.epic.SC: line 128: mpirun: command not found 

I can work out the openmpi error, but not the rest. I don't know how to make the script work.

Note: please ignore the line numbers in the error messages; they differ from the original file.

Answers


There is no module named openmpi/1.4.3 on your system; and in these lines

startnum= 0 
i= $(($NJOB + $startnum - $x)) 
j= $(($i + $x)) 

there should be no space after the equals sign.
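A minimal sketch of the corrected assignments, using the variable names from the question. In bash, `startnum= 0` sets `startnum` to an empty string only for the duration of a command named `0`, which is why the log reports `0: command not found`. The `${NJOB:-1}` default is an assumption added here so the snippet runs outside PBS; plain `echo` stands in for `$ECHO`, which is never assigned in the script as posted.

```shell
#!/bin/bash
# NJOB normally arrives via "qsub -v"; default to 1 (assumption) for standalone runs.
NJOB=${NJOB:-1}

# No whitespace on either side of "=" in a bash assignment.
startnum=0
x=1

# Inside $(( )) variable names can be written without the leading "$".
i=$((NJOB + startnum - x))
j=$((i + x))

echo "job $i is followed by $j"
```

With `NJOB` at its default of 1 this gives `i=0` and `j=1`.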

To find problems like this, just try running the script line by line in a bash shell.
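One way to do that without submitting a job is to run the suspect lines under `bash -x`, which prints each command as bash executes it. The sketch below is only an illustration: it writes the faulty assignment into a throwaway file created with `mktemp` (the file and its contents are hypothetical) and captures the trace.

```shell
#!/bin/bash
# Write the faulty line plus a check into a temporary script.
tmp=$(mktemp)
printf 'startnum= 0\necho "startnum is [$startnum]"\n' > "$tmp"

# -x traces every command; stderr is merged so the error is captured too.
out=$(bash -x "$tmp" 2>&1)
echo "$out"

rm -f "$tmp"
```

The trace shows bash trying to run `0` as a command, and the final line shows that `startnum` is left empty afterwards.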


Thank you very much. I see the spaces in the assignments now. – user1492449 2012-07-22 01:17:14