2012-03-15 1247 views
1
    --------------------------------------------------------------------------
    MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
    with errorcode 1.

    NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
    You may or may not see output from other processes, depending on
    exactly when Open MPI kills them.
    --------------------------------------------------------------------------
    --------------------------------------------------------------------------
    mpirun has exited due to process rank 2 with PID 19175 on
    node mosura15 exiting without calling "finalize". This may
    have caused other processes in the application to be
    terminated by signals sent by mpirun (as reported here).

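I am running a simulation. When I launch it with the MPI command, I get the error above. What is the reason behind it, and how can I solve the problem?

mpirun command error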

Answers

1

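It looks like the third instance of your program (rank 2) crashed without calling MPI_Finalize() to shut down, so mpirun shut down all the other copies of the program as well. Is something causing one particular node to crash, or is it a different node each time?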
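One way to answer that question is to pull the rank, error code, PID, and node name out of the output of each run and compare them. The following is a minimal Python sketch, assuming the output uses the same wording as the message in the question; `summarize_abort`, `ABORT_RE`, and `EXIT_RE` are illustrative names, not part of Open MPI.

```python
import re

# Patterns written against the wording of the message quoted in the question.
# Other Open MPI versions may phrase these lines differently (assumption).
ABORT_RE = re.compile(
    r"MPI_ABORT was invoked on rank (\d+) in communicator (\S+)\s+"
    r"with errorcode (-?\d+)"
)
EXIT_RE = re.compile(
    r"process rank (\d+) with PID (\d+) on\s+node (\S+) exiting"
)


def summarize_abort(log_text):
    """Return the rank, error code, PID, and node found in the abort output.

    Keys are only present if the corresponding line was found.
    """
    info = {}
    match = ABORT_RE.search(log_text)
    if match:
        info["rank"] = int(match.group(1))
        info["communicator"] = match.group(2)
        info["errorcode"] = int(match.group(3))
    match = EXIT_RE.search(log_text)
    if match:
        info["pid"] = int(match.group(2))
        info["node"] = match.group(3)
    return info


# The message from the question, used here as sample input.
SAMPLE = """
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.

mpirun has exited due to process rank 2 with PID 19175 on
node mosura15 exiting without calling "finalize". This may
"""

if __name__ == "__main__":
    print(summarize_abort(SAMPLE))
```

Running this over the saved output of several runs shows whether the failing rank and node stay the same or move around.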

+0

Yes, it is different every time. – Kabir 2012-03-15 05:06:38

3

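The message is quite clear: rank 2 called MPI_Abort(), which stops the whole program. You should be able to look through your code and find the error condition under which the program calls MPI_Abort().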
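Looking through the code for those calls can be partly automated. The following is a rough Python sketch, assuming the simulation is written in C, C++, or Fortran and its source tree is available locally; `find_abort_calls` and the extension list are hypothetical choices for illustration. It only lists the call sites; the condition guarding each call still has to be read by hand.

```python
import os
import re

# Matches MPI_Abort( in C/C++ and MPI_ABORT( in Fortran (case-insensitive).
CALL_RE = re.compile(r"\bMPI_Abort\s*\(", re.IGNORECASE)

# File extensions treated as source code; adjust for the project (assumption).
SOURCE_EXTS = {".c", ".h", ".cc", ".cpp", ".cxx", ".hpp", ".f", ".f90", ".f95"}


def find_abort_calls(root):
    """Return (path, line number, stripped line) for each MPI_Abort call."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            if ext not in SOURCE_EXTS:
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on odd encodings.
            with open(path, errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if CALL_RE.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the current directory and print each call site.
    for path, lineno, text in find_abort_calls("."):
        print(f"{path}:{lineno}: {text}")
```

Since the abort in the question was raised with error code 1, the call sites that pass 1 as the error code are the first ones worth checking.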