2011-01-20 310 views

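MPI: error when executing MPI_Finalize()

This is the first time I have hit an error while executing MPI_Finalize(). I suspect communication is causing the problem, but I don't know what triggers it.

When I run the program on 1 processor it works fine, but on 2 or more processors I get a segmentation fault.

The error message is: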

[seismicmstm:32604] *** Process received signal *** 
[seismicmstm:32604] Signal: Segmentation fault (11) 
[seismicmstm:32604] Signal code: (128) 
[seismicmstm:32604] Failing at address: (nil) 
[seismicmstm:32604] [ 0] /lib64/libpthread.so.0 [0x311c60eb10] 
[seismicmstm:32604] [ 1] /usr/local/openmpi-1.4.2/lib/libopen-pal.so.0(opal_memory_ptmalloc2_int_malloc+0x2f4) [0x2b6955551794]
[seismicmstm:32604] [ 2] /usr/local/openmpi-1.4.2/lib/libopen-pal.so.0 [0x2b6955553543]
[seismicmstm:32604] [ 3] /lib64/libc.so.6(__libc_calloc+0x330) [0x311ba74bc0] 
[seismicmstm:32604] [ 4] /lib64/ld-linux-x86-64.so.2 [0x311b609d65] 
[seismicmstm:32604] [ 5] /lib64/ld-linux-x86-64.so.2 [0x311b605a9c] 
[seismicmstm:32604] [ 6] /lib64/ld-linux-x86-64.so.2 [0x311b6076e1] 
[seismicmstm:32604] [ 7] /lib64/ld-linux-x86-64.so.2 [0x311b610bb6] 
[seismicmstm:32604] [ 8] /lib64/ld-linux-x86-64.so.2 [0x311b60ce06] 
[seismicmstm:32604] [ 9] /lib64/ld-linux-x86-64.so.2 [0x311b6105bc] 
[seismicmstm:32604] [10] /lib64/libc.so.6 [0x311bb08df0] 
[seismicmstm:32604] [11] /lib64/ld-linux-x86-64.so.2 [0x311b60ce06] 
[seismicmstm:32604] [12] /lib64/libc.so.6(__libc_dlopen_mode+0x47) [0x311bb08f57]
[seismicmstm:32604] [13] /lib64/libpthread.so.0 [0x311c60f1dc] 
[seismicmstm:32604] [14] /lib64/libpthread.so.0 [0x311c60f2f0] 
[seismicmstm:32604] [15] /lib64/libpthread.so.0(__pthread_unwind+0x40) [0x311c60d160]
[seismicmstm:32604] [16] /lib64/libpthread.so.0 [0x311c607985] 
[seismicmstm:32604] [17] /usr/local/openmpi-1.4.2/lib/openmpi/mca_btl_openib.so [0x2b695869d22b] 
[seismicmstm:32604] [18] /lib64/libpthread.so.0 [0x311c60673d] 
[seismicmstm:32604] [19] /lib64/libc.so.6(clone+0x6d) [0x311bad3f6d] 
[seismicmstm:32604] *** End of error message *** 
-------------------------------------------------------------------------- 
mpirun noticed that process rank 0 with PID 32604 on node seismicmstm.cluster exited on signal 11 (Segmentation fault). 
-------------------------------------------------------------------------- 

All my code does is scatter, gather, and broadcast data. Can anyone tell me how to debug this?


We really need to see your code. Can you guarantee that your collective communications are actually passing the data correctly? – chrisaycock 2011-01-20 21:41:49

Answer


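There are two possible causes:

1) Your MPI_Finalize is broken. Check that the MPI library works correctly by running the cpi example code included in the MPI distribution. If you don't have access to the distribution, you can download the tar file and extract the cpi code, or download any simple Hello World application from the web. I highly recommend http://www.citutor.org/. If the example code works, your MPI library is fine and your own code is at fault. Otherwise the library is not working properly; download the implementation of your choice and build a fresh copy.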

2) Your code never reaches MPI_Finalize and instead dies (segfaults) somewhere before it. Can you confirm that the segfault occurs inside MPI_Finalize and not earlier?
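One way to answer that last question is to print a flushed checkpoint line from every rank after each collective and immediately before MPI_Finalize. The sketch below is only an illustration in plain C: `format_checkpoint` and `checkpoint` are hypothetical helper names, not part of MPI, and the rank is assumed to come from `MPI_Comm_rank` in the real program.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: build a "[rank R] reached: LABEL" line in buf.
 * Returns the number of characters snprintf would have written. */
int format_checkpoint(char *buf, size_t n, int rank, const char *label)
{
    return snprintf(buf, n, "[rank %d] reached: %s", rank, label);
}

/* Hypothetical helper: print the checkpoint to stderr and flush right away,
 * so the line is not lost in a buffer if the process crashes afterwards. */
void checkpoint(int rank, const char *label)
{
    char line[128];
    format_checkpoint(line, sizeof line, rank, label);
    fprintf(stderr, "%s\n", line);
    fflush(stderr);
}

/* Assumed usage inside the MPI program (not compiled here):
 *
 *     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
 *     ... scatter ...   checkpoint(rank, "after scatter");
 *     ... gather ...    checkpoint(rank, "after gather");
 *     checkpoint(rank, "before MPI_Finalize");
 *     MPI_Finalize();
 */
```

If the crashing rank's last line is "before MPI_Finalize", the fault is at least being triggered during finalization. If the output stops at an earlier checkpoint, the problem lies in the communication code before it.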
