
EDIT #1: The fatal error in MPI_Gatherv is solved.

The line

MPI_Gatherv(buffer, rank, MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD); 

has to be changed to

MPI_Gatherv(buffer, receive_counts[rank], MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD); 

so that every rank sends exactly as many ints as the root expects from it. Thanks again for your help.


Original post:

My code is from the DeinoMPI examples.

When I run mpiexec -localonly 4 skusamGatherv.exe, everything is OK.

If I change the line

int receive_counts[4] = { 0, 1, 2, 3 };

to

int receive_counts[4] = { 0, 1, 2, 1 };

it still compiles, but when I run mpiexec -localonly 4 skusamGatherv.exe I get the error below.

I thought it would still work.

Thanks for your help.


I get this error:

Fatal error in MPI_Gatherv: Message truncated, error stack:
MPI_Gatherv(363)........................: MPI_Gatherv failed(sbuf=0012FF4C, scount=0, MPI_INT, rbuf=0012FF2C, rcnts=0012FEF0, displs=0012FED8, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncated; 12 bytes received but buffer size is 4
unable to read the cmd header on the pmi context, Error = -1
.
0. [0][0][0][0][0][0] , [0][0][0][0][0][0]
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)
1. [1][1][1][1][1][1] , [0][0][0][0][0][0]
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)
2. [2][2][2][2][2][2] , [0][0][0][0][0][0]
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)
3. [3][3][3][3][3][3] , [0][0][0][0][0][0]

job aborted:
rank: node: exit code[: error message]
0: jan-pc-nb: 1: Fatal error in MPI_Gatherv: Message truncated, error stack:
MPI_Gatherv(363)........................: MPI_Gatherv failed(sbuf=0012FF4C, scount=0, MPI_INT, rbuf=0012FF2C, rcnts=0012FEF0, displs=0012FED8, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncated; 12 bytes received but buffer size is 4
1: jan-pc-nb: 1
2: jan-pc-nb: 1
3: jan-pc-nb: 1
Press any key to continue . . .

My code:

#include "mpi.h" 
#include <stdio.h> 

int main(int argc, char *argv[]) 
{ 
    int buffer[6]; 
    int rank, size, i; 
    int receive_counts[4] = { 0, 1, 2, 3 }; 
    int receive_displacements[4] = { 0, 0, 1, 3 }; 

    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &size); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 
    if (size != 4) 
    { 
     if (rank == 0) 
     { 
      printf("Please run with 4 processes\n");fflush(stdout); 
     } 
     MPI_Finalize(); 
     return 0; 
    } 
    for (i=0; i<rank; i++) 
    { 
     buffer[i] = rank; 
    } 
    MPI_Gatherv(buffer, rank, MPI_INT, buffer, receive_counts, receive_displacements, MPI_INT, 0, MPI_COMM_WORLD); 
    if (rank == 0) 
    { 
     for (i=0; i<6; i++) 
     { 
      printf("[%d]", buffer[i]); 
     } 
     printf("\n"); 
     fflush(stdout); 
    } 
    MPI_Finalize(); 
    return 0; 
} 

Answer:

Take a step back and consider what MPI_Gatherv does: it is an MPI_Gather (to rank 0, in this case) in which each process may send a different amount of data.

In your example, rank 0 sends 0 ints, rank 1 sends 1 int, rank 2 sends 2 ints, and rank 3 sends 3 ints.

MPIDI_CH3_PktHandler_EagerShortSend(351): Message from rank 3 and tag 4 truncated; 12 bytes received but buffer size is 4 

It is buried in a lot of other output, but this says that rank 3 sent 3 ints (12 bytes), while rank 0 only left room for 1 int.

Look at the first three arguments of your gatherv: 'buffer, rank, MPI_INT'. Whatever you set receive_counts to, rank 3 will always send 3 ints.

Note that you can over-provision the receive buffer (you could make the last item in receive_counts 100, say), but by making receive_counts[3] smaller you told the MPI library to expect only 1 int, even though you sent 3.
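
To make that concrete, here is a minimal sketch of the same program (my reworking, not code from the original thread) in which each rank takes its send count directly from receive_counts[rank], so the senders and the root agree by construction, and separate send/receive buffers avoid any aliasing questions:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int send_buf[3];   /* the largest contribution is 3 ints */
    int recv_buf[6];   /* 0 + 1 + 2 + 3 = 6 ints land at the root */
    int rank, size, i;
    int receive_counts[4] = { 0, 1, 2, 3 };
    int receive_displacements[4] = { 0, 0, 1, 3 };

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size != 4)
    {
        if (rank == 0)
        {
            printf("Please run with 4 processes\n");
            fflush(stdout);
        }
        MPI_Finalize();
        return 0;
    }
    /* fill exactly as many elements as this rank will send */
    for (i = 0; i < receive_counts[rank]; i++)
    {
        send_buf[i] = rank;
    }
    /* the send count is receive_counts[rank], so each sender matches
       what the root expects from that rank by construction */
    MPI_Gatherv(send_buf, receive_counts[rank], MPI_INT,
                recv_buf, receive_counts, receive_displacements, MPI_INT,
                0, MPI_COMM_WORLD);
    if (rank == 0)
    {
        for (i = 0; i < 6; i++)
        {
            printf("[%d]", recv_buf[i]);
        }
        printf("\n");
        fflush(stdout);
    }
    MPI_Finalize();
    return 0;
}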


Thanks, that rank argument confused me; I mixed it up with the address the data is sent from and the size of the data being sent. –


Good, just to clarify: the tuple 'buffer, count, datatype' shows up all over MPI. That tuple describes the location and size of a region of memory. –
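
As a small illustration of that comment (a sketch added here, not from the original thread; run with at least 2 processes), the same 'buffer, count, datatype' triple describes the memory on both the sending and the receiving side:

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    int data[4] = { 10, 20, 30, 40 };

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 'data, 4, MPI_INT' says: start at data, take 4 elements,
       each interpreted as an MPI_INT - a location plus a size */
    if (rank == 0)
    {
        MPI_Send(data, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1)
    {
        MPI_Recv(data, 4, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received [%d][%d][%d][%d]\n",
               data[0], data[1], data[2], data[3]);
        fflush(stdout);
    }

    MPI_Finalize();
    return 0;
}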