2016-06-30

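I am learning how to use OpenMPI with Fortran. Following the OpenMPI documentation, I tried to write a simple client/server program. However, when I run it, the client crashes with the following error ("ORTE_ERROR_LOG: Not found in file dpm_orte.c at line 167"):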

[Laptop:13402] [[54220,1],0] ORTE_ERROR_LOG: Not found in file dpm_orte.c at line 167 
[Laptop:13402] *** An error occurred in MPI_Comm_connect 
[Laptop:13402] *** reported by process [3553361921,0] 
[Laptop:13402] *** on communicator MPI_COMM_WORLD 
[Laptop:13402] *** MPI_ERR_INTERN: internal error 
[Laptop:13402] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, 
[Laptop:13402] *** and potentially your MPI job) 
------------------------------------------------------- 
Primary job terminated normally, but 1 process returned 
a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- 
-------------------------------------------------------------------------- 
mpiexec detected that one or more processes exited with non-zero status, thus causing 
the job to be terminated. The first process to do so was: 
    Process name: [[54220,1],0] 
    Exit code: 17 
-------------------------------------------------------------------------- 

The code for the server and the client is shown below:

server.f90

program name 
use mpi 
implicit none 

    ! type declaration statements 
    INTEGER :: ierr, size, newcomm, loop, buf(255), status(MPI_STATUS_SIZE) 
    CHARACTER(MPI_MAX_PORT_NAME) :: port_name 

    ! executable statements 
    call MPI_Init(ierr) 
    call MPI_Comm_size(MPI_COMM_WORLD, size, ierr) 
    call MPI_Open_port(MPI_INFO_NULL, port_name, ierr) 
    print *, "Port name is: ", port_name 

    do while (.true.) 
     call MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

     loop = 1 
     do while (loop .eq. 1) 
      call MPI_Recv(buf, 255, MPI_INTEGER, MPI_ANY_SOURCE, MPI_ANY_TAG, newcomm, status, ierr) 
      print *, "Looping the loop." 
      loop = 0 

     enddo 

     call MPI_Comm_free(newcomm, ierr) 
     call MPI_Close_port(port_name, ierr) 
     call MPI_Finalize(ierr)  

    enddo 

end program name 

client.f90

program name 
use mpi 
implicit none 

    ! type declaration statements 
    INTEGER :: ierr, buf(255), tag, newcomm 
    CHARACTER(MPI_MAX_PORT_NAME) :: port_name 
    LOGICAL :: done 

    ! executable statements 
    call MPI_Init(ierr) 
    print *, "Please provide me with the port name: " 
    read(*,*) port_name 

    call MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

    done = .false. 
    do while (.not. done) 
     tag = 0 
     call MPI_Send(buf, 255, MPI_INTEGER, 0, tag, newcomm, ierr) 
     done = .true. 
    enddo 

    call MPI_Send(buf, 0, MPI_INTEGER, 0, 1, newcomm, ierr) 
    call MPI_Comm_Disconnect(newcomm, ierr) 
    call MPI_Finalize(ierr) 

end program name 

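I compile with mpif90 server.f90 -o server.out and mpif90 client.f90 -o client.out, and run the programs with mpiexec -np 1 server.out and mpiexec -np 1 client.out. The error occurs when I give the client the port name (that is, when I press Enter after the read).

which dpm_orte.c reports dpm_orte.c not found

I am running Linux, with openmpi 1.10.3-1 installed from the Arch extra repository.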


@d_1999, I moved MPI_Finalize() and tried again, but the problem persists. – GLaDER

Answer


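This is an ordinary Fortran input-processing error and has nothing to do with MPI (apart from Open MPI printing a completely incomprehensible error message). Just insert a line in client.f90 that prints the value of port_name right after it is read: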

print *, "Please provide me with the port name: " 
read(*,*) port_name 
print *, port_name 

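With an actual port name such as 2527592448.0;tcp://10.0.1.6,10.0.1.2,192.168.122.1,10.10.11.10:55837+2527592449.0;tcp://10.0.1.6,10.0.1.4,192.168.122.1,10.10.11.10::300, the output will be 2527592448.0. List-directed input treats ; as a separator and stops reading there, so the port address passed to MPI_COMM_CONNECT is incomplete.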
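The truncation can be reproduced without MPI. The following is a minimal sketch, not part of the original programs: it reads the same string twice from an internal file (standing in for standard input), once list-directed and once with the A edit descriptor. The program name, the variable names, and the shortened port name are all hypothetical and chosen only for illustration. Where exactly the list-directed read stops may vary between compilers.

```fortran
program port_name_read_demo
    implicit none

    ! Hypothetical, shortened port name used only for illustration.
    character(len=128) :: line
    character(len=128) :: list_value, whole_line

    line = "2527592448.0;tcp://10.0.1.6:55837"

    ! Internal-file reads stand in for reading from standard input.
    read(line, *) list_value        ! list-directed: stops at a value separator
    read(line, '(A)') whole_line    ! A edit descriptor: takes the whole record

    print '(2A)', "list-directed: ", trim(list_value)
    print '(2A)', "A format:      ", trim(whole_line)
end program port_name_read_demo
```

Built with any Fortran compiler (for example gfortran), the first line of output should show only a leading part of the string, while the second shows all of it.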

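The solution is to replace read(*,*) port_name with

read(*,'(A)') port_name 

Also, the loop in the server is badly written. You cannot call MPI_FINALIZE more than once. Closing the port inside the loop is also a bad idea, because you call MPI_COMM_ACCEPT again immediately afterwards. The correct loop would be: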

! executable statements 
call MPI_Init(ierr) 
call MPI_Comm_size(MPI_COMM_WORLD, size, ierr) 
call MPI_Open_port(MPI_INFO_NULL, port_name, ierr) 
print *, "Port name is: ", port_name 

do while (.true.) 
    call MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, newcomm, ierr) 

    loop = 1 
    do while (loop .eq. 1) 
     call MPI_Recv(buf, 255, MPI_INTEGER, MPI_ANY_SOURCE, MPI_ANY_TAG, newcomm, status, ierr) 
     print *, "Looping the loop." 
     loop = 0 
    enddo 

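    ! MPI_Comm_disconnect deallocates newcomm and sets it to MPI_COMM_NULL, 
    ! so no MPI_Comm_free call is needed (or allowed) afterwards. 
    call MPI_Comm_disconnect(newcomm, ierr) 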
enddo 

call MPI_Close_port(port_name, ierr) 
call MPI_Finalize(ierr) 

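Thank you very much for the clarification about read. As for the loop, I know it is bad, but I was stuck on another error and just did not want to do more. Thanks again! – GLaDER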