Here is an MPI Fortran test program:
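- program rrr
-   use mpi
-   implicit real*8(a-h,o-z)
-   integer :: world_size, world_rank, mpi_ierr
-   ! status must be an integer array, not an implicitly typed real*8 scalar
-   integer :: status(mpi_status_size)
-   call mpi_init(mpi_ierr)
-   call mpi_comm_size(mpi_comm_world, world_size, mpi_ierr)
-   call mpi_comm_rank(mpi_comm_world, world_rank, mpi_ierr)
-   ! rank 0 reads d from stdin and sends it to every other rank, using the destination rank as the tag
-   if(world_rank==0)then
-     write(*,*) "input d"
-     read(*,*) d
-     do i=1,world_size-1
-       ! mpi_double_precision is the Fortran datatype matching real*8; mpi_double is the C one
-       call mpi_send(d, 1, mpi_double_precision, i, i, mpi_comm_world, mpi_ierr)
-     end do
-   end if
-   ! the sends above finish before this barrier only if the library buffers them (usual for small messages)
-   call mpi_barrier(mpi_comm_world,mpi_ierr)
-   if(world_rank/=0)then
-     call mpi_recv(d, 1, mpi_double_precision, 0, world_rank, mpi_comm_world, status, mpi_ierr)
-   end if
-   write(*,*) "d=",d, "id=", world_rank
-   call mpi_finalize(mpi_ierr)
- end program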
It runs without any problems on my local CentOS 7 machine:
- [root@localhost:~/wbh/test/mpi#]mpiifort read.f90
- [root@localhost:~/wbh/test/mpi#]mpirun -np 4 ./a.out
- input d
- 4
- d= 4.00000000000000 id= 0
- d= 4.00000000000000 id= 1
- d= 4.00000000000000 id= 2
- d= 4.00000000000000 id= 3
But when I compile and run it on the cluster, it fails with an error:
- [wbh@cu10 mpi]$ mpiifort read.f90
- [wbh@cu10 mpi]$ mpirun -np 4 ./a.out
- input d
- input d
- forrtl: severe (24): end-of-file during read, unit -4, file /proc/12535/fd/0
- Image PC Routine Line Source
- a.out 0000000000426BF4 Unknown Unknown Unknown
- a.out 0000000000407AD0 Unknown Unknown Unknown
- a.out 0000000000402A27 Unknown Unknown Unknown
- a.out 000000000040291E Unknown Unknown Unknown
- libc.so.6 0000003FEA821B45 Unknown Unknown Unknown
- a.out 0000000000402819 Unknown Unknown Unknown
- -------------------------------------------------------
- Primary job terminated normally, but 1 process returned
- a non-zero exit code.. Per user-direction, the job has been aborted.
- -------------------------------------------------------
- forrtl: severe (24): end-of-file during read, unit -4, file /proc/12534/fd/0
- Image PC Routine Line Source
- a.out 0000000000426BF4 Unknown Unknown Unknown
- a.out 0000000000407AD0 Unknown Unknown Unknown
- a.out 0000000000402A27 Unknown Unknown Unknown
- a.out 000000000040291E Unknown Unknown Unknown
- libc.so.6 0000003FEA821B45 Unknown Unknown Unknown
- a.out 0000000000402819 Unknown Unknown Unknown
- --------------------------------------------------------------------------
- mpirun detected that one or more processes exited with non-zero status, thus causing
- the job to be terminated. The first process to do so was:
- Process name: [[19269,1],1]
- Exit code: 24
- --------------------------------------------------------------------------
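I don't understand this error, and searching did not turn up what it means. How can I fix this? I do suspect the compiler versions differ, but surely a version difference alone shouldn't stop the program from running? (The cluster has ifort 15.0.3.)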
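One thing that might be worth checking, offered as an assumption rather than a confirmed diagnosis: in the cluster log `input d` is printed twice even though only rank 0 writes it, and the abort banner resembles the one Open MPI's `mpirun` prints, while `mpiifort` is Intel MPI's compiler wrapper. If the launcher and the wrapper come from different MPI installations, each process may start as its own rank 0 and try to read stdin. A minimal sketch for seeing which tools are picked up on `PATH` (the function name `report_mpi_tools` is hypothetical):

```shell
# Hypothetical helper: show where the MPI launcher and compiler wrapper resolve.
report_mpi_tools() {
    for tool in mpirun mpiifort; do
        if loc=$(command -v "$tool"); then
            echo "$tool -> $loc"
        else
            echo "$tool -> not found on PATH"
        fi
    done
    # Print the launcher's self-reported version, if a launcher is present.
    if command -v mpirun >/dev/null 2>&1; then
        mpirun --version 2>&1 | head -n 3 || true
    fi
}

report_mpi_tools
```

If the two paths point into different installation trees, or the version line names a different MPI library than the one `mpiifort` belongs to, launching with the `mpirun` that ships alongside `mpiifort` would be the next thing to try.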