[openib-general] Fork issues with simple MPI program
Arlin Davis
arlin.r.davis at intel.com
Mon Feb 19 10:32:27 PST 2007
We are seeing fork issues with a simple MPI program (attached) running on 2.6.16+ kernels with
OFED 1.1. We have tried both Intel MPI and mvapich2, with the same results:
t_fork> mpiexec -n 2 t_system_fork
parent process
[0] started child process with pid=31552
send desc error
parent process
[0] Abort: [] Got completion with error 1, vendor code=69, dest rank=1
at line 540 in file ibv_channel_manager.c
[1] I am child process with pid=25437
[1] started child process with pid=25437
[0] I am child process with pid=31552
child process
[1] finished pid=25437
child process
[0] finished pid=31552
rank 0 in job 2 svlmpicl400_32925 caused collective abort of all ranks
exit status of rank 0: return code 252
If you run mvapich2 for uDAPL, it hangs before the second MPI_Barrier(), just like Intel MPI. If you
use the I_MPI_RDMA_USE_EVD_FALLBACK=1 option with Intel MPI, you get the following error, similar to
the mvapich2 one above:
parent process
parent process
[0] I am child process with pid=9596
[0] started child process with pid=9596
[1] I am child process with pid=11477
[1] started child process with pid=11477
[0][rdma_iba.c:1007] Intel MPI fatal error: DTO operation completed with error. status=0x2. cookie=0x1
[1][rdma_iba.c:1007] Intel MPI fatal error: DTO operation completed with error. status=0x2. cookie=0x1
child process
[1] finished pid=11477
child process
[0] finished pid=9596
rank 0 in job 8 cst-19_54707 caused collective abort of all ranks
exit status of rank 0: return code 255
Any insight would be greatly appreciated. Our assumption was that, with the fixes that went into
2.6.16 and OFED 1.1, the parent process can continue to use IB resources after a fork. Is this true?
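
For context, the kernel-side change we have in mind is, as far as we understand it, the
MADV_DONTFORK madvise() flag added in 2.6.16, which keeps a marked region out of the child's
address space so that a fork does not disturb the parent's pages. The sketch below only
illustrates that mechanism on an ordinary anonymous buffer. It is not taken from the attached
t_system_fork.c, it uses neither IB nor MPI, and the helper name child_inherits_buffer is made up
for illustration.

```c
#define _GNU_SOURCE            /* for MADV_DONTFORK and MAP_ANONYMOUS */
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of t_system_fork.c.
 * Returns  1 if a forked child can read the parent's buffer,
 *          0 if the child faults on it (mapping was not inherited),
 *         -1 on any other error.
 */
int child_inherits_buffer(int use_dontfork)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* One page-aligned anonymous page, standing in for a registered buffer. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 0x5a, len);

    /* Optionally ask the kernel not to map this range into children. */
    if (use_dontfork && madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: touch the buffer. With MADV_DONTFORK this raises SIGSEGV. */
        volatile char *p = buf;
        _exit(p[0] == 0x5a ? 0 : 2);
    }

    /* Parent: classify how the child ended, then release the buffer. */
    int status = 0;
    pid_t w = waitpid(pid, &status, 0);
    munmap(buf, len);
    if (w != pid)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
        return 0;
    return -1;
}
```

On a kernel that supports MADV_DONTFORK, we would expect child_inherits_buffer(0) to return 1 and
child_inherits_buffer(1) to return 0. Whether the verbs library in a given OFED release actually
applies this to registered memory is exactly the part we are unsure about.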
Thanks,
-arlin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: t_system_fork.c
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20070219/96dabe76/attachment.c>