[openib-general] MPI error when using a "system" call in mpi job.
Ira Weiny
weiny2 at llnl.gov
Tue Jun 13 17:11:47 PDT 2006
A co-worker here was seeing the following MPI error from his job:
[1] Abort: [ldev2:1] Got completion with error, code=1
at line 2148 in file viacheck.c
After some tracking down, he found that if the program makes a "system" call
[int system(const char *string)], the next MPI call apparently fails.
I have been able to reproduce this with the attached simple "hello" program.
Perhaps someone has seen this type of error? Here is the output from 2 runs:
weiny2 at ldev0:~/ior-test
17:04:04 > mpirun_rsh -rsh -hostfile hostfile -np 2 ./hello x
ldev1
[0] Abort: [ldev1:0] Got completion with error, code=1
at line 2148 in file viacheck.c
ldev2
mpirun_rsh: Abort signaled from [0]
done.
weiny2 at ldev0:~/ior-test
17:05:23 > mpirun_rsh -rsh -hostfile hostfile -np 2 ./hello
now = 0.000000
now = 0.000052
now = 0.000094
now = 0.000121
now = 0.000151
now = 0.001072
now = 0.001102
now = 0.001118
now = 0.001141
now = 0.001160
done.
We are running mvapich 0.9.7 and the openib trunk rev 6829.
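For anyone reading without the attachment, here is a rough sketch of a
reproducer of the same shape. It is not the attached hello.c; it is a guess
based on the output above. The helper name run_shell, the USE_MPI macro, the
"hostname" command, and the loop of five timestamps are all assumptions made
for illustration. The MPI part sits behind USE_MPI so the system() helper can
also be compiled on a machine without MPI installed.

```c
/* Hypothetical reproducer sketch; NOT the attached hello.c. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run a shell command via system() and return its exit code,
 * or -1 if it could not be run or did not exit normally. */
int run_shell(const char *cmd)
{
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

#ifdef USE_MPI
#include <mpi.h>

int main(int argc, char **argv)
{
    int i;
    double start;

    MPI_Init(&argc, &argv);

    /* Assumption: any extra argument (e.g. "x") triggers the system() call. */
    if (argc > 1)
        run_shell("hostname");

    /* First MPI communication after system(); per the report above,
     * the next MPI call is where the abort shows up. */
    MPI_Barrier(MPI_COMM_WORLD);

    start = MPI_Wtime();
    for (i = 0; i < 5; i++)
        printf("now = %f\n", MPI_Wtime() - start);

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
#endif
```

Assuming an MPI compiler wrapper named mpicc is available, something like
"mpicc -DUSE_MPI -o hello hello.c" should build it. Running with an extra
argument takes the system() path; running without one skips it.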
Thanks,
Ira
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hello.c
Type: application/octet-stream
Size: 2784 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060613/0188139b/attachment.obj>