[openib-general] MPI error when using a "system" call in mpi job.

Ira Weiny weiny2 at llnl.gov
Tue Jun 13 17:11:47 PDT 2006


A co-worker here was seeing the following MPI error from his job:

[1] Abort: [ldev2:1] Got completion with error, code=1
 at line 2148 in file viacheck.c

After some tracking down, he found that if the job makes a "system" call
[int system(const char *string)], the next MPI call apparently fails.
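
For context, here is a minimal sketch of what a "system" call does under the
hood, assuming POSIX semantics: the caller forks a child, the child execs
"sh -c <command>", and the parent waits for it. The helper name
run_shell_command is hypothetical and is not taken from the attached hello.c.
Whether the fork is what actually upsets the MPI library here is not
established by this report; the sketch only shows that the call creates a
new process.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from hello.c): roughly what system(cmd) does
 * on a POSIX system. Returns the command's exit status, or -1 on failure
 * or if the command was killed by a signal.
 */
int run_shell_command(const char *cmd)
{
    int status;
    pid_t pid = fork();                 /* duplicate the calling process */

    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* reached only if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_shell_command("hostname") from a job should behave much like
system("hostname"): the command's output appears on stdout and the return
value is the shell's exit status.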

I have been able to reproduce this with the attached simple "hello" program.

Perhaps someone has seen this type of error?  Here is the output from 2 runs:

weiny2@ldev0:~/ior-test
17:04:04 > mpirun_rsh -rsh -hostfile hostfile -np 2 ./hello x
ldev1
[0] Abort: [ldev1:0] Got completion with error, code=1
 at line 2148 in file viacheck.c
ldev2
mpirun_rsh: Abort signaled from [0]
done.
weiny2@ldev0:~/ior-test
17:05:23 > mpirun_rsh -rsh -hostfile hostfile -np 2 ./hello
now = 0.000000
now = 0.000052
now = 0.000094
now = 0.000121
now = 0.000151
now = 0.001072
now = 0.001102
now = 0.001118
now = 0.001141
now = 0.001160
done.

We are running mvapich 0.9.7 and the openib trunk rev 6829.

Thanks,
Ira

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hello.c
Type: application/octet-stream
Size: 2784 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060613/0188139b/attachment.obj>

