[openib-general] Problems running MPI jobs with MVAPICH and MVAPICH2

Don.Albert at Bull.com Don.Albert at Bull.com
Wed Mar 22 14:06:09 PST 2006


Weikuan,

> 
> However, let us look into this slightly differently. Does the problem 
> happen to 2 processes on the same node too? If so, we can focus on such 

No. When I run on a single machine (either one), I can run multiple 
processes of the job without any problem:

[root at koa cpi]# mpirun_mpd -np 2 ./cpi
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.000088
Process 0 on koa.az05.bull.com
Process 1 on koa.az05.bull.com
[root at koa cpi]# mpirun_mpd -np 4 ./cpi
Process 0 on koa.az05.bull.com
Process 1 on koa.az05.bull.com
Process 2 on koa.az05.bull.com
Process 3 on koa.az05.bull.com
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.000165
 
> a case and get it cleared up first. Let us say we use `koa'. Could you 
> provide the following information to us? We can do that for 
> mvapich-gen2 first, assuming the cause of the problem will be similar 
> for the other case.
> 
> a) echo $LD_LIBRARY_PATH

$LD_LIBRARY_PATH is null
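
Since "null" could mean either unset or set to an empty string, the two 
cases can be told apart with a small check. This is only a sketch for a 
POSIX shell; the function name ld_path_state is made up for illustration 
and is not part of MVAPICH or the OpenIB tools:

```shell
# Hypothetical helper: report whether LD_LIBRARY_PATH is unset, empty, or set.
ld_path_state() {
    if [ -z "${LD_LIBRARY_PATH+x}" ]; then
        # ${VAR+x} expands to nothing only when VAR is not set at all
        printf '%s\n' "unset"
    elif [ -z "$LD_LIBRARY_PATH" ]; then
        printf '%s\n' "empty"
    else
        printf '%s\n' "set: $LD_LIBRARY_PATH"
    fi
}

ld_path_state
```

Running it on both nodes should show whether the two environments differ.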

> b) The modified build script. The template you used for mvapich-gen2 
> should be make.mvapich.gen2.
> c) build/configure log file: config.log, make.log, etc.

These files are attached below.

> d) The output of this command:
> # mpicc -show -o cpi example/basic/cpi.c

[root at koa mvapich-gen2]# mpicc -show -o cpi examples/basic/cpi.c
gcc -DUSE_STDARG -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_STDARG_H=1 -DUSE_STDARG=1 -DMALLOC_RET_VOID=1 -c 
examples/basic/cpi.c -I/usr/local/mvapich/include
gcc -DUSE_STDARG -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_STDARG_H=1 -DUSE_STDARG=1 -DMALLOC_RET_VOID=1 
-L/usr/local/mvapich/lib cpi.o -o cpi -lmpich -L/usr/local/lib 
-Wl,-rpath=/usr/local/lib -libverbs -lpthread
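
To make the library search directories in that link line easier to 
compare between machines, the -L and -Wl,-rpath= entries can be pulled 
out one per line. A rough sketch, assuming a POSIX shell with tr and sed; 
link_dirs is a hypothetical name, not an MVAPICH tool:

```shell
# Hypothetical helper: read a compiler command line on stdin and print each
# -L directory as "L <dir>" and each -Wl,-rpath= directory as "rpath <dir>".
link_dirs() {
    # Put every whitespace-separated token on its own line, then keep
    # only the tokens that name a library search or run-time path.
    tr -s ' \t\n' '\n\n\n' | sed -n \
        -e 's/^-L\(..*\)$/L \1/p' \
        -e 's/^-Wl,-rpath=\(..*\)$/rpath \1/p'
}

# Usage:
#   mpicc -show -o cpi examples/basic/cpi.c | link_dirs
```

Tokens such as -libverbs and -lmpich are skipped because the match on 
-L is case-sensitive.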


Also, here is the output of "ldd" on the executable:

[root at koa mvapich-gen2]# ldd /home/ib/tests/mpi/cpi/cpi
        libibverbs.so.1 => /usr/local/lib/libibverbs.so.1 (0x00002aaaaaaac000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x000000352cb00000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x000000352bc00000)
        libsysfs.so.1 => /usr/lib64/libsysfs.so.1 (0x00000038b1900000)
        libdl.so.2 => /lib64/libdl.so.2 (0x000000352bf00000)
        /lib64/ld-linux-x86-64.so.2 (0x000000352ba00000)
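
If it helps to compare the two nodes, the ldd output can be reduced to 
"library name, resolved path" pairs and then diffed. A minimal sketch, 
assuming a POSIX shell with awk and the usual one-library-per-line ldd 
format; resolved_libs is a hypothetical helper name:

```shell
# Hypothetical helper: read ldd output on stdin and print "name path" for
# each resolved library, or "name MISSING" when ldd reports "not found".
resolved_libs() {
    awk '
        /=> not found/           { print $1, "MISSING"; next }
        $2 == "=>" && $3 ~ /^\// { print $1, $3 }
    '
}

# Usage on each node, then diff the two files:
#   ldd /home/ib/tests/mpi/cpi/cpi | resolved_libs > "$(hostname).libs"
```

The dynamic loader line (/lib64/ld-linux-x86-64.so.2) has no "=>" and is 
left out of the result.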


        -Don Albert-


-------------- next part --------------
A non-text attachment was scrubbed...
Name: make.mvapich.gen2
Type: application/octet-stream
Size: 2591 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.status
Type: application/octet-stream
Size: 21377 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-mine.log
Type: application/octet-stream
Size: 17573 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: install-mine.log
Type: application/octet-stream
Size: 1168 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.log
Type: application/octet-stream
Size: 7138 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make-mine.log
Type: application/octet-stream
Size: 282277 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060322/6843bcc5/attachment-0005.obj>

