[openib-general] Problems running MPI jobs with MVAPICH and MVAPICH2

Don.Albert at Bull.com Don.Albert at Bull.com
Wed Mar 22 12:38:06 PST 2006


Matthew, 

Thanks for the response.  I can well believe that the problem is an 
error in the setup of my environment; I just can't figure out what it 
is yet.  As requested, more information on the two systems, 'koa' and 
'jatoba', is below.

> 
> Since the MPD daemons seem to have been started properly, as your ability
> to run non-MPI commands shows, I'm wondering if your $PATH may not be set
> properly when compiling the cpi application.
> 
> Can you just verify by 'echo'ing the $PATH before compiling? Can you also
> provide us with other information about your environment?
> 
> Also, when running MVAPICH2, can you try using the 'mpdtrace' command to
> verify that MPD has properly started on the requested nodes?
> 

For the MVAPICH (MPI-1) case:

On system 'koa':

[root at koa ib]# cat  .mv1        <-- using this file to set the PATH
echo "setting up MVAPICH environment"
export PATH=/usr/local/mvapich/bin:$PATH
[root at koa ib]# . .mv1           <-- sourcing the .mv1 file
setting up MVAPICH environment
[root at koa ib]# echo $PATH       <-- PATH after setup
/usr/local/mvapich/bin:/usr/kerberos/sbin:/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
[root at koa ib]# which mpicc      <-- showing where the compiler script is located
/usr/local/mvapich/bin/mpicc
[root at koa ib]# mpd &            <-- starting mpd
[1] 32649
[root at koa ib]# mpdtrace         <-- with just mpd running on 'koa'
mpdtrace: koa_53859:  lhs=koa_53859  rhs=koa_53859  rhs2=koa_53859 gen=1
[root at koa ib]# mpdtrace         <-- after starting mpd on the 'jatoba' system
mpdtrace: koa_53859:  lhs=jatoba_44789  rhs=jatoba_44789  rhs2=koa_53859 gen=1
mpdtrace: jatoba_44789:  lhs=koa_53859  rhs=koa_53859  rhs2=jatoba_44789 gen=1
[root at koa ib]# /sbin/lsmod      <-- showing that the various ib modules are loaded
Module                  Size  Used by
ib_sdp                 93448  0
ib_cm                  33472  1 ib_sdp
ib_ipoib               41480  0
ib_sa                  14652  2 ib_sdp,ib_ipoib
ib_uverbs              39760  0
ib_umad                15024  0
ib_mthca              120608  0
ib_mad                 37816  4 ib_cm,ib_sa,ib_umad,ib_mthca
ib_core                48640  8 ib_sdp,ib_cm,ib_ipoib,ib_sa,ib_uverbs,ib_umad,ib_mthca,ib_mad
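
The PATH and 'which mpicc' checks at the start of this transcript can be 
scripted so that they run the same way on every node.  The following is 
only a minimal sketch: check_mpicc is a hypothetical helper name, not 
something shipped with MVAPICH, and the prefix passed to it is assumed to 
be the install directory shown above.

```shell
#!/bin/sh
# check_mpicc PREFIX (hypothetical helper, not part of MVAPICH):
# succeed only if the first mpicc found on PATH lives under PREFIX.
check_mpicc() {
    expected_prefix=$1
    # command -v prints the path the shell would execute for mpicc
    found=$(command -v mpicc) || { echo "mpicc not found on PATH"; return 1; }
    case $found in
        "$expected_prefix"/*) echo "ok: $found" ;;
        *) echo "unexpected: $found"; return 1 ;;
    esac
}

# Usage on either node, after sourcing .mv1:
#   check_mpicc /usr/local/mvapich/bin
```

Running the same function with /tmp/mvapich2/bin after sourcing .mv2 would 
cover the MVAPICH2 setup in the same way.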


On system 'jatoba':

[root at jatoba ib]# cat .mv1
echo "setting up MVAPICH environment"
export PATH=/usr/local/mvapich/bin:$PATH
setting up MVAPICH environment
[root at jatoba ib]# echo $PATH
/usr/local/mvapich/bin:/usr/kerberos/sbin:/opt/gcc-3.3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/usr/java/bin:/usr/ant/bin:/home/ib/bin
[root at jatoba ib]# which mpicc
/usr/local/mvapich/bin/mpicc
[root at jatoba ib]# mpd -h koa -p 53859 &   <-- starting mpd with parameters from 'koa'
[1] 9442
[root at jatoba ib]# mpdtrace                <-- after starting mpd
mpdtrace: jatoba_44789:  lhs=koa_53859  rhs=koa_53859  rhs2=jatoba_44789 gen=1
mpdtrace: koa_53859:  lhs=jatoba_44789  rhs=jatoba_44789  rhs2=koa_53859 gen=1
[root at jatoba ib]# /sbin/lsmod             <-- showing that the various ib modules are loaded
Module                  Size  Used by
ib_sdp                 93448  0
ib_cm                  33344  1 ib_sdp
ib_ipoib               41480  0
ib_sa                  14524  2 ib_sdp,ib_ipoib
ib_uverbs              39760  0
ib_umad                15024  0
ib_mthca              120608  0
ib_mad                 37816  4 ib_cm,ib_sa,ib_umad,ib_mthca
ib_core                48640  8 ib_sdp,ib_cm,ib_ipoib,ib_sa,ib_uverbs,ib_umad,ib_mthca,ib_mad
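
The two lsmod listings above load the same set of modules, but the 
reported sizes of ib_cm (33472 on 'koa', 33344 on 'jatoba') and ib_sa 
(14652 versus 14524) differ.  Whether that has anything to do with the 
failure is not established here.  A comparison like this can be automated 
rather than done by eye; the sketch below assumes the output of 
/sbin/lsmod from each host has been saved to a file, and mod_sizes and 
compare_mods are hypothetical helper names.

```shell
#!/bin/sh
# mod_sizes (hypothetical helper): read lsmod output on stdin and print
# "name size" pairs sorted by module name, skipping the header line.
mod_sizes() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $1, $2 }' | sort
}

# compare_mods FILE_A FILE_B (hypothetical helper): print the modules whose
# name or size differs between two saved lsmod listings.
# Returns 0 if they match, 1 if they differ.
compare_mods() {
    a=$(mktemp) && b=$(mktemp) || return 2
    mod_sizes < "$1" > "$a"
    mod_sizes < "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}

# Usage, with file names chosen only for illustration:
#   /sbin/lsmod > koa.lsmod        (on koa)
#   /sbin/lsmod > jatoba.lsmod     (on jatoba, then copy one file over)
#   compare_mods koa.lsmod jatoba.lsmod
```

Only the name and size columns are compared, so differences in the 
"Used by" column are ignored.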



For the MVAPICH2 case:

On 'koa':

[root at koa ib]# cat .mv2
echo "setting up MVAPICH2 environment"
export PATH=/tmp/mvapich2/bin:$PATH
[root at koa ib]# . .mv2
setting up MVAPICH2 environment
[root at koa ib]# echo $PATH
/tmp/mvapich2/bin:/usr/kerberos/sbin:/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
[root at koa ib]# which mpicc
/tmp/mvapich2/bin/mpicc
[root at koa ib]# mpdboot -n 2 -f /root/mpd.hosts
[root at koa ib]# mpdtrace -l
koa.az05.bull.com_53727
jatoba.az05.bull.com_54528

On 'jatoba':

[jatoba] (ib) ib> su
Password:
[root at jatoba ib]# cat .mv2
echo "setting up MPVAPICH2 environment"
export PATH=/tmp/mvapich2/bin:$PATH
[root at jatoba ib]# . .mv2
setting up MPVAPICH2 environment
[root at jatoba ib]# echo $PATH
/tmp/mvapich2/bin:/usr/kerberos/sbin:/opt/gcc-3.3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/usr/java/bin:/usr/ant/bin:/home/ib/bin
[root at jatoba ib]# which mpicc
/tmp/mvapich2/bin/mpicc
[root at jatoba ib]# mpdtrace -l
jatoba.az05.bull.com_54528
koa.az05.bull.com_53727
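
The 'mpdtrace -l' check above can also be scripted to confirm that every 
expected host has joined the ring.  This is a minimal sketch that assumes 
the host_port line format shown in the output above; ring_has_hosts is a 
hypothetical helper name.

```shell
#!/bin/sh
# ring_has_hosts HOST... (hypothetical helper): read `mpdtrace -l` output
# on stdin, one host_port entry per line, and check that every HOST given
# as an argument appears. Returns 0 if all are present, 1 otherwise.
ring_has_hosts() {
    trace=$(cat)
    missing=0
    for host in "$@"; do
        # Strip the trailing _port and compare the host name exactly.
        if ! printf '%s\n' "$trace" | awk -v h="$host" '
                { name = $1; sub(/_[0-9]+$/, "", name) }
                name == h { found = 1 }
                END { exit !found }'; then
            echo "missing from ring: $host"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   mpdtrace -l | ring_has_hosts koa.az05.bull.com jatoba.az05.bull.com
```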

