[openib-general] [patch] libsdp typo in config_parser

Scott Weitzenkamp (sweitzen) sweitzen at cisco.com
Fri Aug 18 17:03:52 PDT 2006


Running an MPI command with LD_PRELOAD=libsdp.so at the beginning won't
cause SDP to be used on remote nodes.  You have to find a way to load
libsdp.so on all nodes.  This might work better:

  LD_PRELOAD=libsdp.so mpirun -np 4 env LD_PRELOAD=libsdp.so /there/vasp/20060503/vasp.4.6/vasp.mpi
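
If it helps, here is a rough way to check on each node whether libsdp.so
really ended up in the MPI processes.  This is only a sketch: it assumes
Linux with /proc mounted and pgrep installed, and that the ranks show up
under the process name vasp.mpi.  The helper names maps_has_lib and
check_sdp_preload are made up for illustration; they are not part of
libsdp or LAM.

```shell
#!/bin/sh
# Hypothetical helper (not part of libsdp): succeed if the given
# /proc/<pid>/maps-style file mentions the named library.
maps_has_lib() {
    maps_file=$1
    lib=$2
    # -F: treat the library name as a literal string; -q: exit status only
    grep -qF "$lib" "$maps_file" 2>/dev/null
}

# Report, for every process named vasp.mpi on this node, whether
# libsdp.so is mapped into its address space.
# Assumption: the MPI ranks run under the process name "vasp.mpi".
check_sdp_preload() {
    for pid in $(pgrep -x vasp.mpi); do
        if maps_has_lib "/proc/$pid/maps" libsdp.so; then
            echo "$pid: libsdp.so loaded"
        else
            echo "$pid: libsdp.so NOT loaded"
        fi
    done
}
```

Source the file and run check_sdp_preload on each node while the job is
running.  Any PID reported as NOT loaded did not get the preload and is
using ordinary sockets rather than SDP.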

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -----Original Message-----
> From: openib-general-bounces at openib.org 
> [mailto:openib-general-bounces at openib.org] On Behalf Of 
> Bernhard Fischer
> Sent: Friday, August 18, 2006 12:22 PM
> To: Eitan Zahavi
> Cc: openib-general at openib.org
> Subject: Re: [openib-general] [patch] libsdp typo in config_parser
> 
> On Fri, Aug 18, 2006 at 10:05:35PM +0300, Eitan Zahavi wrote:
> >Hi Bernhard 
> >
> >SDP traffic will not show on the IPoIB counters. It does not go through
> >IPoIB.
> 
> That's what i thought, thanks for confirming.
> >You can use 
> >lsmod | grep ib_sdp 
> >to see how many connections are made over SDP.
> 
> Running lam via 2 nodes, on 2 CPUs each, i see:
> # lsmod | grep ib_sdp
> ib_sdp                 28184  4 
> rdma_cm                27912  1 ib_sdp
> ib_core                53632  12 ib_ucm,ib_uverbs,ib_sdp,rdma_cm,ib_cm,ib_local_sa,ib_umad,ib_ipoib,ib_multicast,ib_sa,ib_mthca,ib_mad
> 
> I did start lamboot with libsdp.so preloaded:
> $ LD_PRELOAD=/usr/local/lib64/libsdp.so lamboot l
> $ lamnodes C -c -n
> node13ib.infiniband
> node13ib.infiniband
> node15ib.infiniband
> node15ib.infiniband
> $ LD_PRELOAD=/usr/local/lib64/libsdp.so mpirun -np 4 /there/vasp/20060503/vasp.4.6/vasp.mpi
> 
> Still, ifconfig ib0 (which hosts node??ib.infiniband on 10.100.0.0/24)
> shows that the communication is being sent over IPoIB, as ifconfig's
> counters constantly go up when communicating (only one user is active
> on the system).
> $ /sbin/ifconfig ib0
> ib0       Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
>           inet addr:10.100.0.13  Bcast:10.100.0.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
>           RX packets:182037964 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:183607689 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:128
>           RX bytes:189334244937 (180563.2 Mb)  TX bytes:194777918565 (185754.6 Mb)
> 
> My libsdp.conf looks like this:
> $  cat /usr/local/etc/libsdp.conf 
> #log min-level 1 destination file libsdp.log
> use both    connect *         10.100.0.0/24:*
> use both    server  *         10.100.0.0/24:*
> 
> So i fear i'm missing something crucial.
> Ideas?
> 
> >The exact number of packets flowing through the IB port can be
> >obtained from:
> >/sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets
> >/sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets
> 
> $ for i in /sys/class/infiniband/mthca0/ports/1/counters/*packets;do echo -n $i:' ' ; cat $i;done
> /sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets: 185010549
> /sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets: 186584856
> PS: The various pingpong tests (which have outdated names in the openib
> wiki, btw) work just fine when run as the very same user, so I think
> that basic verbs communication works properly.