[openib-general] [patch] libsdp typo in config_parser

Bernhard Fischer rep.nop at aon.at
Fri Aug 18 12:22:03 PDT 2006


On Fri, Aug 18, 2006 at 10:05:35PM +0300, Eitan Zahavi wrote:
>Hi Bernhard 
>
>SDP traffic will not show on the IPoIB counters. It does not go through
>IPoIB.

That's what I thought, thanks for confirming.
>You can use 
>lsmod | grep ib_sdp 
>to see how many connections are made over SDP.

Running LAM across 2 nodes, on 2 CPUs each, I see:
# lsmod | grep ib_sdp
ib_sdp                 28184  4 
rdma_cm                27912  1 ib_sdp
ib_core                53632  12 ib_ucm,ib_uverbs,ib_sdp,rdma_cm,ib_cm,ib_local_sa,ib_umad,ib_ipoib,ib_multicast,ib_sa,ib_mthca,ib_mad

I did start lamboot with libsdp.so preloaded:
$ LD_PRELOAD=/usr/local/lib64/libsdp.so lamboot l
$ lamnodes C -c -n
node13ib.infiniband
node13ib.infiniband
node15ib.infiniband
node15ib.infiniband
$ LD_PRELOAD=/usr/local/lib64/libsdp.so mpirun -np 4 /there/vasp/20060503/vasp.4.6/vasp.mpi
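Whether the preload actually reaches the vasp.mpi processes on each node can
be checked from /proc. The following is only a sketch: the function names are
made up for illustration, and it assumes pgrep is installed and that the
pattern "vasp" matches the MPI ranks.

```shell
# Sketch: check whether libsdp.so is mapped into a running process.
# has_libsdp takes the path of a maps file (e.g. /proc/<pid>/maps)
# and succeeds if any mapping mentions libsdp.
has_libsdp() {
    grep -q 'libsdp' "$1"
}

# Report, for every process whose name matches the given pattern,
# whether libsdp shows up in its memory map.
# Assumption: pgrep is available on the node.
check_preload() {
    for pid in $(pgrep "$1"); do
        if has_libsdp "/proc/$pid/maps"; then
            echo "$pid: libsdp mapped"
        else
            echo "$pid: libsdp NOT mapped"
        fi
    done
}
```

Running `check_preload vasp` on each node while the job is active lists the
ranks. A mapped library only shows that the preload took effect, not that the
sockets really ended up on SDP.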

Still, ifconfig ib0 (which hosts node??ib.infiniband on 10.100.0.0/24) shows that the
communication is being sent over IPoIB, as ifconfig's counters constantly
go up during communication (only one user is active on the system).
$ /sbin/ifconfig ib0
ib0       Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.100.0.13  Bcast:10.100.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:182037964 errors:0 dropped:0 overruns:0 frame:0
          TX packets:183607689 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:189334244937 (180563.2 Mb)  TX bytes:194777918565 (185754.6 Mb)

My libsdp.conf looks like this:
$  cat /usr/local/etc/libsdp.conf 
#log min-level 1 destination file libsdp.log
use both    connect *         10.100.0.0/24:*
use both    server  *         10.100.0.0/24:*

So I fear I'm missing something crucial.
Ideas?

>The exact number of packets and data flowing through the IB port can be
>obtained from:
>/sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets
>/sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets

$ for i in /sys/class/infiniband/mthca0/ports/1/counters/*packets;do echo -n $i:' ' ; cat $i;done
/sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets: 185010549
/sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets: 186584856
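The absolute values alone say little, so comparing how fast the two sets of
counters grow over the same interval may be more telling. A rough sketch,
assuming the ib0 statistics are exposed under /sys/class/net/ib0/statistics
and that mthca0 port 1 is the port in use (the function and variable names are
made up for illustration):

```shell
# Sketch: compare the growth of the IPoIB interface counters with the
# growth of the HCA port counters over the same interval.
# Assumption: these paths match the setup (ib0, mthca0 port 1).
NETDEV_STATS=${NETDEV_STATS:-/sys/class/net/ib0/statistics}
PORT_COUNTERS=${PORT_COUNTERS:-/sys/class/infiniband/mthca0/ports/1/counters}

# Print the sum of the two counter files given as arguments (rx + tx).
total_packets() {
    rx=$(cat "$1") && tx=$(cat "$2") || return 1
    echo $((rx + tx))
}

# Sample both sources, wait for the given number of seconds
# (default 10), sample again and print the two deltas.
compare_counters() {
    interval=${1:-10}
    ip0=$(total_packets "$NETDEV_STATS/rx_packets" "$NETDEV_STATS/tx_packets") || return 1
    pt0=$(total_packets "$PORT_COUNTERS/port_rcv_packets" "$PORT_COUNTERS/port_xmit_packets") || return 1
    sleep "$interval"
    ip1=$(total_packets "$NETDEV_STATS/rx_packets" "$NETDEV_STATS/tx_packets") || return 1
    pt1=$(total_packets "$PORT_COUNTERS/port_rcv_packets" "$PORT_COUNTERS/port_xmit_packets") || return 1
    echo "ipoib_delta=$((ip1 - ip0)) port_delta=$((pt1 - pt0))"
}
```

Calling `compare_counters 10` while the job runs prints both deltas. If they
are roughly equal, most of the traffic on the port is presumably still going
through IPoIB; a port delta that is clearly larger would hint at traffic
bypassing ib0.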

PS: The different pingpong tests (which have outdated names in the openib
wiki, btw) work just fine when run as the very same user, so I think
that basic verbs communication works properly.




