[Users] Failed to exchange data between server and clients

German Anders ganders at despegar.com
Wed Mar 30 11:35:14 PDT 2016


Hi All,
I'm having some issues with my IB network. Basically, I have two SX6036G
spine switches and two SX6036F leaf switches, each leaf connected to both
spines, with the SM running on one of the spines. I've connected some
Supermicro servers to the leaf switches and created an active-passive
bond. Ping works fine, but when I try to run an ib_write_bw test I get:


Server:

# ib_write_bw

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF        Device         : mlx4_0
 Number of qps   : 1        Transport type : IB
 Connection type : RC        Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x63 QPN 0x0296 PSN 0xa23c53 RKey 0xb8011100 VAddr
0x007fc3c655c000
 remote address: LID 0x03 QPN 0x0219 PSN 0xc0ef7c RKey 0x8011100 VAddr
0x007f62cc318000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
MsgRate[Mpps]
ethernet_read_keys: Couldn't read remote address
 Unable to read to socket/rdam_cm
 Failed to exchange data between server and clients


Client:

# ib_write_bw 172.23.16.1
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF        Device         : mlx4_1
 Number of qps   : 1        Transport type : IB
 Connection type : RC        Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x03 QPN 0x0219 PSN 0xc0ef7c RKey 0x8011100 VAddr
0x007f62cc318000
 remote address: LID 0x63 QPN 0x0296 PSN 0xa23c53 RKey 0xb8011100 VAddr
0x007fc3c655c000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
MsgRate[Mpps]
Problems with warm up
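

For what it's worth, perftest exchanges the connection parameters over an
out-of-band TCP socket (that's the "Data ex. method : Ethernet" line), so
"Failed to exchange data between server and clients" usually points at that
TCP session dying mid-handshake rather than at an RDMA-level fault. A hedged
sketch for isolating it, pinning the HCA and port explicitly on both ends
(device names are taken from the output above; the port number is an
assumption):

```shell
# Server side: bind the test to a specific HCA and port
ib_write_bw -d mlx4_0 -i 1

# Client side: same pinning, pointing at the server's bond-PUB address
ib_write_bw -d mlx4_1 -i 1 172.23.16.1
```

If the pinned run still fails the same way, the problem is likely on the
TCP path between the bond addresses rather than on the IB fabric itself.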


The bond configuration is as follows. On the first host:

auto ib0
iface ib0 inet manual
    bond-master bond-PUB
    bond-primary ib0
    pre-up echo connected > /sys/class/net/ib0/mode

auto ib1
iface ib1 inet manual
    bond-master bond-PUB
    pre-up echo connected > /sys/class/net/ib1/mode

auto bond-PUB
iface bond-PUB inet static
    address 172.23.16.1
    netmask 255.255.240.0
    network 172.23.16.0
    bond-mode active-backup
    bond-miimon 100
    bond-slaves none


and on the second host:


auto ib1
iface ib1 inet manual
    bond-master bond-PUB
    bond-primary ib1
    pre-up echo connected > /sys/class/net/ib1/mode

auto ib3
iface ib3 inet manual
    bond-master bond-PUB
    pre-up echo connected > /sys/class/net/ib3/mode

auto bond-PUB
iface bond-PUB inet static
    address 172.23.17.4
    netmask 255.255.240.0
    network 172.23.16.0
    bond-mode active-backup
    bond-miimon 100
    bond-slaves none

##
auto ib0
iface ib0 inet manual
    bond-master bond-CLUS
    bond-primary ib0
    pre-up echo connected > /sys/class/net/ib0/mode

auto ib2
iface ib2 inet manual
    bond-master bond-CLUS
    pre-up echo connected > /sys/class/net/ib2/mode

auto bond-CLUS
iface bond-CLUS inet static
    address 172.23.32.4
    netmask 255.255.240.0
    network 172.23.32.0
    bond-mode active-backup
    bond-miimon 100
    bond-slaves none
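

As a quick sanity check (a sketch, runnable anywhere, no IB hardware
needed), the two bond-PUB addresses above should land in the same /20
given the 255.255.240.0 netmask, otherwise the perftest TCP exchange
would need a route between them:

```shell
#!/bin/sh
# Verify that 172.23.16.1 and 172.23.17.4 share the /20 implied by
# the netmask 255.255.240.0 used in the configs above.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

mask=0xFFFFF000                      # /20 == 255.255.240.0
a=$(ip_to_int 172.23.16.1)
b=$(ip_to_int 172.23.17.4)
same=$(( (a & mask) == (b & mask) ))
[ "$same" -eq 1 ] && echo "same /20" || echo "subnet mismatch"
```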


Any ideas? Also, FIO performance between the hosts is REALLY slow, in the
range of 100-150 MB/s, across different operations and block sizes.
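
One thing worth checking for the FIO numbers (a hedged sketch, not a
confirmed fix): with IPoIB in connected mode, as the pre-up lines above
set, the interface MTU can go far beyond the 2044-byte datagram limit,
and a small MTU often caps IPoIB stream throughput in roughly the range
seen here. Interface names below follow the first host's config and need
IB hardware to actually run:

```shell
# Raise the MTU on the IPoIB slaves and the bond on top of them
# (repeat on the second host with its slave names, e.g. ib1/ib3).
ip link set ib0 mtu 65520
ip link set ib1 mtu 65520
ip link set bond-PUB mtu 65520
```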

Thanks in advance,

Best,


*German*