[ofa-general] DDR vs SDR performance

Stijn De Smet stijn.desmet at intec.ugent.be
Wed Nov 28 06:43:18 PST 2007


Hello,

I have a problem with the DDR performance:

Configuration:
2 servers (IBM x3755, each equipped with four dual-core Opterons and 16GB RAM)
3 HCAs installed (2 Cisco DDR (Cheetah) and 1 Cisco dual-port SDR (LionMini),
all PCI-e x8), all DDR HCAs at the newest Cisco firmware v1.2.917 build
3.2.0.149, with label 'HCA.Cheetah-DDR.20'

The DDR HCAs are connected back-to-back with a cable, and s3n1 is running
an SM. The SDR boards are connected through a Cisco SFS-7000D switch; the
DDR performance is roughly the same when run over this SFS-7000D as well.

Both servers are running SLES10 SP1 with OFED 1.2.5.
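One thing worth ruling out on this kind of setup is an HCA that trained at
a narrower PCIe width than x8; the negotiated link can be checked with
lspci (a sketch; 15b3 is the Mellanox vendor ID behind these Cisco boards,
and the exact LnkSta wording depends on the pciutils version):

s3n1:~ # lspci -d 15b3: -vv | grep -E 'LnkCap|LnkSta'

Each HCA should report 'Width x8' in LnkSta; a link that came up at x4
would cap the attainable bandwidth well below the DDR wire rate.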


s3n1:~ # ibstatus
Infiniband device 'mthca0' port 1 status:   <--- DDR board #1, not connected
        default gid:     fe80:0000:0000:0000:0005:ad00:000b:cb39
        base lid:        0x0
        sm lid:          0x0
        state:           1: DOWN
        phys state:      2: Polling
        rate:            10 Gb/sec (4X)

Infiniband device 'mthca1' port 1 status:  <--- DDR board #2, connected
with cable
        default gid:     fe80:0000:0000:0000:0005:ad00:000b:cb31
        base lid:        0x16
        sm lid:          0x16
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            20 Gb/sec (4X DDR)

Infiniband device 'mthca2' port 1 status: <--- SDR board, only port 1
connected to the SFS-7000D
        default gid:     fe80:0000:0000:0000:0005:ad00:0008:a8d9
        base lid:        0x3
        sm lid:          0x2
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            10 Gb/sec (4X)

Infiniband device 'mthca2' port 2 status:
        default gid:     fe80:0000:0000:0000:0005:ad00:0008:a8da
        base lid:        0x0
        sm lid:          0x0
        state:           1: DOWN
        phys state:      2: Polling
        rate:            10 Gb/sec (4X)


RDMA bandwidth test (ib_rdma_bw):
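For completeness: ib_rdma_bw acts as the server when started without a
host argument. A sketch of the server side, assuming gpfs3n1 is s3n1's
IB-side hostname and the device names match the ibstatus output above:

s3n1:~ # ib_rdma_bw -d mthca2     # server for the SDR run
s3n1:~ # ib_rdma_bw -d mthca1     # server for the DDR run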
-- SDR:
s3n2:~ # ib_rdma_bw -d mthca2 gpfs3n1
7190: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000
| duplex=0 | cma=0 |
7190: Local address:  LID 0x05, QPN 0x0408, PSN 0xf10f03 RKey 0x003b00
VAddr 0x002ba7b9943000
7190: Remote address: LID 0x03, QPN 0x040a, PSN 0xa9cf5c, RKey 0x003e00
VAddr 0x002adb2f3bb000


7190: Bandwidth peak (#0 to #989): 937.129 MB/sec
7190: Bandwidth average: 937.095 MB/sec
7190: Service Demand peak (#0 to #989): 2709 cycles/KB
7190: Service Demand Avg  : 2709 cycles/KB

-- DDR:
s3n2:~ # ib_rdma_bw -d mthca1 gpfs3n1
7191: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000
| duplex=0 | cma=0 |
7191: Local address:  LID 0x10, QPN 0x0405, PSN 0x5e19e RKey 0x002600
VAddr 0x002b76eab20000
7191: Remote address: LID 0x16, QPN 0x0405, PSN 0xdd976e, RKey
0x80002900 VAddr 0x002ba8ed10e000


7191: Bandwidth peak (#0 to #990): 1139.32 MB/sec
7191: Bandwidth average: 1139.31 MB/sec
7191: Service Demand peak (#0 to #990): 2228 cycles/KB
7191: Service Demand Avg  : 2228 cycles/KB

So there is only a ~200 MB/s increase from SDR to DDR.
With comparable hardware (x3655, two dual-core Opterons, 8GB RAM) I get
slightly better RDMA performance (1395 MB/s, so close to the PCI-e x8
limit), but even worse IPoIB and SDP performance with kernels 2.6.22 and
2.6.23.9 and OFED 1.3b.
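
For comparison, the theoretical ceilings work out roughly as follows
(8b/10b encoding on both the IB link and PCIe Gen1; the practical PCIe
figure is an estimate and depends on DMA/TLP overhead):

4X SDR link:  10 Gbit/s signalling * 8/10 =  8 Gbit/s = 1000 MB/s data
4X DDR link:  20 Gbit/s signalling * 8/10 = 16 Gbit/s = 2000 MB/s data
PCIe x8 Gen1: 8 lanes * 2.5 Gbit/s * 8/10 = 16 Gbit/s = 2000 MB/s raw,
              of which roughly 1400-1600 MB/s is usable in practice

So the SDR result (937 MB/s) sits close to its wire limit, while the DDR
result (1139 MB/s) is well below both the link and the bus.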



IPoIB test (iperf), IPoIB in connected mode, MTU 65520:
#ib2 is SDR, ib1 is DDR
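Connected mode and the large MTU were set through the standard IPoIB
sysfs knob; a sketch for ib1, with cic-s3n1 assumed to be s3n1's IPoIB
hostname running the iperf server:

s3n2:~ # echo connected > /sys/class/net/ib1/mode
s3n2:~ # ifconfig ib1 mtu 65520
s3n1:~ # iperf -s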
#SDR:
s3n2:~ # iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.2 port 50598 connected with 192.168.1.1 port 5001
[  3]  0.0-10.0 sec  6.28 GBytes  5.40 Gbits/sec

#DDR:
s3n2:~ # iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.2 port 32935 connected with 192.168.1.1 port 5001
[  3]  0.0-10.0 sec  6.91 GBytes  5.93 Gbits/sec


Now the increase is only about 0.5 Gbit/s.

And finally a test with SDP (SIMPLE_LIBSDP tells libsdp to redirect all
TCP sockets to SDP without consulting libsdp.conf):

#DDR:
s3n2:~ # LD_PRELOAD=libsdp.so SIMPLE_LIBSDP="ok" iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 3.91 MByte (default)
------------------------------------------------------------
[  4] local 192.168.1.2 port 58186 connected with 192.168.1.1 port 5001
[  4]  0.0-10.0 sec  7.72 GBytes  6.63 Gbits/sec

#SDR:
s3n2:~ # LD_PRELOAD=libsdp.so SIMPLE_LIBSDP="ok" iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 3.91 MByte (default)
------------------------------------------------------------
[  4] local 192.168.1.2 port 58187 connected with 192.168.1.1 port 5001
[  4]  0.0-10.0 sec  7.70 GBytes  6.61 Gbits/sec

With SDP there is no longer any difference at all between the two boards.
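
As a sanity check that the preload really diverted the traffic to SDP,
OFED ships an sdpnetstat tool that lists SDP sockets (the flag below is
from memory and may differ between OFED releases):

s3n2:~ # sdpnetstat -S     # the iperf connection should show up as SDP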


Even when using multiple connections (three clients (s3s2, s3s3, s3s4),
x3655, kernel 2.6.22, all connecting to one node (s3s1) over DDR):
s3s2:~ # iperf -c cic-s3s1 -p 5001 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.15 port 33576 connected with 192.168.1.14 port 5001
[  3]  0.0-30.0 sec  5.94 GBytes  1.70 Gbits/sec
s3s3:~ # iperf -c cic-s3s1 -p 5002 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5002
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.16 port 53558 connected with 192.168.1.14 port 5002
[  3]  0.0-30.0 sec  5.74 GBytes  1.64 Gbits/sec
s3s4:~ # iperf -c cic-s3s1 -p 5003 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5003
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.17 port 37169 connected with 192.168.1.14 port 5003
[  3]  0.0-30.0 sec  5.79 GBytes  1.66 Gbits/sec


This gives a total of 1.70 + 1.64 + 1.66 Gbits/sec = 5.0 Gbits/sec.

Is this normal behavior (SDP and IPoIB barely benefiting from DDR)?


Regards,
Stijn


