[ofa-general] DDR vs SDR performance
Stijn De Smet
stijn.desmet at intec.ugent.be
Wed Nov 28 06:43:18 PST 2007
Hello,
I have a problem with the DDR performance:
Configuration:
2 servers (IBM x3755, each equipped with 4 dual-core Opterons and 16GB RAM).
3 HCAs installed (2 Cisco DDR (Cheetah) and 1 Cisco dual-port SDR (LionMini),
all PCI-e x8); all DDR HCAs are at the newest Cisco firmware, v1.2.917 build
3.2.0.149, with label 'HCA.Cheetah-DDR.20'.
The DDR boards are connected back-to-back with a cable, and s3n1 is running
an SM. The SDR boards are connected over a Cisco SFS-7000D, but the DDR
performance is roughly the same when run over this SFS-7000D.
Both servers are running SLES10-SP1 with OFED 1.2.5.
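(For reference, the firmware and negotiated link parameters of each HCA can be
cross-checked with ibv_devinfo from OFED; a minimal sketch, where the grep just
trims the verbose output:)
# firmware version, port state and active width/speed for every HCA
ibv_devinfo -v | grep -E "hca_id|fw_ver|state|active_width|active_speed"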
s3n1:~ # ibstatus
Infiniband device 'mthca0' port 1 status: <--- DDR board #1, not connected
default gid: fe80:0000:0000:0000:0005:ad00:000b:cb39
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)
Infiniband device 'mthca1' port 1 status: <--- DDR board #2, connected
with cable
default gid: fe80:0000:0000:0000:0005:ad00:000b:cb31
base lid: 0x16
sm lid: 0x16
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)
Infiniband device 'mthca2' port 1 status: <--- SDR board, only port 1
connected to the SFS-7000D
default gid: fe80:0000:0000:0000:0005:ad00:0008:a8d9
base lid: 0x3
sm lid: 0x2
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (4X)
Infiniband device 'mthca2' port 2 status:
default gid: fe80:0000:0000:0000:0005:ad00:0008:a8da
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)
RDMA test (ib_rdma_bw) of each board:
-- SDR:
s3n2:~ # ib_rdma_bw -d mthca2 gpfs3n1
7190: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000
| duplex=0 | cma=0 |
7190: Local address: LID 0x05, QPN 0x0408, PSN 0xf10f03 RKey 0x003b00
VAddr 0x002ba7b9943000
7190: Remote address: LID 0x03, QPN 0x040a, PSN 0xa9cf5c, RKey 0x003e00
VAddr 0x002adb2f3bb000
7190: Bandwidth peak (#0 to #989): 937.129 MB/sec
7190: Bandwidth average: 937.095 MB/sec
7190: Service Demand peak (#0 to #989): 2709 cycles/KB
7190: Service Demand Avg : 2709 cycles/KB
-- DDR:
s3n2:~ # ib_rdma_bw -d mthca1 gpfs3n1
7191: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000
| duplex=0 | cma=0 |
7191: Local address: LID 0x10, QPN 0x0405, PSN 0x5e19e RKey 0x002600
VAddr 0x002b76eab20000
7191: Remote address: LID 0x16, QPN 0x0405, PSN 0xdd976e, RKey
0x80002900 VAddr 0x002ba8ed10e000
7191: Bandwidth peak (#0 to #990): 1139.32 MB/sec
7191: Bandwidth average: 1139.31 MB/sec
7191: Service Demand peak (#0 to #990): 2228 cycles/KB
7191: Service Demand Avg : 2228 cycles/KB
So there is only about a 200MB/s increase from SDR to DDR.
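For what it's worth, a bidirectional run with larger messages might show whether
the raw DDR link can go higher than that; a sketch, assuming the -b, -s and -n
switches of the ib_rdma_bw binary shipped with this OFED:
# on gpfs3n1 (server side): ib_rdma_bw -d mthca1 -b
# on s3n2 (client side): bidirectional, 1 MB messages, 5000 iterations
ib_rdma_bw -d mthca1 -b -s 1048576 -n 5000 gpfs3n1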
With comparable hardware (x3655, two dual-core Opterons, 8GB RAM) I get
slightly better RDMA performance (1395MB/s, so close to the PCI-e x8
limit), but even worse IPoIB and SDP performance with kernels 2.6.22 and
2.6.23.9 and OFED 1.3b.
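Since 1395MB/s already sits close to the PCI-e x8 limit, it is probably worth
confirming that every HCA really negotiated x8; a sketch with lspci (selecting
by the Mellanox vendor ID 15b3 is an assumption for these Cisco-branded boards,
and the capability dump needs root):
# LnkSta should report "Width x8" for each board
lspci -d 15b3: -vv | grep -E "LnkCap|LnkSta"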
IPoIB test (iperf), IPoIB in connected mode, MTU 65520:
#ib2 is SDR, ib1 is DDR
#SDR:
s3n2:~ # iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.2 port 50598 connected with 192.168.1.1 port 5001
[ 3] 0.0-10.0 sec 6.28 GBytes 5.40 Gbits/sec
#DDR:
s3n2:~ # iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.2 port 32935 connected with 192.168.1.1 port 5001
[ 3] 0.0-10.0 sec 6.91 GBytes 5.93 Gbits/sec
Now the increase is only about 0.5 Gbit/s.
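Because a single IPoIB stream is often CPU-bound rather than link-bound, a
multi-stream run, together with a quick check that the DDR interface is really
in connected mode with the large MTU, might narrow this down; a sketch using
the ib1 name from above:
# confirm connected mode and the 65520 MTU on the DDR interface
cat /sys/class/net/ib1/mode
ip link show ib1 | grep mtu
# four parallel TCP streams instead of one
iperf -c cic-s3n1 -P 4 -t 30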
And finally a test with SDP:
#DDR:
s3n2:~ # LD_PRELOAD=libsdp.so SIMPLE_LIBSDP="ok" iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 3.91 MByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.2 port 58186 connected with 192.168.1.1 port 5001
[ 4] 0.0-10.0 sec 7.72 GBytes 6.63 Gbits/sec
#SDR:
s3n2:~ # LD_PRELOAD=libsdp.so SIMPLE_LIBSDP="ok" iperf -c cic-s3n1
------------------------------------------------------------
Client connecting to cic-s3n1, TCP port 5001
TCP window size: 3.91 MByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.2 port 58187 connected with 192.168.1.1 port 5001
[ 4] 0.0-10.0 sec 7.70 GBytes 6.61 Gbits/sec
With SDP there is no longer any difference between the two boards.
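The identical SDP numbers on both boards suggest something other than the link
(most likely a single CPU core) is saturating; watching per-core utilisation
during a run would confirm that. A sketch, assuming the sysstat package is
installed:
# per-CPU utilisation in 1-second samples while iperf runs
mpstat -P ALL 1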
Even when using multiple connections (3 client servers (s3s2, s3s3, s3s4),
all x3655 with kernel 2.6.22, connecting simultaneously to one node (s3s1)
over DDR):
s3s2:~ # iperf -c cic-s3s1 -p 5001 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.15 port 33576 connected with 192.168.1.14 port 5001
[ 3] 0.0-30.0 sec 5.94 GBytes 1.70 Gbits/sec
s3s3:~ # iperf -c cic-s3s1 -p 5002 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5002
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.16 port 53558 connected with 192.168.1.14 port 5002
[ 3] 0.0-30.0 sec 5.74 GBytes 1.64 Gbits/sec
s3s4:~ # iperf -c cic-s3s1 -p 5003 -t 30
------------------------------------------------------------
Client connecting to cic-s3s1, TCP port 5003
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.17 port 37169 connected with 192.168.1.14 port 5003
[ 3] 0.0-30.0 sec 5.79 GBytes 1.66 Gbits/sec
This gives a total of 1.70 + 1.64 + 1.66 Gbit/s = 5.0 Gbit/s.
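(The server side for this three-client run is not shown; it would have been
something like one iperf listener per port on s3s1:)
# on s3s1: one iperf server per port, run in the background
iperf -s -p 5001 &
iperf -s -p 5002 &
iperf -s -p 5003 &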
Is this normal behavior (SDP and IPoIB not benefiting from DDR)?
Regards,
Stijn