[ofa-general] Infiniband performance

Gilad Shainer Shainer at Mellanox.com
Thu Dec 11 07:17:46 PST 2008


On the maximum BW you are correct - IB is capable of a 16 Gb/s data rate. You are seeing 12 Gb/s due to the bandwidth limitation of the host chipset.

Gilad.
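[Editor's note: the chipset explanation is consistent with simple PCIe arithmetic. A back-of-the-envelope sketch; the x8 PCIe 1.x slot and the ~22% packetization overhead figure are illustrative assumptions, not measurements:]

```python
# Rough PCIe bandwidth estimate for an InfiniHost III Lx class HCA.
# Assumption: the card sits in a PCIe 1.x x8 slot (2.5 GT/s per lane).
lanes = 8
signal_rate_per_lane = 2.5e9                            # transfers/s, PCIe 1.x
raw_data_rate = signal_rate_per_lane * 8 / 10 * lanes   # 8b/10b -> 16 Gbit/s
# TLP headers, flow control and completions typically cost another
# ~15-25% for DMA-sized transfers; 22% here is an illustrative figure.
effective = raw_data_rate * 0.78
print(f"raw x8 data rate:  {raw_data_rate/1e9:.1f} Gbit/s")
print(f"after overhead:   ~{effective/1e9:.1f} Gbit/s")
```

which lands right around the observed 12 Gbit/s.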
  

-----Original Message-----
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jan Ruffing
Sent: Thursday, December 11, 2008 1:56 AM
To: general at lists.openfabrics.org
Subject: [ofa-general] Infiniband performance

Hello,

I'm new to Infiniband and still trying to get a grasp on what performance it can realistically deliver.

The two directly connected test machines have Mellanox Infinihost III Lx DDR HCA cards installed and run OpenSuse 11 with a 2.6.25.16 Kernel.

1) Maximum Bandwidth?

Infiniband (Double Data Rate, 4x lanes) is advertised with a bandwidth of 20 Gbit/s. If my understanding is correct, that is only the signal rate, which translates to a 16 Gbit/s data rate due to 8b/10b encoding? The maximum speed I have measured so far was 12 Gbit/s using the low-level protocols:

	tamara /home/ruffing> ibv_rc_pingpong -m 2048 -s 1048576 -n 10000
	local address:  LID 0x0001, QPN 0x3b0405, PSN 0x13a302
	remote address: LID 0x0002, QPN 0x380405, PSN 0x9a46ba
	20971520000 bytes in 13.63 seconds = 12313.27 Mbit/sec
	10000 iters in 13.63 seconds = 1362.53 usec/iter
	
	melissa Dokumente/Infiniband> ibv_rc_pingpong 192.168.2.1 -m 2048 -s 1048576 -n 10000
	local address:  LID 0x0002, QPN 0x380405, PSN 0x9a46ba
	remote address: LID 0x0001, QPN 0x3b0405, PSN 0x13a302
	20971520000 bytes in 13.63 seconds = 12313.38 Mbit/sec
	10000 iters in 13.63 seconds = 1362.52 usec/iter
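As a sanity check, the reported rate can be recomputed from the printed totals (a small sketch; the tool itself divides by the exact microsecond timing, so the last digits differ slightly):

```python
# Recompute the ibv_rc_pingpong bandwidth from the printed totals.
total_bytes = 20_971_520_000   # 10000 iterations x 1 MiB sent each way
seconds = 13.63                # elapsed time as rounded in the output
mbit_per_s = total_bytes * 8 / seconds / 1e6
print(f"{mbit_per_s:.2f} Mbit/sec")   # ~12309, close to the reported 12313.27
```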

The maximum user-level bandwidth was 11.5 Gbit/s, using RDMA:

	ruffing at melissa:~/Dokumente/Infiniband/NetPIPE-3.7.1> ./NPibv -m 2048 -t rdma_write -c local_poll -h 192.168.2.1 -n 100
	Using RDMA Write communications
	Using local polling completion
	Preposting asynchronous receives (required for Infiniband)
	Now starting the main loop
	[...]
	121: 8388605 bytes    100 times -->  11851.72 Mbps in    5400.06 usec
	122: 8388608 bytes    100 times -->  11851.66 Mbps in    5400.09 usec
	123: 8388611 bytes    100 times -->  11850.62 Mbps in    5400.57 usec

That's actually 4 Gbit/s short of what I was hoping for. Then again, I couldn't find any test results on the net that exceeded 12 Gbit/s on 4x DDR HCAs. Where does this performance loss come from? At first glance, 4 Gbit/s (25% of the data rate) seems like a lot for protocol overhead alone...
Is 12 Gbit/s the current maximum bandwidth, or is it possible for Infiniband users to improve performance beyond that?
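To put the numbers in proportion (a small sketch; the 12.31 Gbit/s figure is just the best pingpong result quoted above):

```python
signal_rate_gbps = 20.0                   # advertised 4x DDR signal rate
data_rate_gbps = signal_rate_gbps * 0.8   # 8b/10b encoding -> 16 Gbit/s
measured_gbps = 12.31                     # best ibv_rc_pingpong result above
print(f"wire efficiency: {measured_gbps / data_rate_gbps:.0%}")   # ~77%
```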



2) TCP (over IPoIB) vs. RDMA/SDP/uverbs?

On the first Infiniband installation, using the packages of the OpenSuse 11 distribution, I got a TCP bandwidth of 10 Gbit/s. (Which actually isn't that bad compared to a measured maximum bandwidth of 12 Gbit/s.) That installation supported neither RDMA nor SDP, though.

	tamara iperf-2.0.4/src> ./iperf -c 192.168.2.2 -l 3M
	------------------------------------------------------------
	Client connecting to 192.168.2.2, TCP port 5001
	TCP window size:   515 KByte (default)
	------------------------------------------------------------
	[  3] local 192.168.2.1 port 47730 connected with 192.168.2.2 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  3]  0.0-10.0 sec  11.6 GBytes  10.0 Gbits/sec



After I installed the OFED 1.4 beta to be able to use SDP, RDMA and uverbs, I could use them to reach 12 Gbit/s. Yet the TCP rate dropped by 2-3 Gbit/s, down to 7-8 Gbit/s.

	ruffing at tamara:~/Dokumente/Infiniband/iperf-2.0.4/src> ./iperf -c 192.168.2.2 -l 10M
	------------------------------------------------------------
	Client connecting to 192.168.2.2, TCP port 5001
	TCP window size:   193 KByte (default)
	------------------------------------------------------------
	[  3] local 192.168.2.1 port 51988 connected with 192.168.2.2 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  3]  0.0-10.0 sec  8.16 GBytes  7.00 Gbits/sec

What could have caused this loss of bandwidth? Is there a way to avoid it? Obviously, this could be a show stopper (for me) as far as native Infiniband protocols are concerned: gaining 2 Gbit/s under special circumstances probably won't outweigh losing 3 Gbit/s during normal use.



3) SDP performance

The SDP performance (with libsdp.so preloaded) measured only 6.2 Gbit/s, underperforming even TCP:

	ruffing at tamara:~/Dokumente/Infiniband/iperf-2.0.4/src> LD_PRELOAD=/usr/lib/libsdp.so LIBSDP_CONFIG_FILE=/etc/libsdp.conf ./iperf -c 192.168.2.2 -l 10M
	------------------------------------------------------------
	Client connecting to 192.168.2.2, TCP port 5001
	TCP window size: 16.0 MByte (default)
	------------------------------------------------------------
	[  4] local 192.168.2.1 port 36832 connected with 192.168.2.2 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  4]  0.0-10.0 sec  7.22 GBytes  6.20 Gbits/sec

/etc/libsdp.conf consists of the following two lines:
	use both server * *:*
	use both client * *:*

I have a hard time believing that's the maximum rate of SDP. (Even though Cisco measured a similar 6.6 Gbit/s: https://www.cisco.com/en/US/docs/server_nw_virtual/commercial_host_driver/host_driver_linux/user/guide/sdp.html#wp948100)

Did I mess up my Infiniband installation, or is SDP really slower than TCP over IPoIB?



Sorry if this mail sounds somewhat negative, but I'm still trying to get past the marketing buzz and figure out what to realistically expect from Infiniband. Currently, I'm still hoping that I messed up my installation somewhere and that a few pointers in the right direction will resolve most of the issues... :)

Thanks in advance,
Jan Ruffing



Devices:

	tamara /dev/infiniband> ls -la
	total 0
	drwxr-xr-x  2 root root       140 2008-12-02 16:20 .
	drwxr-xr-x 13 root root      4580 2008-12-09 14:59 ..
	crw-rw----  1 root root  231,  64 2008-12-02 16:20 issm0
	crw-rw-rw-  1 root users  10,  59 2008-11-27 10:24 rdma_cm
	crw-rw----  1 root root  231,   0 2008-12-02 16:20 umad0
	crw-rw-rw-  1 root users 231, 192 2008-11-27 10:15 uverbs0
	crw-rw----  1 root users 231, 193 2008-11-27 10:15 uverbs1



Installed Packages:

	Build ofa_kernel RPM
	Install kernel-ib RPM:
	Build ofed-scripts RPM
	Install ofed-scripts RPM:
	Install libibverbs RPM:
	Install libibverbs-devel RPM:
	Install libibverbs-devel-static RPM:
	Install libibverbs-utils RPM:
	Install libmthca RPM:
	Install libmthca-devel-static RPM:
	Install libmlx4 RPM:
	Install libmlx4-devel RPM:
	Install libcxgb3 RPM:
	Install libcxgb3-devel RPM:
	Install libnes RPM:
	Install libnes-devel-static RPM:
	Install libibcm RPM:
	Install libibcm-devel RPM:
	Install libibcommon RPM:
	Install libibcommon-devel RPM:
	Install libibcommon-static RPM:
	Install libibumad RPM:
	Install libibumad-devel RPM:
	Install libibumad-static RPM:
	Build libibmad RPM
	Install libibmad RPM:
	Install libibmad-devel RPM:
	Install libibmad-static RPM:
	Install ibsim RPM:
	Install librdmacm RPM:
	Install librdmacm-utils RPM:
	Install librdmacm-devel RPM:
	Install libsdp RPM:
	Install libsdp-devel RPM:
	Install opensm-libs RPM:
	Install opensm RPM:
	Install opensm-devel RPM:
	Install opensm-static RPM:
	Install compat-dapl RPM:
	Install compat-dapl-devel RPM:
	Install dapl RPM:
	Install dapl-devel RPM:
	Install dapl-devel-static RPM:
	Install dapl-utils RPM:
	Install perftest RPM:
	Install mstflint RPM:
	Install sdpnetstat RPM:
	Install srptools RPM:
	Install rds-tools RPM:
	(installed ibutils manually)



Loaded Modules:
(libsdp currently unloaded)

	Directory: /home/ruffing
	tamara /home/ruffing> lsmod | grep ib
	ib_addr                24580  1 rdma_cm
	ib_ipoib               97576  0
	ib_cm                  53584  2 rdma_cm,ib_ipoib
	ib_sa                  55944  3 rdma_cm,ib_ipoib,ib_cm
	ib_uverbs              56884  1 rdma_ucm
	ib_umad                32016  4
	mlx4_ib                79884  0
	mlx4_core             114924  1 mlx4_ib
	ib_mthca              148924  0
	ib_mad                 53400  5 ib_cm,ib_sa,ib_umad,mlx4_ib,ib_mthca
	ib_core                81152  12 rdma_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,iw_cxgb3,mlx4_ib,ib_mthca,ib_mad
	ipv6                  281064  23 ib_ipoib
	rtc_lib                19328  1 rtc_core
	libata                176604  2 ata_piix,pata_it8213
	scsi_mod              168436  4 sr_mod,sg,sd_mod,libata
	dock                   27536  1 libata
	
	tamara /home/ruffing> lsmod | grep rdma
	rdma_ucm               30248  0
	rdma_cm                49544  1 rdma_ucm
	iw_cm                  25988  1 rdma_cm
	ib_addr                24580  1 rdma_cm
	ib_cm                  53584  2 rdma_cm,ib_ipoib
	ib_sa                  55944  3 rdma_cm,ib_ipoib,ib_cm
	ib_uverbs              56884  1 rdma_ucm
	ib_core                81152  12 rdma_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,iw_cxgb3,mlx4_ib,ib_mthca,ib_mad





--
Jan Ruffing
Software Developer

Motama GmbH
Lortzingstraße 10 · 66111 Saarbrücken · Germany
tel +49 681 940 85 50 · fax +49 681 940 85 49
ruffing at motama.com · www.motama.com

Companies register: district council Saarbrücken, HRB 15249
CEOs: Dr.-Ing. Marco Lohse, Michael Repplinger




