[ofa-general] Infiniband performance

Amir Vadai amirv at mellanox.co.il
Wed Dec 31 05:09:40 PST 2008


To be more specific - you need the CAP_IPC_LOCK capability (it enables
sdp to pin pages while they are zero-copied).

The superuser has it - and a regular user can be configured to have it.
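
For example, a rough sketch of two common ways to arrange this for a
regular user (the exact mechanism may vary with the distribution, the
iperf path is just an example, and which one applies depends on what the
SDP module actually checks): grant CAP_IPC_LOCK to the binary via file
capabilities, or raise the per-user memlock limit in
/etc/security/limits.conf:

	# give the benchmark binary CAP_IPC_LOCK (needs file-capability support)
	setcap cap_ipc_lock+ep /usr/local/bin/iperf

	# /etc/security/limits.conf - let the user lock (pin) enough memory
	ruffing    soft    memlock    unlimited
	ruffing    hard    memlock    unlimited

After logging in again, "ulimit -l" should report the new limit.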


- Amir


Amir Vadai wrote:

> Sorry for the late answer.
>
> If you run iperf as root, sdp can use zero copy, which should boost
> the performance.
>
>
> - Amir
>
>
> Jan Ruffing wrote:
>
>   
>> Hello,
>>
>> I'm new to Infiniband and still trying to get a grasp on what performance it can realistically deliver.
>>
>> The two directly connected test machines have Mellanox Infinihost III Lx DDR HCA cards installed and run OpenSuse 11 with a 2.6.25.16 Kernel.
>>
>> 1) Maximum Bandwidth?
>>
>> Infiniband (Double Data Rate, 4x lanes) is advertised with a bandwidth of 20 Gbit/s. If my understanding is correct, this is only the signal rate, which would translate to a 16 Gbit/s data rate due to 8b/10b encoding? The maximum speed I have measured so far was 12 Gbit/s with the low-level protocols:
>>
>> 	tamara /home/ruffing> ibv_rc_pingpong -m 2048 -s 1048576 -n 10000
>> 	local address:  LID 0x0001, QPN 0x3b0405, PSN 0x13a302
>> 	remote address: LID 0x0002, QPN 0x380405, PSN 0x9a46ba
>> 	20971520000 bytes in 13.63 seconds = 12313.27 Mbit/sec
>> 	10000 iters in 13.63 seconds = 1362.53 usec/iter
>> 	
>> 	melissa Dokumente/Infiniband> ibv_rc_pingpong 192.168.2.1 -m 2048 -s 1048576 -n 10000
>> 	local address:  LID 0x0002, QPN 0x380405, PSN 0x9a46ba
>> 	remote address: LID 0x0001, QPN 0x3b0405, PSN 0x13a302
>> 	20971520000 bytes in 13.63 seconds = 12313.38 Mbit/sec
>> 	10000 iters in 13.63 seconds = 1362.52 usec/iter
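>>
>> (Checking the arithmetic: 20 Gbit/s signalling rate x 8/10 = 16 Gbit/s data rate, and the figure ibv_rc_pingpong prints is simply payload over wall time, i.e. 20971520000 bytes * 8 / 13.63 s ~ 12.3 Gbit/s - roughly 77% of the 16 Gbit/s data rate.)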
>>
>> Maximal user-level bandwidth was 11.5 GBit/s using RDMA:
>>
>> 	ruffing at melissa:~/Dokumente/Infiniband/NetPIPE-3.7.1> ./NPibv -m 2048 -t rdma_write -c local_poll -h 192.168.2.1 -n 100
>> 	Using RDMA Write communications
>> 	Using local polling completion
>> 	Preposting asynchronous receives (required for Infiniband)
>> 	Now starting the main loop
>> 	[...]
>> 	121: 8388605 bytes    100 times -->  11851.72 Mbps in    5400.06 usec
>> 	122: 8388608 bytes    100 times -->  11851.66 Mbps in    5400.09 usec
>> 	123: 8388611 bytes    100 times -->  11850.62 Mbps in    5400.57 usec
>>
>> That's actually 4 Gbit/s short of what I was hoping for. Yet I couldn't find any test results on the net that yielded more than 12 GBit/s on 4x DDR HCAs. Where does this performance loss come from? At first glance, 4 GBit/s (25% of the data rate) seems like a lot for protocol overhead alone...
>> Is 12 GBit/s the current maximum bandwidth, or is it possible for Infiniband users to improve performance beyond that?
>>
>>
>>
>> 2) TCP (over IPoIB) vs. RDMA/SDP/uverbs?
>>
>> On the first Infiniband installation, using the packages of the OpenSuse 11 distribution, I got a TCP bandwidth of 10 GBit/s. (Which actually isn't that bad compared to the measured maximum bandwidth of 12 GBit/s.) This installation supported neither RDMA nor SDP, though.
>>
>> 	tamara iperf-2.0.4/src> ./iperf -c 192.168.2.2 -l 3M
>> 	------------------------------------------------------------
>> 	Client connecting to 192.168.2.2, TCP port 5001
>> 	TCP window size:   515 KByte (default)
>> 	------------------------------------------------------------
>> 	[  3] local 192.168.2.1 port 47730 connected with 192.168.2.2 port 5001
>> 	[ ID] Interval       Transfer     Bandwidth
>> 	[  3]  0.0-10.0 sec  11.6 GBytes  10.0 Gbits/sec
>>
>>
>>
>> After I installed the OFED 1.4 beta to be able to use SDP, RDMA and uverbs, I could use them to reach 12 GBit/s. Yet the TCP rate dropped by 2-3 GBit/s to 7-8 GBit/s.
>>
>> 	ruffing at tamara:~/Dokumente/Infiniband/iperf-2.0.4/src> ./iperf -c 192.168.2.2 -l 10M
>> 	------------------------------------------------------------
>> 	Client connecting to 192.168.2.2, TCP port 5001
>> 	TCP window size:   193 KByte (default)
>> 	------------------------------------------------------------
>> 	[  3] local 192.168.2.1 port 51988 connected with 192.168.2.2 port 5001
>> 	[ ID] Interval       Transfer     Bandwidth
>> 	[  3]  0.0-10.0 sec  8.16 GBytes  7.00 Gbits/sec
>>
>> What could have caused this loss of bandwidth? Is there a way to avoid it? Obviously, this could be a showstopper (for me) as far as native Infiniband protocols are concerned: gaining 2 GBit/s under special circumstances probably won't outweigh losing 3 GBit/s during normal use.
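>>
>> If I understand correctly, the usual knobs for TCP over IPoIB are whether the interface runs in connected or datagram mode and its MTU, so perhaps the OFED 1.4 install left those set differently than the distribution packages did. A sketch of what could be checked ("ib0" is just an assumed interface name):
>>
>> 	cat /sys/class/net/ib0/mode                # "connected" or "datagram"
>> 	echo connected > /sys/class/net/ib0/mode   # as root
>> 	ifconfig ib0 mtu 65520                     # connected mode allows a large MTU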
>>
>>
>>
>> 3) SDP performance
>>
>> The SDP performance (preloading libsdp.so) measured only 6.2 GBit/s, underperforming even TCP:
>>
>> 	ruffing at tamara:~/Dokumente/Infiniband/iperf-2.0.4/src> LD_PRELOAD=/usr/lib/libsdp.so LIBSDP_CONFIG_FILE=/etc/libsdp.conf ./iperf -c 192.168.2.2 -l 10M
>> 	------------------------------------------------------------
>> 	Client connecting to 192.168.2.2, TCP port 5001
>> 	TCP window size: 16.0 MByte (default)
>> 	------------------------------------------------------------
>> 	[  4] local 192.168.2.1 port 36832 connected with 192.168.2.2 port 5001
>> 	[ ID] Interval       Transfer     Bandwidth
>> 	[  4]  0.0-10.0 sec  7.22 GBytes  6.20 Gbits/sec
>>
>> /etc/libsdp.conf consists of the following two lines:
>> 	use both server * *:*
>> 	use both client * *:*
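>>
>> I suppose one way to rule out a silent fallback to TCP would be to check for SDP sockets while the test runs - assuming sdpnetstat from the installed packages supports this, and that ib_sdp is the kernel module that has to be loaded for SDP:
>>
>> 	lsmod | grep ib_sdp     # the SDP kernel module should be loaded
>> 	sdpnetstat -S           # should list the iperf connection as an SDP socket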
>>
>> I have a hard time believing that's the maximum rate of SDP. (Even if Cisco measured a similar 6.6 GBit/s: https://www.cisco.com/en/US/docs/server_nw_virtual/commercial_host_driver/host_driver_linux/user/guide/sdp.html#wp948100)
>>
>> Did I mess up my Infiniband installation, or is SDP really slower than TCP over IPoIB?
>>
>>
>>
>> Sorry if my mail sounds somewhat negative, but I'm still trying to get past the marketing buzz and figure out what to realistically expect of Infiniband. Currently, I'm still hoping that I messed up my installation somewhere, and that a few pointers in the right direction might resolve most of the issues... :)
>>
>> Thanks in advance,
>> Jan Ruffing
>>
>>
>>
>> Devices:
>>
>> 	tamara /dev/infiniband> ls -la
>> 	total 0
>> 	drwxr-xr-x  2 root root       140 2008-12-02 16:20 .
>> 	drwxr-xr-x 13 root root      4580 2008-12-09 14:59 ..
>> 	crw-rw----  1 root root  231,  64 2008-12-02 16:20 issm0
>> 	crw-rw-rw-  1 root users  10,  59 2008-11-27 10:24 rdma_cm
>> 	crw-rw----  1 root root  231,   0 2008-12-02 16:20 umad0
>> 	crw-rw-rw-  1 root users 231, 192 2008-11-27 10:15 uverbs0
>> 	crw-rw----  1 root users 231, 193 2008-11-27 10:15 uverbs1
>>
>>
>>
>> Installed Packages:
>>
>> 	Build ofa_kernel RPM
>> 	Install kernel-ib RPM:
>> 	Build ofed-scripts RPM
>> 	Install ofed-scripts RPM:
>> 	Install libibverbs RPM:
>> 	Install libibverbs-devel RPM:
>> 	Install libibverbs-devel-static RPM:
>> 	Install libibverbs-utils RPM:
>> 	Install libmthca RPM:
>> 	Install libmthca-devel-static RPM:
>> 	Install libmlx4 RPM:
>> 	Install libmlx4-devel RPM:
>> 	Install libcxgb3 RPM:
>> 	Install libcxgb3-devel RPM:
>> 	Install libnes RPM:
>> 	Install libnes-devel-static RPM:
>> 	Install libibcm RPM:
>> 	Install libibcm-devel RPM:
>> 	Install libibcommon RPM:
>> 	Install libibcommon-devel RPM:
>> 	Install libibcommon-static RPM:
>> 	Install libibumad RPM:
>> 	Install libibumad-devel RPM:
>> 	Install libibumad-static RPM:
>> 	Build libibmad RPM
>> 	Install libibmad RPM:
>> 	Install libibmad-devel RPM:
>> 	Install libibmad-static RPM:
>> 	Install ibsim RPM:
>> 	Install librdmacm RPM:
>> 	Install librdmacm-utils RPM:
>> 	Install librdmacm-devel RPM:
>> 	Install libsdp RPM:
>> 	Install libsdp-devel RPM:
>> 	Install opensm-libs RPM:
>> 	Install opensm RPM:
>> 	Install opensm-devel RPM:
>> 	Install opensm-static RPM:
>> 	Install compat-dapl RPM:
>> 	Install compat-dapl-devel RPM:
>> 	Install dapl RPM:
>> 	Install dapl-devel RPM:
>> 	Install dapl-devel-static RPM:
>> 	Install dapl-utils RPM:
>> 	Install perftest RPM:
>> 	Install mstflint RPM:
>> 	Install sdpnetstat RPM:
>> 	Install srptools RPM:
>> 	Install rds-tools RPM:
>> 	(installed ibutils manually)
>>
>>
>>
>> Loaded Modules:
>> (libsdp currently unloaded)
>>
>> 	Directory: /home/ruffing
>> 	tamara /home/ruffing> lsmod | grep ib
>> 	ib_addr                24580  1 rdma_cm
>> 	ib_ipoib               97576  0
>> 	ib_cm                  53584  2 rdma_cm,ib_ipoib
>> 	ib_sa                  55944  3 rdma_cm,ib_ipoib,ib_cm
>> 	ib_uverbs              56884  1 rdma_ucm
>> 	ib_umad                32016  4
>> 	mlx4_ib                79884  0
>> 	mlx4_core             114924  1 mlx4_ib
>> 	ib_mthca              148924  0
>> 	ib_mad                 53400  5 ib_cm,ib_sa,ib_umad,mlx4_ib,ib_mthca
>> 	ib_core                81152  12 rdma_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,iw_cxgb3,mlx4_ib,ib_mthca,ib_mad
>> 	ipv6                  281064  23 ib_ipoib
>> 	rtc_lib                19328  1 rtc_core
>> 	libata                176604  2 ata_piix,pata_it8213
>> 	scsi_mod              168436  4 sr_mod,sg,sd_mod,libata
>> 	dock                   27536  1 libata
>> 	
>> 	tamara /home/ruffing> lsmod | grep rdma
>> 	rdma_ucm               30248  0
>> 	rdma_cm                49544  1 rdma_ucm
>> 	iw_cm                  25988  1 rdma_cm
>> 	ib_addr                24580  1 rdma_cm
>> 	ib_cm                  53584  2 rdma_cm,ib_ipoib
>> 	ib_sa                  55944  3 rdma_cm,ib_ipoib,ib_cm
>> 	ib_uverbs              56884  1 rdma_ucm
>> 	ib_core                81152  12 rdma_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,iw_cxgb3,mlx4_ib,ib_mthca,ib_mad
>>
>>
>>
>>
>>
>>   
>>     
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>   



