[ewg] Tools to diagnose rdma performance issue?

Elken, Tom tom.elken at intel.com
Wed Aug 14 15:20:29 PDT 2013


Hi Mei-Jen,

Do your applications need to use rdma/verbs?

If you are mainly interested in MPI, you can run MPI over PSM with your QLE7340 (True Scale) HCAs for better performance and stability.  PSM is a library that ships with OFED.
In that case, basic MPI benchmarks such as IMB or the OSU MPI Benchmarks will likely show much higher bandwidth than you are getting with the rdma tests you show below.

Also, what CPU model are you using?  The CPU has an influence on rdma/verbs performance with the qib driver and these HCAs, because verbs processing is done on the host CPUs.  I typically see much better performance than you show below on recent Intel CPUs.
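
To put the quoted numbers in context, here is a rough back-of-the-envelope sketch in Python. It assumes 8b/10b encoding on the 4X QDR link and that perftest's "MB" means 2**20 bytes (an inference from the MsgRate column, not something stated in the output). The function and constant names are illustrative only, and the ceiling ignores packet header overhead, so real achievable bandwidth is somewhat lower.

```python
# Back-of-the-envelope check of the ib_read_bw figures quoted in this thread.
# Assumptions (not stated in the thread): 8b/10b link encoding for QDR,
# and perftest reporting "MB" as 2**20 bytes.

MIB = 2**20  # bytes per "MB" as assumed for perftest output


def link_data_rate_mib_s(lanes, gbps_per_lane, encoding_efficiency):
    """Raw data-rate ceiling of the link in MiB/s, ignoring protocol headers."""
    data_bits_per_s = lanes * gbps_per_lane * 1e9 * encoding_efficiency
    return data_bits_per_s / 8 / MIB


def msg_rate_mpps(bw_mib_s, msg_bytes):
    """Message rate (millions of messages per second) implied by a bandwidth."""
    return bw_mib_s * MIB / msg_bytes / 1e6


# Values taken from the quoted output: 4X 10.0 Gbps link, 65536-byte messages,
# 1709.23 MB/sec average bandwidth, 0.027348 Mpps reported message rate.
ceiling = link_data_rate_mib_s(lanes=4, gbps_per_lane=10.0, encoding_efficiency=8 / 10)
measured_bw = 1709.23
implied_rate = msg_rate_mpps(measured_bw, 65536)
fraction = measured_bw / ceiling

print(f"link ceiling : {ceiling:.1f} MiB/s")
print(f"measured     : {measured_bw:.1f} MiB/s ({fraction:.0%} of ceiling)")
print(f"implied rate : {implied_rate:.6f} Mpps (reported: 0.027348)")
```

Under these assumptions the implied message rate agrees with the reported MsgRate, and the measured bandwidth comes out at a bit under half of the raw link ceiling.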

-Tom

> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
> bounces at lists.openfabrics.org] On Behalf Of Chen, Mei-Jen
> Sent: Wednesday, August 14, 2013 1:20 PM
> To: ewg at lists.openfabrics.org
> Subject: [ewg] Tools to diagnose rdma performance issue?
> 
> Hi,
> 
> I am using OFED-3.5 with perftest-2.0 to check rdma performance between two 4x
> QDR HCAs connected through a switch.
> The measured results are much slower than expected (please see the
> attached result below; I expected more than 3000 MB/s).  There is probably
> something very wrong, and I am struggling to find proper tools to
> diagnose or narrow down the problem.  Any suggestion is appreciated.
> 
> The selected packages from OFED-3.5 were recompiled and installed on top of
> Linux 3.4 (with the PREEMPT RT patch). The HCA driver comes from kernel.org.
> So far I have tried perftest, qperf, and ibv_rc_pingpong. They all give similar results.
> iblinkinfo shows that all links are up at the correct speed (4X 10.0 Gbps).
> 
> Thanks.
> 
> 
> # ib_read_bw 192.168.200.2
> ---------------------------------------------------------------------------------------
> Device not recognized to implement inline feature. Disabling it
> ---------------------------------------------------------------------------------------
>                     RDMA_Read BW Test
>  Dual-port       : OFF		Device : qib0
>  Number of qps   : 1
>  Connection type : RC
>  TX depth        : 128
>  CQ Moderation   : 100
>  Mtu             : 2048B
>  Link type       : IB
>  Outstand reads  : 16
>  rdma_cm QPs	 : OFF
>  Data ex. method : Ethernet
> ---------------------------------------------------------------------------------------
>  local address: LID 0x02 QPN 0x00a1 PSN 0x72c87b OUT 0x10 RKey 0x4f5000 VAddr 0x007f3190af2000
>  remote address: LID 0x01 QPN 0x0095 PSN 0xca533a OUT 0x10 RKey 0x494a00 VAddr 0x007fe0f1f45000
> ---------------------------------------------------------------------------------------
>  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]    MsgRate[Mpps]
>  65536      1000           1709.24            1709.23               0.027348
> ---------------------------------------------------------------------------------------
> 
> # ibstat
> CA 'qib0'
> 	CA type: InfiniPath_QLE7340
> 	Number of ports: 1
> 	Firmware version:
> 	Hardware version: 2
> 	Node GUID: 0x001175000070aafe
> 	System image GUID: 0x001175000070aafe
> 	Port 1:
> 		State: Active
> 		Physical state: LinkUp
> 		Rate: 40
> 		Base lid: 1
> 		LMC: 0
> 		SM lid: 1
> 		Capability mask: 0x0761086a
> 		Port GUID: 0x001175000070aafe
> 		Link layer: InfiniBand
> 