[ewg] Tools to diagnose rdma performance issue?
Chen, Mei-Jen
MChen at tmriusa.com
Thu Aug 15 08:21:33 PDT 2013
Hi Tom,
Our applications have no requirement to use rdma/verbs; the only requirement is to achieve the best performance and stability. The reasons I picked rdma/verbs for evaluation are mainly that it seemed easier to set up, and I was able to integrate the required layers with the OS easily. If MPI over PSM works better with the QLE7340, then I will try it that way.
Is it typical for rdma/verbs to have much more overhead than MPI (because verbs processing is done on the CPUs)? Since my current measured throughput is quite low, I wonder if there is something else...
The CPU I am using is Sandy Bridge, which I think is relatively new.
Mei-Jen
-----Original Message-----
From: Elken, Tom [mailto:tom.elken at intel.com]
Sent: Wednesday, August 14, 2013 5:20 PM
To: Chen, Mei-Jen; ewg at lists.openfabrics.org
Subject: RE: Tools to diagnose rdma performance issue?
Hi Mei-Jen,
Do your applications need to use rdma/verbs?
If you are mainly interested in MPI, you can use MPI over PSM with your QLE7340 (True Scale) HCAs for better performance and stability. PSM is a library that comes with OFED.
If so, basic MPI benchmarks such as IMB or the OSU MPI Benchmarks will likely show much higher bandwidth than you are getting with the rdma tests you show below.
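As a rough sketch, with an Open MPI build that includes PSM support, a two-node osu_bw run would look something like this (the host names and the benchmark path are placeholders for your setup):

  mpirun -np 2 -host node1,node2 --mca mtl psm ./osu_bw   # node1/node2 and ./osu_bw are placeholders

MVAPICH2 and Intel MPI have their own ways of selecting PSM as the transport.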
Also, what CPU model are you using? That has an influence on rdma/verbs performance with the qib driver and these HCAs, which do verbs processing on the CPUs. On recent Intel CPUs I typically see much better performance than you show below.
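(A quick way to get the exact model string, assuming a standard Linux /proc, is something like:

  grep -m1 'model name' /proc/cpuinfo   # run on either node

or whatever your distribution's lscpu reports.)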
-Tom
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
> bounces at lists.openfabrics.org] On Behalf Of Chen, Mei-Jen
> Sent: Wednesday, August 14, 2013 1:20 PM
> To: ewg at lists.openfabrics.org
> Subject: [ewg] Tools to diagnose rdma performance issue?
>
> Hi,
>
> I am using OFED-3.5 with perftest-2.0 to check rdma performance between
> two 4x QDR HCAs connected through a switch.
> The measured results are much lower than expected (please see the
> attached results below; I expected more than 3000 MB/s). There is
> probably something very wrong, and I am struggling to find the proper
> tools to diagnose/narrow down the problem. Any suggestion is appreciated.
>
> The selected packages from OFED-3.5 were recompiled and installed on
> top of Linux 3.4 (with the PREEMPT_RT patch). The HCA driver comes from kernel.org.
> So far I have tried perftest, qperf, and ibv_rc_pingpong; they all show similar results.
> iblinkinfo shows all links are up at the correct speed (4X 10.0 Gbps).
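> For reference, the bandwidth checks were roughly of this form (exact options may have varied):
>
>   ib_read_bw 192.168.200.2             # perftest, output below
>   qperf 192.168.200.2 rc_rdma_read_bw  # illustrative invocation
>   ibv_rc_pingpong 192.168.200.2        # illustrative invocation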
>
> Thanks.
>
>
> # ib_read_bw 192.168.200.2
> ---------------------------------------------------------------------------------------
>  Device not recognized to implement inline feature. Disabling it
> ---------------------------------------------------------------------------------------
>                     RDMA_Read BW Test
>  Dual-port       : OFF          Device         : qib0
>  Number of qps   : 1
>  Connection type : RC
>  TX depth        : 128
>  CQ Moderation   : 100
>  Mtu             : 2048B
>  Link type       : IB
>  Outstand reads  : 16
>  rdma_cm QPs     : OFF
>  Data ex. method : Ethernet
> ---------------------------------------------------------------------------------------
>  local address:  LID 0x02 QPN 0x00a1 PSN 0x72c87b OUT 0x10 RKey 0x4f5000 VAddr 0x007f3190af2000
>  remote address: LID 0x01 QPN 0x0095 PSN 0xca533a OUT 0x10 RKey 0x494a00 VAddr 0x007fe0f1f45000
> ---------------------------------------------------------------------------------------
>  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]    MsgRate[Mpps]
>  65536      1000           1709.24            1709.23               0.027348
> ---------------------------------------------------------------------------------------
>
> # ibstat
> CA 'qib0'
>         CA type: InfiniPath_QLE7340
>         Number of ports: 1
>         Firmware version:
>         Hardware version: 2
>         Node GUID: 0x001175000070aafe
>         System image GUID: 0x001175000070aafe
>         Port 1:
>                 State: Active
>                 Physical state: LinkUp
>                 Rate: 40
>                 Base lid: 1
>                 LMC: 0
>                 SM lid: 1
>                 Capability mask: 0x0761086a
>                 Port GUID: 0x001175000070aafe
>                 Link layer: InfiniBand
>