[ewg] 200m cable results in slower rdma read performance? [ CC Anti-Virus checked ]

Rupert Dance rsdance at soft-forge.com
Mon Oct 17 07:25:40 PDT 2011


Koen,

 

Can you try running ibdiagnet -P all=1 -ls 10 -lw 4x

 

This will tell us if any links are not running at Link speed of 10 (QDR) and
Link Width of 4x.

 

You may also want to suggest an upgrade of OFED to 1.5.3.2 GA. There have
been major improvements in the stack since 1.4.2. Also please be sure that
you update the firmware in all hardware for the same reason.

 

Thanks

 

Rupert

 

From: koen_segers at computacenter.com [mailto:koen_segers at computacenter.com] 
Sent: Monday, October 17, 2011 9:22 AM
To: rsdance at soft-forge.com; ewg at lists.openfabrics.org
Cc: koen_segers at computacenter.com
Subject: RE: [ewg] 200m cable results in slower rdma read performance? [ CC
Anti-Virus checked ]

 

Rupert,

 

Thanks for replying. 

 

Below is the output of the ibdiagnet command.

I don't see any issues here. Just tell me if you need more info.

 

I forgot to mention that we are using the following switch version:

edgeprod1# version show

        version: 3.6.0

        date:    Jun 07 2011 11:19:33 AM

        build Id:857

 

And the default SLES 11 SP1 ofed build: ofed-1.4.2-0.9.6

 

Best regards,

 

15:00:28|root at gpfsprod1n1:~ 0 # ibdiagnet

Loading IBDIAGNET from: /usr/lib64/ibdiagnet1.2

-W- Topology file is not specified.

    Reports regarding cluster links will use direct routes.

Loading IBDM from: /usr/lib64/ibdm1.2

-W- A few ports of local device are up.

    Since port-num was not specified (-p option), port 1 of device 1 will be

    used as the local port.

-I- Discovering ... 39 nodes (6 Switches & 33 CA-s) discovered.

 

 

-I---------------------------------------------------

-I- Bad Guids/LIDs Info

-I---------------------------------------------------

-I- No bad Guids were found

 

-I---------------------------------------------------

-I- Links With Logical State = INIT

-I---------------------------------------------------

-I- No bad Links (with logical state = INIT) were found

 

-I---------------------------------------------------

-I- PM Counters Info

-I---------------------------------------------------

-I- No illegal PM counters values were found

 

-I---------------------------------------------------

-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)

-I---------------------------------------------------

-I-    PKey:0x7fff Hosts:65 full:65 partial:0

 

-I---------------------------------------------------

-I- IPoIB Subnets Check

-I---------------------------------------------------

-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00

-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

 

-I---------------------------------------------------

-I- Bad Links Info

-I- No bad link were found

-I---------------------------------------------------

----------------------------------------------------------------

-I- Stages Status Report:

    STAGE                                    Errors Warnings

    Bad GUIDs/LIDs Check                     0      0

    Link State Active Check                  0      0

    Performance Counters Report              0      0

    Partitions Check                         0      0

    IPoIB Subnets Check                      0      1

 

Please see /tmp/ibdiagnet.log for complete log

----------------------------------------------------------------

 

-I- Done. Run time was 5 seconds.

 

 

This type of info is given in ibdiagnet.lst:

 

{ SW Ports:24 SystemGUID:0008f10500108fa9 NodeGUID:0008f10500108fa8
PortGUID:0008f10500108fa8 VenID:000008F1 DevID:5A5A0000 Rev:000000A1
{Voltaire 4036 # edgeprod3} LID:0001 PN:05 } { CA Ports:02
SystemGUID:0002c903004ab175 NodeGUID:0002c903004ab172
PortGUID:0002c903004ab173 VenID:000002C9 DevID:673C0000 Rev:000000B0
{ HCA-1} LID:001A PN:01 } PHY=4x LOG=ACT SPD=10
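
The PHY, LOG and SPD fields at the end of each ibdiagnet.lst entry are what the link width and speed check comes down to. A minimal sketch of scanning the file for links that are not at 4x width and speed 10 could look like the following. It assumes each link entry sits on a single unwrapped line in the actual file; the function name and regular expressions are illustrative and not part of ibdiagnet.

```python
import re

# Trailing link attributes of an entry, e.g. "PHY=4x LOG=ACT SPD=10".
LINK_RE = re.compile(r"PHY=(\S+)\s+LOG=(\S+)\s+SPD=(\S+)")
# Node descriptions, e.g. "{Voltaire 4036 # edgeprod3} LID:0001".
DESC_RE = re.compile(r"\{([^{}]*)\}\s+LID:")


def find_degraded_links(lines, want_width="4x", want_speed="10"):
    """Return (node descriptions, width, speed) for each link that is
    not running at the expected width and speed."""
    degraded = []
    for line in lines:
        match = LINK_RE.search(line)
        if match is None:
            continue  # not a link entry
        width, _state, speed = match.groups()
        if width != want_width or speed != want_speed:
            names = [d.strip() for d in DESC_RE.findall(line)]
            degraded.append((names, width, speed))
    return degraded
```

Called as find_degraded_links(open("ibdiagnet.lst")), it returns an empty list when every link reports PHY=4x and SPD=10. The location of ibdiagnet.lst depends on the installation.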

 

 

 

 

Koen Segers

Enterprise Consultant

 

Computacenter

Services & Solutions

 

Ikaroslaan 31

B-1930 Zaventem

Belgium

 

Tel: +32 2 704 94 67

Fax: +32 2 704 95 95

Mob: +32 497 909353

 <mailto:koen_segers at computacenter.com> koen_segers at computacenter.com

 <http://www.computacenter.com/benelux> www.computacenter.com/benelux

 

From: Rupert Dance [mailto:rsdance at soft-forge.com]
Sent: 17 October 2011 13:46
To: <koen_segers at computacenter.com>; <ewg at lists.openfabrics.org>
Subject: RE: [ewg] 200m cable results in slower rdma read performance? [ CC
Anti-Virus checked ]

 

Hi,

 

Have you run ibdiagnet to verify that your link width and speed are what you
expect on all links?

 

Thanks

 

Rupert Dance

 

Software Forge

 

From: ewg-bounces at lists.openfabrics.org
[mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
koen_segers at computacenter.com
Sent: Monday, October 17, 2011 3:22 AM
To: ewg at lists.openfabrics.org
Subject: [ewg] 200m cable results in slower rdma read performance? [ CC
Anti-Virus checked ]

 

Hi, 

 

In my test setup I have 3 servers, 2 of which are in Datacenter1 and the
other in Datacenter2.

If I run an RDMA test between the datacenters, I get much lower performance
than when I run the same test between servers in the same datacenter.

 

 

DC1: gpfsprod1n1, gpfsprod1n3

DC2: gpfsprod1n2

 

08:54:48|root at gpfsprod1n1:~ 0 # qperf -t 5  cic-gpfsprod1n2 rc_rdma_write_bw

rc_rdma_write_bw:

    bw  =  1.9 GB/sec

08:54:59|root at gpfsprod1n1:~ 0 # qperf -t 5  cic-gpfsprod1n3 rc_rdma_write_bw

rc_rdma_write_bw:

    bw  =  3.39 GB/sec
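
To put a number on the gap, the two qperf results above can be compared directly. This is a small sketch that parses the "bw = ..." line from each output and prints the ratio. The helper name parse_bw is illustrative, and the comparison only makes sense when both results are reported in the same unit, which the code checks.

```python
import re

# Matches a qperf bandwidth line such as "    bw  =  1.9 GB/sec".
BW_RE = re.compile(r"bw\s*=\s*([\d.]+)\s*(\S+)/sec")


def parse_bw(output):
    """Return (value, unit) from a qperf bandwidth result."""
    match = BW_RE.search(output)
    if match is None:
        raise ValueError("no bw line found")
    value, unit = match.groups()
    return float(value), unit


remote_bw, remote_unit = parse_bw("rc_rdma_write_bw:\n    bw  =  1.9 GB/sec\n")
local_bw, local_unit = parse_bw("rc_rdma_write_bw:\n    bw  =  3.39 GB/sec\n")

# Only compare like with like.
assert remote_unit == local_unit

ratio = remote_bw / local_bw
print(f"cross-datacenter bandwidth is {ratio:.0%} of same-datacenter bandwidth")
```

With the figures above, the cross-datacenter result comes out at about 56% of the same-datacenter result.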

 

 

The setup contains two pairs of edge switches (one pair in each datacenter)
and two spine switches (one in each datacenter), configured as a non-blocking
fat tree.

So:

The servers are connected to the edge switches.

The spine switches are connected to all edge switches.

 

These are the cables we are using:

* Length 5m Vendor Name: WLGORE Code: QSFP+ Vendor PN: 498385-B24
  Vendor Rev: D Vendor SN xxxx

* Length 200m Vendor Name: MOLEX Code: QSFP+ Vendor PN: 106410-1200
  Vendor Rev: A Vendor SN xxxx
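
For a rough sense of scale, the propagation delay added by the 200m cable can be estimated with simple arithmetic. The sketch below rests on assumptions not stated in this thread: that the 200m cable is optical, with a signal speed of roughly 2e8 m/s, and that qperf's GB means 1e9 bytes. It only sizes the extra delay and the data in flight; it does not identify the cause of the slowdown.

```python
CABLE_LENGTH_M = 200.0         # from the cable list above
SIGNAL_SPEED_M_PER_S = 2.0e8   # assumption: about two thirds of c, typical for fibre
LOCAL_BW_BYTES_PER_S = 3.39e9  # same-datacenter qperf result, assuming GB = 1e9 bytes

# Time for a signal to traverse the cable once, and there and back.
one_way_delay_s = CABLE_LENGTH_M / SIGNAL_SPEED_M_PER_S
round_trip_s = 2 * one_way_delay_s

# Data that must be in flight on the cable to keep it full for one round trip.
bytes_in_flight = LOCAL_BW_BYTES_PER_S * round_trip_s

print(f"one-way delay: {one_way_delay_s * 1e6:.2f} us")
print(f"round trip:    {round_trip_s * 1e6:.2f} us")
print(f"bytes in flight per round trip: {bytes_in_flight:.0f}")
```

Under these assumptions the cable adds about 1 microsecond each way, so roughly 6.8 KB has to be in flight per round trip to sustain the same-datacenter rate across it.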

 

 

Can someone tell me why this is so? And maybe how I can solve this?

 

Best regards,

 

 

Koen Segers


 




Visit us at http://www.computacenter.com/
Computacenter: Transforming IT service delivery.

========================== Disclaimer ==================================
The information in this email is confidential, and is intended solely for
the addressee(s). If you are not the intended recipient of this email please
let us know by reply and then delete it from your system; you should not
copy this message or disclose its contents to anyone. Due to the integrity
risk of sending emails over the Internet, Computacenter will accept no
liability for any comments and / or attachments contained within this email.
========================== Disclaimer ==================================




