[ewg] 200m cable results in slower rdma read performance? [ CC Anti-Virus checked ]

koen_segers at computacenter.com koen_segers at computacenter.com
Mon Oct 17 06:22:02 PDT 2011


Rupert,

 

Thanks for replying. 

 

Below is the output of the ibdiagnet command.

I don't see any issues here. Just tell me if you need more info.

 

I forgot to mention that we are using the following switch version:

edgeprod1# version show

        version: 3.6.0

        date:    Jun 07 2011 11:19:33 AM

        build Id:857

 

And the default SLES 11 SP1 ofed build: ofed-1.4.2-0.9.6

 

Best regards,

 

15:00:28|root@gpfsprod1n1:~ 0 # ibdiagnet

Loading IBDIAGNET from: /usr/lib64/ibdiagnet1.2

-W- Topology file is not specified.

    Reports regarding cluster links will use direct routes.

Loading IBDM from: /usr/lib64/ibdm1.2

-W- A few ports of local device are up.

    Since port-num was not specified (-p option), port 1 of device 1 will be

    used as the local port.

-I- Discovering ... 39 nodes (6 Switches & 33 CA-s) discovered.

 

 

-I---------------------------------------------------

-I- Bad Guids/LIDs Info

-I---------------------------------------------------

-I- No bad Guids were found

 

-I---------------------------------------------------

-I- Links With Logical State = INIT

-I---------------------------------------------------

-I- No bad Links (with logical state = INIT) were found

 

-I---------------------------------------------------

-I- PM Counters Info

-I---------------------------------------------------

-I- No illegal PM counters values were found

 

-I---------------------------------------------------

-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)

-I---------------------------------------------------

-I-    PKey:0x7fff Hosts:65 full:65 partial:0

 

-I---------------------------------------------------

-I- IPoIB Subnets Check

-I---------------------------------------------------

-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00

-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

 

-I---------------------------------------------------

-I- Bad Links Info

-I- No bad link were found

-I---------------------------------------------------

----------------------------------------------------------------

-I- Stages Status Report:

    STAGE                                    Errors Warnings

    Bad GUIDs/LIDs Check                     0      0

    Link State Active Check                  0      0

    Performance Counters Report              0      0

    Partitions Check                         0      0

    IPoIB Subnets Check                      0      1

 

Please see /tmp/ibdiagnet.log for complete log

----------------------------------------------------------------

 

-I- Done. Run time was 5 seconds.

 

 

This type of info is given in ibdiagnet.lst:

 

{ SW Ports:24 SystemGUID:0008f10500108fa9 NodeGUID:0008f10500108fa8 PortGUID:0008f10500108fa8 VenID:000008F1 DevID:5A5A0000 Rev:000000A1 {Voltaire 4036 # edgeprod3} LID:0001 PN:05 } { CA Ports:02 SystemGUID:0002c903004ab175 NodeGUID:0002c903004ab172 PortGUID:0002c903004ab173 VenID:000002C9 DevID:673C0000 Rev:000000B0 { HCA-1} LID:001A PN:01 } PHY=4x LOG=ACT SPD=10
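
As a rough way to check every link rather than reading ibdiagnet.lst by eye, an entry like the one above can be parsed for its PHY, LOG and SPD fields. This is only a sketch: the function names and the expected values (4x width, ACT state, SPD=10, all taken from the sample entry) are assumptions for illustration and not part of ibdiagnet, and it assumes each entry sits on a single line.

```python
import re

# Link attributes appear as KEY=VALUE at the end of each entry.
LINK_ATTR = re.compile(r"\b(PHY|LOG|SPD)=(\S+)")
# Each endpoint is a "{ SW ... }" or "{ CA ... }" group ending in LID and PN.
ENDPOINT = re.compile(r"\{\s*(SW|CA)\s+Ports:\w+.*?LID:(\w+)\s+PN:(\w+)\s*\}")

# Hypothetical expectations, copied from the sample entry in this mail.
EXPECTED = {"PHY": "4x", "LOG": "ACT", "SPD": "10"}


def parse_lst_entry(entry):
    """Split one ibdiagnet.lst entry into its endpoints and link attributes."""
    ends = [
        {"type": kind, "lid": lid, "port": port}
        for kind, lid, port in ENDPOINT.findall(entry)
    ]
    return {"ends": ends, "link": dict(LINK_ATTR.findall(entry))}


def unexpected_fields(entry, expected=EXPECTED):
    """Return {field: (found, wanted)} for every link attribute that differs."""
    link = parse_lst_entry(entry)["link"]
    return {
        key: (link.get(key), want)
        for key, want in expected.items()
        if link.get(key) != want
    }


# The sample entry from this mail, joined onto one line.
SAMPLE = (
    "{ SW Ports:24 SystemGUID:0008f10500108fa9 NodeGUID:0008f10500108fa8 "
    "PortGUID:0008f10500108fa8 VenID:000008F1 DevID:5A5A0000 Rev:000000A1 "
    "{Voltaire 4036 # edgeprod3} LID:0001 PN:05 } { CA Ports:02 "
    "SystemGUID:0002c903004ab175 NodeGUID:0002c903004ab172 "
    "PortGUID:0002c903004ab173 VenID:000002C9 DevID:673C0000 Rev:000000B0 "
    "{ HCA-1} LID:001A PN:01 } PHY=4x LOG=ACT SPD=10"
)

if __name__ == "__main__":
    print(parse_lst_entry(SAMPLE))
    print(unexpected_fields(SAMPLE))
```

Running unexpected_fields over every line of the file would list only the links whose width, state or speed differs from what is expected.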

 

 

 

 

Koen Segers

Enterprise Consultant

 

Computacenter

Services & Solutions

 

Ikaroslaan 31

B-1930 Zaventem

Belgium

 

Tel: +32 2 704 94 67

Fax: +32 2 704 95 95

Mob: +32 497 909353

koen_segers at computacenter.com <mailto:koen_segers at computacenter.com> 

www.computacenter.com/benelux <http://www.computacenter.com/benelux> 

 

From: Rupert Dance <rsdance at soft-forge.com>
Sent: 17 October 2011 13:46
To: <koen_segers at computacenter.com>; <ewg at lists.openfabrics.org>
Subject: RE: [ewg] 200m cable results in slower rdma read performance? [ CC Anti-Virus checked ]

 

Hi,

 

Have you run ibdiagnet to verify that your link width and speed are what
you expect on all links?

 

Thanks

 

Rupert Dance

 

Software Forge

 

From: ewg-bounces at lists.openfabrics.org On Behalf Of koen_segers at computacenter.com
Sent: Monday, October 17, 2011 3:22 AM
To: ewg at lists.openfabrics.org
Subject: [ewg] 200m cable results in slower rdma read performance? [ CC Anti-Virus checked ]

 

Hi, 

 

In my test setup I have 3 servers, 2 of which reside in Datacenter1
and the third in Datacenter2.

If I run an RDMA test between datacenters, I get much lower performance
than when I run the same test between servers in the same datacenter.

 

 

DC1: gpfsprod1n1, gpfsprod1n3

DC2: gpfsprod1n2

 

08:54:48|root@gpfsprod1n1:~ 0 # qperf -t 5 cic-gpfsprod1n2 rc_rdma_write_bw

rc_rdma_write_bw:

    bw  =  1.9 GB/sec

08:54:59|root@gpfsprod1n1:~ 0 # qperf -t 5 cic-gpfsprod1n3 rc_rdma_write_bw

rc_rdma_write_bw:

    bw  =  3.39 GB/sec
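
To put the two qperf figures in proportion, here is a quick back-of-the-envelope calculation. It rests on several assumptions: that the links run at 4x QDR (4 lanes at 10 Gbit/s signalling), that QDR's 8b/10b encoding leaves 80% of the signalling rate for data, and that qperf's GB means 10^9 bytes. The variable and function names are only illustrative.

```python
# Bandwidths reported by qperf rc_rdma_write_bw, in GB/sec.
cross_dc = 1.9   # gpfsprod1n1 -> gpfsprod1n2 (other datacenter)
same_dc = 3.39   # gpfsprod1n1 -> gpfsprod1n3 (same datacenter)

# Assumed link: 4 lanes at 10 Gbit/s signalling, 8 data bits per 10 line bits.
lanes, lane_gbit = 4, 10
data_gbit = lanes * lane_gbit * 8 / 10   # usable Gbit/s after 8b/10b encoding
data_gbyte = data_gbit / 8               # usable GB/s


def percent(value, reference):
    """Express value as a percentage of reference."""
    return 100.0 * value / reference


if __name__ == "__main__":
    print(f"usable 4x QDR data rate: {data_gbyte:.2f} GB/s")
    print(f"same DC  vs usable rate: {percent(same_dc, data_gbyte):.1f}%")
    print(f"cross DC vs usable rate: {percent(cross_dc, data_gbyte):.1f}%")
    print(f"cross DC vs same DC:     {percent(cross_dc, same_dc):.1f}%")
```

Under those assumptions the same-datacenter run reaches roughly 85% of the usable data rate, while the cross-datacenter run reaches a bit under half of it.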

 

 

The setup contains two pairs of edge switches (one pair in each
datacenter) and two spine switches (one in each datacenter), configured
as a non-blocking fat tree.

So:

The servers are connected to the edge switches.

The spine switches are connected to all edge switches.

 

These are the cables we are using:

- Length 5m Vendor Name: WLGORE Code: QSFP+ Vendor PN: 498385-B24 Vendor Rev: D Vendor SN xxxx

- Length 200m Vendor Name: MOLEX Code: QSFP+ Vendor PN: 106410-1200 Vendor Rev: A Vendor SN xxxx

 

 

Can someone tell me why this is so? And maybe how I can solve this?

 

Best regards,

 

 

Koen Segers


 




Visit us at http://www.computacenter.com/
Computacenter: Transforming IT service delivery.

========================== Disclaimer ==================================
The information in this email is confidential, and is intended solely
for the addressee(s). If you are not the intended recipient of this
email please let us know by reply and then delete it from your system;
you should not copy this message or disclose its contents to anyone. Due
to the integrity risk of sending emails over the Internet, Computacenter
will accept no liability for any comments and / or attachments contained
within this email.
========================== Disclaimer ==================================


