[ofa-general] RE: [ewg] Not seeing any SDP performance changes

Jim Mott jim at mellanox.com
Mon Feb 4 10:34:41 PST 2008


Hi,
  I am back in the office and have installed a fresh Rhat4U4 system on a
test machine that was running Rhat5.  The only non-default options I
used were:
  - No firewall
  - Disable SELinux

Then I built netperf 2.4.3 on the new system (./configure; make; make
install).

Then I downloaded and installed today's OFED 1.3 release.

At this point I have two identical hardware platforms running the
Rhat4U4 (2.6.9-42.ELsmp) kernel right off the install media.  They are
both running netperf 2.4.3 and today's OFED stack.  Both are using
ConnectX cards with 2.3 firmware.

Running as root on both sides (netserver and netperf), my little shell
script pulled out the following bandwidth numbers (netperf throughput,
10^6 bits/s); the script itself is sketched after the tables below:


              64K    128K      1M
  SDP      8215.17  6429.09  6862.66
  BZCOPY   8748.00  9997.07  9847.76

Looking at the service demand (us/KB transferred), we see:

               64K         128K          1M
            LCL   RMT    LCL   RMT    LCL   RMT
  SDP      1.025 1.243  1.391 1.493  1.274 1.407
  BZCOPY   0.966 1.148  0.838 1.014  0.603 0.984
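
For reference, the script is essentially just a loop like the following
sketch (a reconstruction, not the exact script; the host address and
sizes are taken from the manual runs later in this mail):

  #!/bin/sh
  # Reconstruction of the test wrapper (not the exact script): for each
  # message size, run netperf over SDP with bzcopy disabled (0) and then
  # enabled (1), recording the throughput and service-demand columns.
  export LD_PRELOAD=libsdp.so
  HOST=193.168.10.143
  for thresh in 0 1; do
    echo $thresh > /sys/module/ib_sdp/sdp_zcopy_thresh
    for size in 64K 128K 1M; do
      netperf -C -c -P 0 -t TCP_STREAM -H $HOST -l 60 -- -m $size
    done
  done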

The output of "lspci -vv" for the HCA is:
0a:00.0 InfiniBand: Mellanox Technologies: Unknown device 634a (rev a0)
        Subsystem: Mellanox Technologies: Unknown device 634a
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size 10
        Interrupt: pin A routed to IRQ 169
        Region 0: Memory at b9300000 (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at b8000000 (64-bit, prefetchable) [size=8M]
        Region 4: Memory at b9400000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Mask- TabSize=256
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00001000
        Capabilities: [60] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag+
                Device: Latency L0s <64ns, L1 unlimited
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x8, ASPM L0s, Port 8
                Link: Latency L0s unlimited, L1 unlimited
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x8

So I know it is an x8 PCIe (gen1) slot, and I am running with MaxReadReq=512.


To verify that it is not something strange in my script, I executed the
bandwidth commands by hand:

  # echo $LD_PRELOAD
  libsdp.so
  # echo 0 > /sys/module/ib_sdp/sdp_zcopy_thresh
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 64K
  87380 16384 16384 60.00 7106.72 13.32 14.87 1.228 1.370
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 128K
  87380 16384 16384 60.00 6906.18 14.02 15.18 1.330 1.441
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1M
  87380 16384 16384 60.00 7030.98 13.97 15.13 1.303 1.410
  # echo 1 > /sys/module/ib_sdp/sdp_zcopy_thresh
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 64K
  87380 16384 16384 60.00 6491.93 13.83 14.90 1.396 1.504
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 128K
  87380 16384 16384 60.00 6536.61 14.19 14.80 1.423 1.484
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1M
  87380 16384 16384 60.00 6623.94 13.68 14.82 1.353 1.466

Now these numbers look like what you report.  The problem here is that
we are handing SDP data in 16K chunks (netperf's "Send Message Size
bytes" is 16384), and the overhead of pinning 16K, sending it, and
unpinning it is too high to show any benefit.
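
You can see this in the output itself.  The third column is the send
message size; -r leaves it at the default, while -m (used in the runs
below) actually raises it:

  ... -- -r 64K  =>  87380 16384 16384 ...   (send size stays at 16K)
  ... -- -m 64K  =>  87380 16384 65536 ...   (send size becomes 64K)

(With TCP_STREAM, -r only sets the request/response sizes used by the
_RR tests, so it does not change the send size.)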

Rerunning the whole test with -m instead of -r gives me the numbers
that I keep reporting:
  # echo $LD_PRELOAD
  libsdp.so
  # echo 0 > /sys/module/ib_sdp/sdp_zcopy_thresh
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 64K
  87380  16384  65536  60.00  8323.20 12.96 13.64 1.020 1.074
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 128K
  87380  16384 131072  60.00  6661.77 13.74 13.41 1.352 1.320
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1M
  87380  16384 1048576 60.00  6691.83 13.39 13.58 1.312 1.330
  # echo 1 > /sys/module/ib_sdp/sdp_zcopy_thresh
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 64K
  87380  16384  65536  60.00  9052.22 12.88 14.18 0.932 1.027
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 128K
  87380  16384 131072  60.00 10294.87 12.70 13.44 0.808 0.855
  # netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1M
  87380  16384 1048576 60.00 10254.89  7.74 13.07 0.495 0.835

Maybe this is the problem?  My tests are giving sdp_sendmsg() enough
data to sink its teeth into.  When you send one big buffer instead of
lots of little ones, you can see the benefit.

Could you guys try "-m size" instead of "-r size" and see if that
works better?
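
For example, with bzcopy enabled as above (replace <server> with your
netserver's address):

  # export LD_PRELOAD=libsdp.so
  # echo 1 > /sys/module/ib_sdp/sdp_zcopy_thresh
  # netperf -C -c -P 0 -t TCP_STREAM -H <server> -l 60 -- -m 128K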

Thanks,
Jim

Jim Mott
Mellanox Technologies Ltd.
mail: jim at mellanox.com
Phone: 512-294-5481


-----Original Message-----
From: Jim Mott 
Sent: Friday, January 25, 2008 4:07 PM
To: 'Scott Weitzenkamp (sweitzen)'; Weikuan Yu
Cc: general at lists.openfabrics.org
Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

Not today, but I will give it a shot next time I get a free machine.  I
have tested between Rhat4u4 MLX4 and Rhat4u4 mthca and seen the same
trend though.

Thanks,
Jim

Jim Mott
Mellanox Technologies Ltd.
mail: jim at mellanox.com
Phone: 512-294-5481


-----Original Message-----
From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
Sent: Friday, January 25, 2008 4:03 PM
To: Jim Mott; Weikuan Yu
Cc: general at lists.openfabrics.org
Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

Is there any way you can make sender and receiver the same RHEL kernel?

> -----Original Message-----
> From: Jim Mott [mailto:jim at mellanox.com] 
> Sent: Friday, January 25, 2008 1:58 PM
> To: Scott Weitzenkamp (sweitzen); Weikuan Yu
> Cc: general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP 
> performance changes in OFED 1.3 beta, and I get Oops when 
> enabling sdp_zcopy_thresh
> 
> Receive side:
>   - 2.6.23.8 kernel.org kernel on Rhat5 distro
>   - HCA is MLX4 with 2.3.914
>     I get the same number on released 2.3 firmware
> 
> Send side:
>   - 2.6.9-42.ELsmp x86_64 (Rhat4u4)
>   - HCA is MLX4 with 2.3.914
> 
> I get the same trends (SDP < BZCOPY if message_size > 64K) on 
> unmodified
> Rhat5, Rhat4u4, and SLES10-SP1-RT distros.  I also see it on 
> kernel.org
> kernels 2.6.23.12, 2.6.24-rc2, 2.6.23, and 2.6.22.9.  I am in 
> the midst
> of testing some things, so I do not have all the machines available
> right now to repeat most of the tests though.
> 
> 
> Thanks,
> Jim
> 
> Jim Mott
> Mellanox Technologies Ltd.
> mail: jim at mellanox.com
> Phone: 512-294-5481
> 
> 
> -----Original Message-----
> From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
> Sent: Friday, January 25, 2008 3:39 PM
> To: Jim Mott; Weikuan Yu
> Cc: general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
> changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
> 
> Jim, what kernel and HCA are these numbers for?
> 
> Scott
> 
>  
> 
> > -----Original Message-----
> > From: Jim Mott [mailto:jim at mellanox.com] 
> > Sent: Friday, January 25, 2008 11:09 AM
> > To: Scott Weitzenkamp (sweitzen); Weikuan Yu
> > Cc: general at lists.openfabrics.org
> > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP 
> > performance changes in OFED 1.3 beta, and I get Oops when 
> > enabling sdp_zcopy_thresh
> > 
> > Right you are (as usual).
> > 
> > Hunting around these systems shows that I have been using 
> > netperf-2.4.3
> > for testing.  No configuration options; just ./configure; make; make
> > install.
> > 
> > To try and understand version differences, I installed 2.4.1 (your
> > version?), 2.4.3, and 2.4.4.  Built them with default 
> options and ran
> > the tests using each.
> > 
> > Using netperf-2.4.1, I reran "netperf -v2 -4 -H 193.168.10.143 -l 30
> > -t TCP_STREAM -c -C -- -m size" with the target an AMD system and the
> > driver an 8-processor Intel system:
> > 
> >             64K    128K      1M
> > SDP      7749.66  6925.68  6281.17
> > BZCOPY   8492.85  9867.06 11105.50
> > 
> > I tried running these tests a few times and saw a lot of 
> > variance in the
> > reported results.  Reloading 2.4.3 and running the same tests:
> > 
> >             64K    128K      1M
> > SDP      7553.77  6747.58  5986.42  
> > BZCOPY   8839.46  9572.49 10654.52
> > 
> > and finally, I tried 2.4.4 and ran the same tests:
> > 
> >             64K    128K      1M
> > SDP      7935.97  6325.69  7682.65
> > BZCOPY   8905.94  9935.45 10615.03
> > 
> > At this point, I am confused.  The difference between SDP with and
> > without Bzcopy is obvious in all three sets of numbers.  I cannot
> > explain why you see something different.
> > 
> > If you could try a vanilla netperf build, it would be 
> > interesting to see
> > if you get any different results.
> > 
> > Thanks,
> > Jim
> > 
> > Jim Mott
> > Mellanox Technologies Ltd.
> > mail: jim at mellanox.com
> > Phone: 512-294-5481
> > 
> > 
> > -----Original Message-----
> > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
> > Sent: Friday, January 25, 2008 10:36 AM
> > To: Jim Mott; Jim Mott; Weikuan Yu
> > Cc: general at lists.openfabrics.org
> > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
> > changes in OFED 1.3 beta, and I get Oops when enabling 
> sdp_zcopy_thresh
> > 
> > > So I see your results (sort of).  I have been using the 
> > > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or 
> > > is built with
> > > default options.  Maybe that is the difference.
> > 
> > Jim, AFAIK Red Hat does not ship netperf with RHEL.
> > 
> > Scott
> > 
> 


