[ofa-general] RE: [ewg] Not seeing any SDP performance changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

Scott Weitzenkamp (sweitzen) sweitzen at cisco.com
Thu Jan 24 13:58:20 PST 2008


Jim,

Like I've said before, I don't see any change in throughput with SDP
zcopy, plus the throughput bounces around.  When you run netperf with
-D, do you see variations in throughput?

Here's an example from a dual-socket, quad-core Xeon 5355 RHEL5 x86_64
system with ConnectX 2.3.0 firmware; the interim throughput bounces
around between 4 and 7 Gbps.

[releng at svbu-qaclus-98 ~]$ cat /sys/module/ib_sdp/parameters/sdp_zcopy_thresh
16384
[releng at svbu-qaclus-98 ~]$ LD_PRELOAD=libsdp.so netperf241 -C -c -P 0 -t TCP_STREAM -H 192.168.1.127 -D -l 60 -- -m 1000000
Interim result: 6676.50 10^6bits/s over 1.00 seconds
Interim result: 6674.47 10^6bits/s over 1.00 seconds
Interim result: 6687.89 10^6bits/s over 1.00 seconds
Interim result: 7075.40 10^6bits/s over 1.00 seconds
Interim result: 7065.08 10^6bits/s over 1.00 seconds
Interim result: 7074.69 10^6bits/s over 1.00 seconds
Interim result: 6667.10 10^6bits/s over 1.06 seconds
Interim result: 4492.29 10^6bits/s over 1.48 seconds
Interim result: 4503.65 10^6bits/s over 1.00 seconds
Interim result: 4481.25 10^6bits/s over 1.01 seconds
Interim result: 4495.91 10^6bits/s over 1.00 seconds
Interim result: 4521.51 10^6bits/s over 1.00 seconds
Interim result: 4466.58 10^6bits/s over 1.01 seconds
Interim result: 4482.09 10^6bits/s over 1.00 seconds
Interim result: 4480.21 10^6bits/s over 1.00 seconds
Interim result: 4490.07 10^6bits/s over 1.00 seconds
Interim result: 4479.47 10^6bits/s over 1.00 seconds
Interim result: 4480.30 10^6bits/s over 1.00 seconds
Interim result: 4489.14 10^6bits/s over 1.00 seconds
Interim result: 4484.38 10^6bits/s over 1.00 seconds
Interim result: 4473.64 10^6bits/s over 1.00 seconds
Interim result: 4479.71 10^6bits/s over 1.00 seconds
Interim result: 4486.54 10^6bits/s over 1.00 seconds
Interim result: 4456.65 10^6bits/s over 1.01 seconds
Interim result: 4483.70 10^6bits/s over 1.00 seconds
Interim result: 4486.41 10^6bits/s over 1.00 seconds
Interim result: 4489.58 10^6bits/s over 1.00 seconds
Interim result: 4478.15 10^6bits/s over 1.00 seconds
Interim result: 4476.67 10^6bits/s over 1.00 seconds
Interim result: 4496.49 10^6bits/s over 1.00 seconds
Interim result: 4489.26 10^6bits/s over 1.00 seconds
Interim result: 4479.86 10^6bits/s over 1.00 seconds
Interim result: 4500.97 10^6bits/s over 1.00 seconds
Interim result: 4473.96 10^6bits/s over 1.00 seconds
Interim result: 7346.56 10^6bits/s over 1.00 seconds
Interim result: 7524.94 10^6bits/s over 1.00 seconds
Interim result: 7540.16 10^6bits/s over 1.00 seconds
Interim result: 7553.53 10^6bits/s over 1.00 seconds
Interim result: 7552.08 10^6bits/s over 1.00 seconds
Interim result: 7550.08 10^6bits/s over 1.00 seconds
Interim result: 7554.35 10^6bits/s over 1.00 seconds
Interim result: 7550.85 10^6bits/s over 1.00 seconds
Interim result: 7557.27 10^6bits/s over 1.00 seconds
Interim result: 7568.28 10^6bits/s over 1.00 seconds
Interim result: 7497.24 10^6bits/s over 1.01 seconds
Interim result: 7436.44 10^6bits/s over 1.01 seconds
Interim result: 6098.26 10^6bits/s over 1.22 seconds
Interim result: 5644.82 10^6bits/s over 1.08 seconds
Interim result: 5639.07 10^6bits/s over 1.00 seconds
Interim result: 5636.32 10^6bits/s over 1.00 seconds
Interim result: 5640.45 10^6bits/s over 1.00 seconds
Interim result: 6319.06 10^6bits/s over 1.00 seconds
Interim result: 7324.10 10^6bits/s over 1.00 seconds
Interim result: 7323.53 10^6bits/s over 1.00 seconds
Interim result: 7333.88 10^6bits/s over 1.00 seconds
Interim result: 7172.70 10^6bits/s over 1.02 seconds
Interim result: 4488.97 10^6bits/s over 1.60 seconds
Interim result: 4492.37 10^6bits/s over 1.00 seconds
 87380  16384 1000000    60.00      5701.15   17.41    16.26    2.001     1.870
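
To quantify the spread, a quick pass over the interim lines is enough
(assuming the run above was captured in a file, netperf.out here):

grep 'Interim result' netperf.out | awk \
    '{ s += $3; n++; if (n == 1 || $3 < min) min = $3; if ($3 > max) max = $3 }
     END { printf "n=%d min=%.0f max=%.0f avg=%.0f 10^6bits/s\n", n, min, max, s/n }'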

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


 

> -----Original Message-----
> From: Jim Mott [mailto:jim at mellanox.com] 
> Sent: Thursday, January 24, 2008 9:47 AM
> To: Scott Weitzenkamp (sweitzen); Weikuan Yu
> Cc: general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP 
> performance changes in OFED 1.3 beta, and I get Oops when 
> enabling sdp_zcopy_thresh
> 
> I am really puzzled.  The majority of my testing has been between
> Rhat4U4 and Rhat5.  Using netperf command lines of the form:
>   netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 -- -r 64
>   netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1000000
> and a process of:
>   - set sdp_zcopy_thresh=0, run bandwidth test
>   - set sdp_zcopy_thresh=size, run bandwidth test
> I repeatedly get results that look like this (message size in bytes,
> throughput in 10^6bits/s):
>      size     SDP     Bzcopy
>     65536   7375.00   7515.98
>    131072   7465.70   8105.58
>   1000000   6541.87   9948.76
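> 
> Concretely, one pass of that A/B toggle is just the following (a rough
> sketch, using the Rhat5 sysfs path and the 1 MB message as an example;
> libsdp is preloaded as described below):
> 
>   ZC=/sys/module/ib_sdp/parameters/sdp_zcopy_thresh
>   echo 0 > $ZC        # plain SDP send path (bzcopy off)
>   netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1000000
>   echo 1000000 > $ZC  # bzcopy is used for sends at or above the threshold
>   netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1000000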
> 
> These numbers are from high-end (2-socket, quad-core) machines.  When
> you use smaller machines, like the AMD dual-core shown below, the
> differences between SDP with and without bzcopy are more striking.
> 
> The process to start the netserver is:
>   export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
>   export LD_PRELOAD=libsdp.so
>   export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
>   netserver 
> 
> The process to start netperf is similar:
>   export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
>   export LD_PRELOAD=libsdp.so
>   export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
>   netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1000000
> 
> You can unload and reload ib_sdp between tests, but I just echo 0 and
> echo the size into sdp_zcopy_thresh on the sending side.  Note that it
> is in a different place on Rhat4u4 and Rhat5.
> 
> My libsdp.conf is the default that ships with OFED.  Stripping the
> comments (grep -v), it is just:
>   log min-level 9 destination file libsdp.log
>   use both server * *:*
>   use both client * *:*
> Note that if you build locally:
>   cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel
>   make install
> the libsdp.conf file seems to get lost.  You must restore it by hand.
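> If it does get lost, recreating the three active lines by hand is
> enough (a sketch; the stock file also carries comments):
>   conf=/etc/infiniband/libsdp.conf
>   echo 'log min-level 9 destination file libsdp.log' >  $conf
>   echo 'use both server * *:*'                       >> $conf
>   echo 'use both client * *:*'                       >> $conf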
> 
> I have a shell script that automates this testing for a
> wide range of message sizes:
>   64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000
> on multiple transports:
>   IP		both	"echo datagram > /sys/class/net/ib0/mode"
>   IP-CM	both  "echo connected > /sys/class/net/ib0/mode"
>   SDP		both
>   Bzcopy	TCP_STREAM
> Where "both" means TCP_RR and TCP_STREAM testing.
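> 
> A stripped-down sketch of one pass of that script (TCP_STREAM legs
> only; the TCP_RR runs and result collection are omitted) would look
> roughly like this:
> 
>   for size in 64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000
>   do
>     # IP: IPoIB datagram mode, run without the libsdp preload
>     echo datagram > /sys/class/net/ib0/mode
>     netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m $size
>     # IP-CM: IPoIB connected mode
>     echo connected > /sys/class/net/ib0/mode
>     netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m $size
>     # SDP: preload libsdp.so, bzcopy off
>     echo 0 > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh
>     LD_PRELOAD=libsdp.so netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m $size
>     # Bzcopy: same SDP run with the threshold set to the message size
>     echo $size > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh
>     LD_PRELOAD=libsdp.so netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m $size
>   done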
> 
> The variance in SDP bandwidth results can be 10%-15% between runs.  The
> difference between Bzcopy and non-Bzcopy is always clearly visible for
> tests of 128K and up, though.
> 
> Could some other people please try to run some of these tests?  If only
> to help me know whether I am crazy.
> 
> Thanks,
> Jim
> 
> Jim Mott
> Mellanox Technologies Ltd.
> mail: jim at mellanox.com
> Phone: 512-294-5481
> 
> 
> -----Original Message-----
> From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
> Sent: Thursday, January 24, 2008 11:17 AM
> To: Jim Mott; Weikuan Yu
> Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
> changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
> 
> I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh improvement
> for any message size, as measured with netperf, for any Arbel or
> ConnectX HCA.
> 
> Scott
> 
>  
> > -----Original Message-----
> > From: Jim Mott [mailto:jim at mellanox.com] 
> > Sent: Thursday, January 24, 2008 7:57 AM
> > To: Weikuan Yu; Scott Weitzenkamp (sweitzen)
> > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org
> > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP 
> > performance changes in OFED 1.3 beta, and I get Oops when 
> > enabling sdp_zcopy_thresh
> > 
> > Hi,
> >   64K is borderline for seeing the bzcopy effect.  Using an AMD 6000+
> > (3 GHz dual-core) in an Asus M2A-VM motherboard with ConnectX running
> > 2.3 firmware and the OFED 1.3-rc3 stack on a 2.6.23.8 kernel.org
> > kernel, I ran the test for 128K (throughput in 10^6bits/s):
> >   5546  sdp_zcopy_thresh=0 (off)
> >   8709  sdp_zcopy_thresh=65536
> > 
> > For these tests, I just have LD_PRELOAD set in my environment.
> > 
> > =======================
> > 
> > I see that TCP_MAXSEG is not being handled by libsdp and will look
> > into it.
> > 
> > 
> > [root at dirk ~]# modprobe ib_sdp
> > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198
> > (193.168.10.198) port 0 AF_INET
> > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > 
> >  87380  16384 131072    30.01      5545.69   51.47    14.43    1.521   1.706
> > 
> > Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> > Local  Remote  Local  Remote  Xfered   Per                 Per
> > Send   Recv    Send   Recv             Send (avg)          Recv (avg)
> >     8       8      0       0 2.08e+10  131072.00    158690   33135.60    627718
> > 
> > Maximum
> > Segment
> > Size (bytes)
> >     -1
> > [root at dirk ~]# echo 65536 >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh
> > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198
> > (193.168.10.198) port 0 AF_INET
> > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > 
> >  87380  16384 131072    30.01      8708.58   50.63    14.55    0.953   1.095
> > 
> > Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> > Local  Remote  Local  Remote  Xfered   Per                 Per
> > Send   Recv    Send   Recv             Send (avg)          Recv (avg)
> >     8       8      0       0 3.267e+10  131072.00    249228   26348.30   1239807
> > 
> > Maximum
> > Segment
> > Size (bytes)
> >     -1
> > 
> > Thanks,
> > Jim
> > 
> > Jim Mott
> > Mellanox Technologies Ltd.
> > mail: jim at mellanox.com
> > Phone: 512-294-5481
> > 
> > 
> > -----Original Message-----
> > From: Weikuan Yu [mailto:weikuan.yu at gmail.com] 
> > Sent: Thursday, January 24, 2008 9:09 AM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: Jim Mott; ewg at lists.openfabrics.org; 
> general at lists.openfabrics.org
> > Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance
> > changes in OFED 1.3 beta, and I get Oops when enabling
> > sdp_zcopy_thresh
> > 
> > Hi, Scott,
> > 
> > I have been running SDP tests across two Woodcrest nodes with 4x DDR
> > cards using OFED-1.2.5.4.  The card/firmware info is below.
> > 
> > CA 'mthca0'
> >          CA type: MT25208
> >          Number of ports: 2
> >          Firmware version: 5.1.400
> >          Hardware version: a0
> >          Node GUID: 0x0002c90200228e0c
> >          System image GUID: 0x0002c90200228e0f
> > 
> > I could not get a bandwidth of more than 5 Gbps like you have shown
> > here.  I wonder if I need to upgrade to the latest software or
> > firmware.  Any suggestions?
> > 
> > Thanks,
> > --Weikuan
> > 
> > 
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.225.77
> > (192.168.225.77) port 0 AF_INET
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > 
> > 131072 131072 131072    10.00      4918.95   21.29    24.99    1.418   1.665
> > 
> > 
> > Scott Weitzenkamp (sweitzen) wrote:
> > > Jim,
> > > 
> > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual-CPU
> > > (single core per CPU) Xeon system.  I do not see any performance
> > > improvement (either throughput or CPU utilization) using netperf
> > > when I set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384.  Can you
> > > elaborate on your HCA type, and the performance improvement you see?
> > > 
> > > Here's an example netperf command line when using a Cheetah DDR HCA
> > > and 1.2.917 firmware (I have also tried ConnectX and 2.3.000
> > > firmware too):
> > > 
> > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536
> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201
> > > (192.168.1.201) port 0 AF_INET : histogram : demo
> > > 
> > > Recv   Send    Send                          Utilization       Service Demand
> > > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > > 
> > >  87380  16384  65536    30.01      7267.70   55.06    61.27    1.241   1.381
> > > 
> > > Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> > > Local  Remote  Local  Remote  Xfered   Per                 Per
> > > Send   Recv    Send   Recv             Send (avg)          Recv (avg)
> > >     8       8      0       0 2.726e+10  65536.00    415942   48106.01    566648
> > > 
> > 
> 


