[ofa-general] RE: [ewg] Not seeing any SDP performance changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

Jim Mott jim at mellanox.com
Thu Jan 24 09:46:42 PST 2008


I am really puzzled.  The majority of my testing has been between
RHEL4 U4 and RHEL5.  Using netperf command lines of the form:
  netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 -- -r 64
  netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1000000
and a process of:
  - set sdp_zcopy_thresh=0, run bandwidth test
  - set sdp_zcopy_thresh=size, run bandwidth test
I repeatedly get results that look like this:
     size (bytes)   SDP (10^6 bits/s)   Bzcopy (10^6 bits/s)
            65536             7375.00                7515.98
           131072             7465.70                8105.58
          1000000             6541.87                9948.76

These numbers are from high-end (2-socket, quad-core) machines.  When
you use smaller machines, like the AMD dual-core shown below, the
differences between SDP with and without bzcopy are more striking.

The process to start the netserver is:
  export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
  export LD_PRELOAD=libsdp.so
  export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
  netserver 

The process to start netperf is similar:
  export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
  export LD_PRELOAD=libsdp.so
  export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
  netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -m 1000000

You can unload and reload ib_sdp between tests, but I just echo 0 or
the threshold size into sdp_zcopy_thresh on the sending side.  Note
that it lives in a different place on RHEL4 U4 and RHEL5.
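For example, on the sending side (the parameters/ path is the one I
use on the newer kernels; the RHEL4-style location is the one
referenced further down the thread, so verify which one your system
actually has):
  # Bzcopy off, then run the bandwidth test:
  echo 0 > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh
  # Bzcopy on at a 64K threshold, then run the test again:
  echo 65536 > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh
  # On RHEL4 U4 the parameter may sit directly under /sys/module/ib_sdp/:
  #   echo 65536 > /sys/module/ib_sdp/sdp_zcopy_thresh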

My libsdp.conf is the default that ships with OFED.  Stripping the
comments (grep -v '^#'), it is just:
  log min-level 9 destination file libsdp.log
  use both server * *:*
  use both client * *:*
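For reference, libsdp.conf also takes more selective rules; a
hypothetical example (not something I run) that limits SDP to
programs whose names start with "net" and leaves everything else on
plain TCP would be:
  use sdp both net* *:*
  use tcp both * *:*
The wide-open rules above are what my tests actually use.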
Note that if you build locally:
  cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel
  make install
the libsdp.conf file seems to get lost.  You must restore it by
hand.

I have a shell script that automates this testing for a wide range of
message sizes (a minimal sketch of the loop appears below):
  64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000
on multiple transports:
  IP      both        "echo datagram > /sys/class/net/ib0/mode"
  IP-CM   both        "echo connected > /sys/class/net/ib0/mode"
  SDP     both
  Bzcopy  TCP_STREAM
where "both" means both TCP_RR and TCP_STREAM tests.
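The core of the SDP-vs-Bzcopy comparison looks roughly like this (host
address, library paths, and the sysfs location are taken from the
examples above; treat it as a sketch rather than the exact script,
which also cycles through the IPoIB transports listed):
  #!/bin/sh
  # Compare plain SDP against SDP with Bzcopy across message sizes.
  export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
  export LD_PRELOAD=libsdp.so
  export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
  HOST=193.168.10.143
  THRESH=/sys/module/ib_sdp/parameters/sdp_zcopy_thresh

  for size in 64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000
  do
    echo 0     > $THRESH    # plain SDP
    netperf -C -c -P 0 -t TCP_STREAM -H $HOST -l 60 -- -m $size
    echo $size > $THRESH    # Bzcopy at this message size
    netperf -C -c -P 0 -t TCP_STREAM -H $HOST -l 60 -- -m $size
  done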

The variance in SDP bandwidth results can be 10%-15% between runs.  The
difference between Bzcopy and non-Bzcopy is always clearly visible for
tests of 128K and up, though.

Could some other people please try to run some of these tests, if only
to help me know whether I am crazy?

Thanks,
Jim

Jim Mott
Mellanox Technologies Ltd.
mail: jim at mellanox.com
Phone: 512-294-5481


-----Original Message-----
From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
Sent: Thursday, January 24, 2008 11:17 AM
To: Jim Mott; Weikuan Yu
Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org
Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh improvement
for any message size, as measured with netperf, for any Arbel or
ConnectX HCA.

Scott

 
> -----Original Message-----
> From: Jim Mott [mailto:jim at mellanox.com] 
> Sent: Thursday, January 24, 2008 7:57 AM
> To: Weikuan Yu; Scott Weitzenkamp (sweitzen)
> Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP 
> performance changes in OFED 1.3 beta, and I get Oops when 
> enabling sdp_zcopy_thresh
> 
> Hi,
>   64K is borderline for seeing the bzcopy effect.  Using an AMD 6000+
> (3 GHz dual core) in an Asus M2A-VM motherboard with ConnectX running
> 2.3 firmware and the OFED 1.3-rc3 stack on a 2.6.23.8 kernel.org
> kernel, I ran the test for 128K (throughput in 10^6 bits/s):
>   5546  sdp_zcopy_thresh=0 (off)
>   8709  sdp_zcopy_thresh=65536
> 
> For these tests, I just have LD_PRELOAD set in my environment.
> 
> =======================
> 
> I see that TCP_MAXSEG is not being handled by libsdp and will look
> into it.
> 
> 
> [root at dirk ~]# modprobe ib_sdp
> [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET
> netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
>  87380  16384 131072    30.01      5545.69   51.47    14.43    1.521   1.706
> 
> Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> Local  Remote  Local  Remote  Xfered   Per                 Per
> Send   Recv    Send   Recv             Send (avg)          Recv (avg)
>     8       8      0       0 2.08e+10  131072.00    158690   33135.60  627718
> 
> Maximum
> Segment
> Size (bytes)
>     -1
> [root at dirk ~]# echo 65536 >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh
> [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET
> netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
>  87380  16384 131072    30.01      8708.58   50.63    14.55    0.953   1.095
> 
> Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> Local  Remote  Local  Remote  Xfered   Per                 Per
> Send   Recv    Send   Recv             Send (avg)          Recv (avg)
>     8       8      0       0 3.267e+10  131072.00    249228   26348.30  1239807
> 
> Maximum
> Segment
> Size (bytes)
>     -1
> 
> Thanks,
> Jim
> 
> Jim Mott
> Mellanox Technologies Ltd.
> mail: jim at mellanox.com
> Phone: 512-294-5481
> 
> 
> -----Original Message-----
> From: Weikuan Yu [mailto:weikuan.yu at gmail.com] 
> Sent: Thursday, January 24, 2008 9:09 AM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Jim Mott; ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance
> changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
> 
> Hi, Scott,
> 
> I have been running SDP tests across two Woodcrest nodes with 4x DDR 
> cards using OFED-1.2.5.4. The card/firmware info is below.
> 
> CA 'mthca0'
>          CA type: MT25208
>          Number of ports: 2
>          Firmware version: 5.1.400
>          Hardware version: a0
>          Node GUID: 0x0002c90200228e0c
>          System image GUID: 0x0002c90200228e0f
> 
> I could not get a bandwidth of more than 5 Gbps like you have shown
> here.  I wonder if I need to upgrade to the latest software or
> firmware?  Any suggestions?
> 
> Thanks,
> --Weikuan
> 
> 
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.225.77 (192.168.225.77) port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
> 131072 131072 131072    10.00      4918.95   21.29    24.99    1.418   1.665
> 
> 
> Scott Weitzenkamp (sweitzen) wrote:
> > Jim,
> > 
> > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual-CPU
> > (single core per CPU) Xeon system.  I do not see any performance
> > improvement (either throughput or CPU utilization) using netperf
> > when I set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384.  Can you
> > elaborate on your HCA type, and the performance improvement you see?
> > 
> > Here's an example netperf command line when using a Cheetah DDR HCA
> > and 1.2.917 firmware (I have also tried ConnectX and 2.3.000
> > firmware too):
> > 
> > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201 (192.168.1.201) port 0 AF_INET : histogram : demo
> > 
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> > 
> >  87380  16384  65536    30.01      7267.70   55.06    61.27    1.241   1.381
> > 
> > Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> > Local  Remote  Local  Remote  Xfered   Per                 Per
> > Send   Recv    Send   Recv             Send (avg)          Recv (avg)
> >     8       8      0       0 2.726e+10  65536.00    415942   48106.01  566648
> > 
> 


