[openib-general] Performance Degradation with OFED v. Voltaire(lustre)

Michael S. Tsirkin mst at mellanox.co.il
Mon Dec 18 03:37:50 PST 2006


The cma quirk does not seem to work.
Enabling the opensm quirk should work, and should be sufficient.
However, you seem to be running another SM on your fabric (on your switch?),
which is why opensm enters STANDBY. Disable that SM and try again.
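
For anyone wanting to check for this condition from a script, a minimal
Python sketch follows. It only assumes the "Entering <STATE> state" wording
seen in the console output quoted below; the function name is made up for
illustration and is not part of opensm.

```python
import re

# Matches lines such as ">>>> Entering STANDBY state" from the opensm
# console output quoted in this thread. Other state names are assumed to
# follow the same wording.
STATE_RE = re.compile(r"Entering (\w+) state")

def last_sm_state(console_output):
    """Return the last SM state announced in the output, or None."""
    states = STATE_RE.findall(console_output)
    return states[-1] if states else None

sample = """\
>>>> OpenSM Rev:openib-2.0.5 OpenIB svn Exported revision
>>>>
>>>> Entering STANDBY state
"""

state = last_sm_state(sample)
if state == "STANDBY":
    print("opensm is in STANDBY: another SM is probably master on the fabric")
```

If the last announced state is STANDBY, this opensm instance is not the one
configuring the fabric, so its enable_quirks setting has no effect.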


Quoting r. Bernadat, Philippe <philippe_bernadat at hp.com>:
Subject: Re: Performance Degradation with OFED v. Voltaire(lustre)

I think I am going to need more help here.

I did use both tricks, opensm enable_quirks TRUE & rdma_cm
tavor_quirk=1.
This seems to have no effect.

But I may be doing something wrong. So some questions I have:

1) The doc (sdp_release_notes.txt, see below) says we can use either of the
two tricks. Is that really the case?

2) I usually don't run opensm (it has not been required for me until now)
and I am not too familiar with it. But I did run it so that I could try the
enable_quirks TRUE option. Does opensm run in the background? When I
run it, it never returns. The last messages are:
    >>>> -------------------------------------------------
    >>>> OpenSM Rev:openib-2.0.5
    >>>> Based on OpenIB svn Exported revision
    >>>>  Using Cached Option:guid = 0x0008f10403961e4d
    >>>>  Using Cached Option:log_flags = 3
    >>>>  Using Cached Option:enable_quirks = TRUE
    >>>> Command Line Arguments:
    >>>>  Log File: /var/log/osm.log
    >>>> -------------------------------------------------
    >>>> OpenSM Rev:openib-2.0.5 OpenIB svn Exported revision
    >>>> 
    >>>> Entering STANDBY state

3) Is there a way to change the MTU from within the lustre LND kernel
module? I saw that the IB perf programs do this with the modify_qp()
APIs.

4) And by the way, I can confirm that the MTU is the issue. Forcing it
to 2K with the ib_write_perf test also degrades performance.
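
The packet-size arithmetic behind the counters quoted further down in this
thread can be sketched in Python as follows. This assumes, as stated there,
that the data counters are in 32-bit units; the function names are just for
illustration.

```python
# Port counters quoted later in this thread (same LNET test on both stacks).
# The data counters are taken to be in 32-bit (4-byte) units, as the
# original message states.
WORD_BYTES = 4
IB_MTUS = (256, 512, 1024, 2048, 4096)  # standard InfiniBand MTU sizes

def avg_packet_bytes(xmit_data_words, xmit_pkts):
    """Average bytes on the wire per transmitted packet."""
    return xmit_data_words * WORD_BYTES / xmit_pkts

def nearest_mtu(avg_bytes):
    """Standard MTU closest to the observed average packet size."""
    return min(IB_MTUS, key=lambda mtu: abs(mtu - avg_bytes))

vib = avg_packet_bytes(2684490382, 10280007)
ofed = avg_packet_bytes(2653730483, 5160009)

print(f"VIB:  {vib:.0f} bytes/packet, closest MTU {nearest_mtu(vib)}")
print(f"OFED: {ofed:.0f} bytes/packet, closest MTU {nearest_mtu(ofed)}")
```

The averages land slightly above 1K for VIB and slightly above 2K for OFED,
which is what one would expect from a full payload plus per-packet headers.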



Extract from sdp_release_notes.txt

- By default, SDP utilizes a 2 Kbyte MTU size.  This may cause PCI-X cards
  using Mellanox Technologies "Infinihost" HCAs to experience low bandwidth.
  Workaround:  reset the MTU size to 1K in this situation, using either of
  the two methods below:

  1. Activate the "tavor quirk" workaround in opensm:
     a. Create an opensm options cache file (/var/cache/osm/opensm.opts):
          > opensm --cache-options -o
     b. Add the following line to /var/cache/osm/opensm.opts:
          enable_quirks TRUE
     c. Rerun opensm using your usual command line options to activate
        the opensm quirk option.

  2. Activate the "tavor quirk" workaround in cma:
       set the tavor_quirk module parameter of the rdma_cm module to value 1
       (default: 0).
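
Step 1b above could be scripted roughly as follows. This is a minimal Python
sketch, not a tool shipped with opensm: it assumes the options cache file
holds one "name value" pair per line, as the enable_quirks line in the notes
suggests, and the helper name and sample content are hypothetical.

```python
def enable_quirks(opts_text):
    """Return opts_text with the enable_quirks option set to TRUE.

    An existing "enable_quirks ..." line is rewritten; if none is found,
    the line is appended. Other lines are left unchanged.
    """
    out = []
    found = False
    for line in opts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "enable_quirks":
            out.append("enable_quirks TRUE")
            found = True
        else:
            out.append(line)
    if not found:
        out.append("enable_quirks TRUE")
    return "\n".join(out) + "\n"

# Hypothetical fragment of an options cache file, for illustration only.
before = "log_flags 3\nenable_quirks FALSE\n"
print(enable_quirks(before), end="")
```

The result would be written back to /var/cache/osm/opensm.opts before
rerunning opensm, as step 1c describes.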

Philippe

> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com] 
> Sent: Friday, December 15, 2006 8:15 PM
> To: Eitan Zahavi
> Cc: Matt L. Leininger; Roland Dreier; Bernadat, Philippe; 
> openib-general at openib.org
> Subject: Re: [openib-general] Performance Degradation with 
> OFED v. Voltaire(lustre)
> 
> On Fri, 2006-12-15 at 12:20, Eitan Zahavi wrote:
> > Matt Leininger wrote:
> > > On Fri, 2006-12-15 at 09:44 +0100, Bernadat, Philippe wrote:
> > >   
> > >> I also looked at the HCA counters, and I indeed think 
> > >> there is something wrong about the MTU:
> > >>
> > >> For the same test
> > >>
> > >> With VIB
> > >>
> > >> PortXmitData:                  2684490382
> > >> PortRcvData:                      1750145
> > >> PortXmitPkts:                    10280007
> > >> PortRcvPkts:                        49962
> > >>
> > >> With OFED
> > >>
> > >> XmtBytes:........................2653730483
> > >> RcvBytes:........................1710541
> > >> XmtPkts:.........................5160009
> > >> RcvPkts:.........................50012
> > >>
> > >> Which means we sent half as many packets with OFED,
> > >> and if you do the math it is 2K packets with OFED
> > >> (counters are in 32-bit units)
> > >> and 1K packets with VIB.
> > >>
> > >> So for some reason the tavor_quirk param is ignored/overwritten.
> > >> Is there an interface to control this ?
> > >>     
> > >
> > >   Michael said you have to turn on this feature in OpenSM.  From the
> > > release notes I'm not sure how you turn it on in OpenSM.  You did turn
> > > on the tavor mtu work around in the rdma_cm, but did you turn it on in
> > > OpenSM?  Also what version of OpenSM are you running?
> > >   
> > To turn this option on in opensm you need to:
> > 1. Run: opensm -c -o
> 
> If you already have an opensm.opts file then you can skip this step.
> 
> -- Hal
> 
> > 2. Modify the file /var/cache/osm/opensm.opts by changing the line below
> > enable_quirks FALSE
> > to
> > enable_quirks TRUE
> > 
> > 3. Run: opensm
> > >   Thanks,
> > >
> > > 	- Matt
> > >
> > >   
> > >> Philippe
> > >>
> > >>     
> > >>> -----Original Message-----
> > >>> From: Bernadat, Philippe 
> > >>> Sent: Friday, December 15, 2006 8:59 AM
> > >>> To: Michael S. Tsirkin; Roland Dreier
> > >>> Cc: Eitan Zahavi; Hal Rosenstock; openib-general at openib.org
> > >>> Subject: RE: Performance Degradation with OFED v. Voltaire (lustre)
> > >>>
> > >>> I have set tavor_quirk to 1 with no effect.
> > >>> Another thing I have tried is the same lustre 
> > >>> LNET echo test with a single thread (vs 8)
> > >>>
> > >>> VIB:      400 MB/s
> > >>> OFED-1.1: 333 MB/s
> > >>>
> > >>> I am posting the live param values for all infiniband 
> > >>> modules in case someone could identify some wrong setting:
> > >>>
> > >>> infiniband/core/ib_cm
> > >>>
> > >>> mra_timeout_limit              30000
> > >>>
> > >>> infiniband/core/rdma_cm
> > >>>
> > >>> max_cm_retries                    15
> > >>> tavor_quirk                        1
> > >>>
> > >>> infiniband/hw/ipath/ib_ipath
> > >>>
> > >>> cfgports                           0
> > >>> debug                              1
> > >>> disable_sma                        0
> > >>> kpiobufs                           0
> > >>> lkey_table_size                   12
> > >>> max_ahs                        65535
> > >>> max_cqes                      196607
> > >>> max_cqs                       131071
> > >>> max_mcast_grps                 16384
> > >>> max_mcast_qp_attached             16
> > >>> max_pds                        65535
> > >>> max_qps                        16384
> > >>> max_qp_wrs                     16383
> > >>> max_sges                          96
> > >>> max_srqs                        1024
> > >>> max_srq_sges                     128
> > >>> max_srq_wrs                   131071
> > >>> qp_table_size                    251
> > >>>
> > >>> infiniband/hw/mthca/ib_mthca
> > >>>
> > >>> catas_reset_disable                0
> > >>> debug_level                        0
> > >>> fmr_reserved_mtts             262144
> > >>> fw_cmd_doorbell                    0
> > >>> msi                                0
> > >>> msi_x                              1
> > >>> num_cq                         65536
> > >>> num_mcg                         8192
> > >>> num_mpt                       131072
> > >>> num_mtt                      1048576
> > >>> num_qp                         65536
> > >>> num_udav                       32768
> > >>> rdb_per_qp                         4
> > >>> tune_pci                           1
> > >>>
> > >>> infiniband/ulp/ipoib/ib_ipoib
> > >>>
> > >>> debug_level                        0
> > >>> mcast_debug_level                  0
> > >>> recv_queue_size                  128
> > >>> send_queue_size                   64
> > >>>
> > >>> Philippe
> > >>>
> > >>>       
> > >>>> -----Original Message-----
> > >>>> From: Michael S. Tsirkin [mailto:mst at mellanox.co.il] 
> > >>>> Sent: Thursday, December 14, 2006 6:32 PM
> > >>>> To: Roland Dreier
> > >>>> Cc: Bernadat, Philippe; Eitan Zahavi; Hal Rosenstock; 
> > >>>> openib-general at openib.org
> > >>>> Subject: Re: Performance Degradation with OFED v. Voltaire
> > >>>>
> > >>>>         
> > >>>>>  > I think Eric described the major differences earlier on, here it is, see
> > >>>>>  > second half:
> > >>>>>
> > >>>>> OK, I forgot about that.
> > >>>>>
> > >>>>> I guess one last thing to check would be the MTU being used for the RC
> > >>>>> connections.  Since this is PCI-X HW then the MTU should be 1024 for
> > >>>>> best throughput (instead of the max MTU of 2048).
> > >>>>>           
> > >>>> The MTU issue is described in the OFED release notes.
> > >>>> You must turn the Tavor work-around for it on in opensm.
> > >>>> This was introduced late in the release cycle, so it was deemed safer
> > >>>> to make it off by default.
> > >>>>
> > >>>> By the way, Eitan, Hal, can we turn this on by default now?
> > >>>> This way we'll get more feedback from people, and we'll still have
> > >>>> time to turn it off before release if this unexpectedly
> > >>>> creates issues.
> > >>>>
> > >>>> -- 
> > >>>> MST
> > >>>>
> > >>>>         
> > >> _______________________________________________
> > >> openib-general mailing list
> > >> openib-general at openib.org
> > >> http://openib.org/mailman/listinfo/openib-general
> > >>
> > >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > >>
> > >>     


-- 
MST



