[openib-general] Performance Degradation with OFED v. Voltaire (lustre)

Bernadat, Philippe philippe_bernadat at hp.com
Fri Dec 15 00:44:14 PST 2006


I also looked at the HCA counters, and I do think 
something is wrong with the MTU:

For the same test

With VIB

PortXmitData:                  2684490382
PortRcvData:                      1750145
PortXmitPkts:                    10280007
PortRcvPkts:                        49962

With OFED

XmtBytes:........................2653730483
RcvBytes:........................1710541
XmtPkts:.........................5160009
RcvPkts:.........................50012

Which means we sent half as many packets with OFED, 
and if you do the math (the data counters are in 32-bit units),
that is 2K packets with OFED and 1K packets with VIB.
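
The packet-size arithmetic above can be checked with a short script. This is only a sketch: it assumes, as stated in this message, that both data counters are in 32-bit (4-byte) units, and the function and variable names are made up for illustration.

```python
# Average bytes per transmitted packet, computed from the counters quoted above.
# Assumption (from this message): the data counters count 32-bit words,
# so each counter unit is 4 bytes.
WORD_BYTES = 4


def avg_packet_bytes(data_words: int, packets: int) -> float:
    """Return the average number of bytes carried per packet."""
    return data_words * WORD_BYTES / packets


# Transmit-side counters copied from the two runs of the same test.
counters = {
    "VIB": {"xmit_data": 2684490382, "xmit_pkts": 10280007},
    "OFED": {"xmit_data": 2653730483, "xmit_pkts": 5160009},
}

for stack, c in counters.items():
    size = avg_packet_bytes(c["xmit_data"], c["xmit_pkts"])
    print(f"{stack}: about {size:.0f} bytes per packet")
```

The VIB figure comes out close to 1024 and the OFED figure close to 2048, which is consistent with a 1K MTU in the first case and a 2K MTU in the second.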

So for some reason the tavor_quirk param is ignored/overwritten.
Is there an interface to control this?

Philippe

> -----Original Message-----
> From: Bernadat, Philippe 
> Sent: Friday, December 15, 2006 8:59 AM
> To: Michael S. Tsirkin; Roland Dreier
> Cc: Eitan Zahavi; Hal Rosenstock; openib-general at openib.org
> Subject: RE: Performance Degradation with OFED v. Voltaire (lustre)
> 
> I have set tavor_quirk to 1 with no effect.
> Another thing I have tried is the same lustre 
> LNET echo test with a single thread (vs 8)
> 
> VIB:      400 MB/s
> OFED-1.1: 333 MB/s
> 
> I am posting the live param values for all infiniband 
> modules in case someone could identify some wrong setting:
> 
> infiniband/core/ib_cm
> 
> mra_timeout_limit              30000
> 
> infiniband/core/rdma_cm
> 
> max_cm_retries                    15
> tavor_quirk                        1
> 
> infiniband/hw/ipath/ib_ipath
> 
> cfgports                           0
> debug                              1
> disable_sma                        0
> kpiobufs                           0
> lkey_table_size                   12
> max_ahs                        65535
> max_cqes                      196607
> max_cqs                       131071
> max_mcast_grps                 16384
> max_mcast_qp_attached             16
> max_pds                        65535
> max_qps                        16384
> max_qp_wrs                     16383
> max_sges                          96
> max_srqs                        1024
> max_srq_sges                     128
> max_srq_wrs                   131071
> qp_table_size                    251
> 
> infiniband/hw/mthca/ib_mthca
> 
> catas_reset_disable                0
> debug_level                        0
> fmr_reserved_mtts             262144
> fw_cmd_doorbell                    0
> msi                                0
> msi_x                              1
> num_cq                         65536
> num_mcg                         8192
> num_mpt                       131072
> num_mtt                      1048576
> num_qp                         65536
> num_udav                       32768
> rdb_per_qp                         4
> tune_pci                           1
> 
> infiniband/ulp/ipoib/ib_ipoib
> 
> debug_level                        0
> mcast_debug_level                  0
> recv_queue_size                  128
> send_queue_size                   64
> 
> Philippe
> 
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst at mellanox.co.il] 
> > Sent: Thursday, December 14, 2006 6:32 PM
> > To: Roland Dreier
> > Cc: Bernadat, Philippe; Eitan Zahavi; Hal Rosenstock; 
> > openib-general at openib.org
> > Subject: Re: Performance Degradation with OFED v. Voltaire
> > 
> > >  > I think Eric described the major differences earlier on,
> > >  > here it is, see second half:
> > > 
> > > OK, I forgot about that.
> > > 
> > > I guess one last thing to check would be the MTU being used for
> > > the RC connections.  Since this is PCI-X HW then the MTU should be
> > > 1024 for best throughput (instead of the max MTU of 2048).
> > 
> > The MTU issue is described in the OFED release notes.
> > You must turn the Tavor work-around for it on in opensm.
> > This was introduced late in the release cycle so it was deemed safer
> > to make it off by default.
> > 
> > By the way, Eitan, Hal, can we turn this on by default now?
> > This way we'll get more feedback from people, and we'll still have
> > time to turn it off before release if this unexpectedly 
> > creates issues.
> > 
> > -- 
> > MST
> > 



