[openib-general] Performance Degradation with OFED v. Voltaire (lustre)
Matt Leininger
mlleinin at hpcn.ca.sandia.gov
Fri Dec 15 02:19:28 PST 2006
On Fri, 2006-12-15 at 09:44 +0100, Bernadat, Philippe wrote:
> I also looked at the HCA counters, and I do think
> there is something wrong with the MTU:
>
> For the same test
>
> With VIB
>
> PortXmitData: 2684490382
> PortRcvData: 1750145
> PortXmitPkts: 10280007
> PortRcvPkts: 49962
>
> With OFED
>
> XmtBytes:........................2653730483
> RcvBytes:........................1710541
> XmtPkts:.........................5160009
> RcvPkts:.........................50012
>
> Which means we sent half as many packets with OFED,
> and if you do the math it is 2K packets with OFED (the data counters
> are in 32-bit units) and 1K packets with VIB.
>
> So for some reason the tavor_quirk param is ignored/overwritten.
> Is there an interface to control this?
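
The packet-size estimate quoted above can be checked with a short calculation. This is a minimal sketch in Python, assuming (as the quoted message states) that the data counters are in 32-bit units, i.e. 4 bytes per count, and that both stacks report them the same way. The function name and the dictionary layout are illustrative only.

```python
# Average bytes per transmitted packet, from the counters quoted above.
# Assumption: the data counters count 32-bit words (4 bytes each).

WORD_BYTES = 4


def avg_packet_bytes(xmit_data_words, xmit_pkts):
    """Return the mean number of bytes carried per transmitted packet."""
    return xmit_data_words * WORD_BYTES / xmit_pkts


# (transmit data counter, transmit packet counter) for each stack
counters = {
    "VIB": (2684490382, 10280007),
    "OFED": (2653730483, 5160009),
}

averages = {name: avg_packet_bytes(*c) for name, c in counters.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.0f} bytes per packet")
```

The averages come out a little over 1000 bytes per packet for VIB and a little over 2000 for OFED, which is what one would expect if the connections were using MTUs of 1024 and 2048 respectively.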
Michael said you have to turn on this feature in OpenSM. From the
release notes I'm not sure how you turn it on in OpenSM. You did turn
on the Tavor MTU workaround in the rdma_cm, but did you turn it on in
OpenSM? Also, what version of OpenSM are you running?
Thanks,
- Matt
>
> Philippe
>
> > -----Original Message-----
> > From: Bernadat, Philippe
> > Sent: Friday, December 15, 2006 8:59 AM
> > To: Michael S. Tsirkin; Roland Dreier
> > Cc: Eitan Zahavi; Hal Rosenstock; openib-general at openib.org
> > Subject: RE: Performance Degradation with OFED v. Voltaire (lustre)
> >
> > I have set tavor_quirk to 1 with no effect.
> > Another thing I have tried is the same lustre
> > LNET echo test with a single thread (vs 8)
> >
> > VIB: 400 MB/s
> > OFED-1.1: 333 MB/s
> >
> > I am posting the live param values for all infiniband
> > modules in case someone could identify some wrong setting:
> >
> > infiniband/core/ib_cm
> >
> > mra_timeout_limit 30000
> >
> > infiniband/core/rdma_cm
> >
> > max_cm_retries 15
> > tavor_quirk 1
> >
> > infiniband/hw/ipath/ib_ipath
> >
> > cfgports 0
> > debug 1
> > disable_sma 0
> > kpiobufs 0
> > lkey_table_size 12
> > max_ahs 65535
> > max_cqes 196607
> > max_cqs 131071
> > max_mcast_grps 16384
> > max_mcast_qp_attached 16
> > max_pds 65535
> > max_qps 16384
> > max_qp_wrs 16383
> > max_sges 96
> > max_srqs 1024
> > max_srq_sges 128
> > max_srq_wrs 131071
> > qp_table_size 251
> >
> > infiniband/hw/mthca/ib_mthca
> >
> > catas_reset_disable 0
> > debug_level 0
> > fmr_reserved_mtts 262144
> > fw_cmd_doorbell 0
> > msi 0
> > msi_x 1
> > num_cq 65536
> > num_mcg 8192
> > num_mpt 131072
> > num_mtt 1048576
> > num_qp 65536
> > num_udav 32768
> > rdb_per_qp 4
> > tune_pci 1
> >
> > infiniband/ulp/ipoib/ib_ipoib
> >
> > debug_level 0
> > mcast_debug_level 0
> > recv_queue_size 128
> > send_queue_size 64
> >
> > Philippe
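
A listing of live module parameters like the one above can be collected from sysfs. This is a hedged sketch in Python, assuming a Linux system where loaded modules expose their parameters under /sys/module/<name>/parameters. The helper name `read_module_params` is hypothetical, and the module names are simply those from the posting. Parameters declared without sysfs permissions will not appear at all.

```python
import os


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {param: value} for a loaded module, or {} if none are exposed."""
    param_dir = os.path.join(sysfs_root, module, "parameters")
    params = {}
    if not os.path.isdir(param_dir):
        return params
    for name in sorted(os.listdir(param_dir)):
        path = os.path.join(param_dir, name)
        try:
            with open(path) as f:
                params[name] = f.read().strip()
        except OSError:
            # Some parameters are not readable by the current user; skip them.
            continue
    return params


if __name__ == "__main__":
    # Module names taken from the listing above.
    for module in ("ib_cm", "rdma_cm", "ib_ipath", "ib_mthca", "ib_ipoib"):
        print(module)
        for name, value in read_module_params(module).items():
            print(f"  {name} {value}")
```

Reading the values back this way shows what the running module actually holds, which helps when checking that a setting such as tavor_quirk really took effect after a reload.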
> >
> > > -----Original Message-----
> > > From: Michael S. Tsirkin [mailto:mst at mellanox.co.il]
> > > Sent: Thursday, December 14, 2006 6:32 PM
> > > To: Roland Dreier
> > > Cc: Bernadat, Philippe; Eitan Zahavi; Hal Rosenstock;
> > > openib-general at openib.org
> > > Subject: Re: Performance Degradation with OFED v. Voltaire
> > >
> > > > > I think Eric described the major differences earlier on,
> > > > > here it is, see second half:
> > > >
> > > > OK, I forgot about that.
> > > >
> > > > I guess one last thing to check would be the MTU being used
> > > > for the RC connections. Since this is PCI-X HW then the MTU
> > > > should be 1024 for best throughput (instead of the max MTU
> > > > of 2048).
> > >
> > > The MTU issue is described in the OFED release notes.
> > > You must turn on the Tavor workaround for it in opensm.
> > > This was introduced late in the release cycle, so it was deemed
> > > safer to make it off by default.
> > >
> > > By the way, Eitan, Hal, can we turn this on by default now?
> > > This way we'll get more feedback from people, and we'll still have
> > > time to turn it off before release if this unexpectedly
> > > creates issues.
> > >
> > > --
> > > MST
> > >