[openib-general] Performance Degradation with OFED v. Voltaire (lustre)
Hal Rosenstock
halr at voltaire.com
Fri Dec 15 11:14:58 PST 2006
On Fri, 2006-12-15 at 12:20, Eitan Zahavi wrote:
> Matt Leininger wrote:
> > On Fri, 2006-12-15 at 09:44 +0100, Bernadat, Philippe wrote:
> >
> >> I also looked at the HCA counters, and I indeed think
> >> there is something wrong about the MTU:
> >>
> >> For the same test
> >>
> >> With VIB
> >>
> >> PortXmitData: 2684490382
> >> PortRcvData: 1750145
> >> PortXmitPkts: 10280007
> >> PortRcvPkts: 49962
> >>
> >> With OFED
> >>
> >> XmtBytes:........................2653730483
> >> RcvBytes:........................1710541
> >> XmtPkts:.........................5160009
> >> RcvPkts:.........................50012
> >>
> >> Which means we sent half as many packets with OFED,
> >> and if you do the math it is 2K packets with OFED (counters are in
> >> 32-bit units) and 1K packets with VIB.
> >>
> >> So for some reason the tavor_quirk param is ignored/overwritten.
> >> Is there an interface to control this ?
> >>
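The "do the math" step above can be checked with a few lines of Python. This is only a sketch: it takes the counter values exactly as quoted and assumes, as stated in the message, that both data counters are in 32-bit (4-byte) units. The function name is made up for illustration.

```python
# Counter values copied from the quoted message.
# Assumption: the data counters count 32-bit words, so multiply by 4 for bytes.
WORD_BYTES = 4


def avg_bytes_per_packet(xmit_data_words, xmit_pkts):
    """Average number of bytes carried per transmitted packet."""
    return xmit_data_words * WORD_BYTES / xmit_pkts


vib = avg_bytes_per_packet(2684490382, 10280007)   # PortXmitData / PortXmitPkts
ofed = avg_bytes_per_packet(2653730483, 5160009)   # XmtBytes / XmtPkts

print(f"VIB:  {vib:.0f} bytes per packet")
print(f"OFED: {ofed:.0f} bytes per packet")
```

The VIB figure comes out near 1K and the OFED figure near 2K, which matches the 1024 vs. 2048 MTU reading given above.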
> >
> > Michael said you have to turn on this feature in OpenSM. From the
> > release notes I'm not sure how you turn it on in OpenSM. You did turn
> > on the tavor mtu work around in the rdma_cm, but did you turn it on in
> > OpenSM? Also what version of OpenSM are you running?
> >
> To turn this option on in opensm you need to:
> 1. Run: opensm -c -o
If you already have an opensm.opts file then you can skip this step.
-- Hal
> 2. Modify the file /var/cache/osm/opensm.opts by changing the line below
> enable_quirks FALSE
> to
> enable_quirks TRUE
>
> 3. Run: opensm
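For anyone who would rather script step 2 than edit the file by hand, here is a minimal sketch in Python. It assumes the option sits on a line of its own in the form shown above ("enable_quirks FALSE"); the helper name and the sample text are hypothetical, and the second option line in the sample is a placeholder, not a real opensm option.

```python
import re

# Matches a line of the form "enable_quirks FALSE" (spaces or tabs allowed).
_QUIRKS_LINE = re.compile(r"^([ \t]*enable_quirks[ \t]+)FALSE[ \t]*$", re.MULTILINE)


def turn_on_quirks(opts_text):
    """Return the options text with 'enable_quirks FALSE' changed to TRUE."""
    return _QUIRKS_LINE.sub(r"\1TRUE", opts_text)


# Hypothetical excerpt standing in for the contents of opensm.opts.
sample = "enable_quirks FALSE\nplaceholder_option 1\n"
updated = turn_on_quirks(sample)
print(updated)
```

To apply it for real you would read /var/cache/osm/opensm.opts, pass the text through the helper, write it back, and then restart opensm as in step 3. Running it on a file that already says TRUE leaves the text unchanged.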
> > Thanks,
> >
> > - Matt
> >
> >
> >> Philippe
> >>
> >>
> >>> -----Original Message-----
> >>> From: Bernadat, Philippe
> >>> Sent: Friday, December 15, 2006 8:59 AM
> >>> To: Michael S. Tsirkin; Roland Dreier
> >>> Cc: Eitan Zahavi; Hal Rosenstock; openib-general at openib.org
> >>> Subject: RE: Performance Degradation with OFED v. Voltaire (lustre)
> >>>
> >>> I have set tavor_quirk to 1 with no effect.
> >>> Another thing I have tried is the same lustre
> >>> LNET echo test with a single thread (vs 8)
> >>>
> >>> VIB: 400 MB/s
> >>> OFED-1.1: 333 MB/s
> >>>
> >>> I am posting the live param values for all infiniband
> >>> modules in case someone could identify some wrong setting:
> >>>
> >>> infiniband/core/ib_cm
> >>>
> >>> mra_timeout_limit 30000
> >>>
> >>> infiniband/core/rdma_cm
> >>>
> >>> max_cm_retries 15
> >>> tavor_quirk 1
> >>>
> >>> infiniband/hw/ipath/ib_ipath
> >>>
> >>> cfgports 0
> >>> debug 1
> >>> disable_sma 0
> >>> kpiobufs 0
> >>> lkey_table_size 12
> >>> max_ahs 65535
> >>> max_cqes 196607
> >>> max_cqs 131071
> >>> max_mcast_grps 16384
> >>> max_mcast_qp_attached 16
> >>> max_pds 65535
> >>> max_qps 16384
> >>> max_qp_wrs 16383
> >>> max_sges 96
> >>> max_srqs 1024
> >>> max_srq_sges 128
> >>> max_srq_wrs 131071
> >>> qp_table_size 251
> >>>
> >>> infiniband/hw/mthca/ib_mthca
> >>>
> >>> catas_reset_disable 0
> >>> debug_level 0
> >>> fmr_reserved_mtts 262144
> >>> fw_cmd_doorbell 0
> >>> msi 0
> >>> msi_x 1
> >>> num_cq 65536
> >>> num_mcg 8192
> >>> num_mpt 131072
> >>> num_mtt 1048576
> >>> num_qp 65536
> >>> num_udav 32768
> >>> rdb_per_qp 4
> >>> tune_pci 1
> >>>
> >>> infiniband/ulp/ipoib/ib_ipoib
> >>>
> >>> debug_level 0
> >>> mcast_debug_level 0
> >>> recv_queue_size 128
> >>> send_queue_size 64
> >>>
> >>> Philippe
> >>>
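A listing like the one above can be gathered with a short script. The sketch below assumes the live values come from the standard sysfs location /sys/module/<module>/parameters/; the message does not say how the listing was produced, so that is an assumption, and the function name is made up for illustration.

```python
from pathlib import Path


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {parameter_name: value} for one loaded kernel module.

    Returns an empty dict if the module is not loaded or exposes no parameters.
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    if not params_dir.is_dir():
        return {}
    values = {}
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by an unprivileged user.
            values[entry.name] = "<unreadable>"
    return values


if __name__ == "__main__":
    # Module names taken from the listing above.
    for module in ("ib_cm", "rdma_cm", "ib_ipath", "ib_mthca", "ib_ipoib"):
        for name, value in read_module_params(module).items():
            print(f"{module}: {name} {value}")
```

Dumping the values on both a VIB and an OFED node and diffing the two outputs would make any differing setting easier to spot.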
> >>>
> >>>> -----Original Message-----
> >>>> From: Michael S. Tsirkin [mailto:mst at mellanox.co.il]
> >>>> Sent: Thursday, December 14, 2006 6:32 PM
> >>>> To: Roland Dreier
> >>>> Cc: Bernadat, Philippe; Eitan Zahavi; Hal Rosenstock;
> >>>> openib-general at openib.org
> >>>> Subject: Re: Performance Degradation with OFED v. Voltaire
> >>>>
> >>>>
> >>>>> > I think Eric described the major differences earlier on, here it is, see
> >>>>> > second half:
> >>>>>
> >>>>> OK, I forgot about that.
> >>>>>
> >>>>> I guess one last thing to check would be the MTU being used for the RC
> >>>>> connections. Since this is PCI-X HW then the MTU should be 1024 for
> >>>>> best throughput (instead of the max MTU of 2048).
> >>>>>
> >>>> The MTU issue is described in the OFED release notes.
> >>>> You must turn the Tavor work-around for it on in opensm.
> >>>> This was introduced late in the release cycle, so it was deemed safer
> >>>> to make it off by default.
> >>>>
> >>>> By the way, Eitan, Hal, can we turn this on by default now?
> >>>> This way we'll get more feedback from people, and we'll still have
> >>>> time to turn it off before release if this unexpectedly
> >>>> creates issues.
> >>>>
> >>>> --
> >>>> MST
> >>>>
> >>>>
> >> _______________________________________________
> >> openib-general mailing list
> >> openib-general at openib.org
> >> http://openib.org/mailman/listinfo/openib-general
> >>
> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >>
> >>