[ofa-general] XmtDiscards

Hal Rosenstock hrosenstock at xsigo.com
Mon Apr 7 08:29:00 PDT 2008


Hi again Bernd,

On Mon, 2008-04-07 at 15:53 +0200, Bernd Schubert wrote:
> Hello Hal,
> 
> On Monday 07 April 2008 15:35:10 Hal Rosenstock wrote:
> > Hi Bernd,
> >
> > On Sun, 2008-04-06 at 18:05 +0200, Bernd Schubert wrote:
> > > Hello Hal,
> > >
> > > > > Searching for this error I find "This is a symptom of congestion and
> > > > > may require tweaking either HOQ or switch lifetime values".
> > > > > Well, I have to admit I neither know what HOQ is, nor do I know how
> > > > > > to tweak it. I also have no idea how to set switch lifetime
> > > > > values.  I guess this isn't related to the opensm timeout option, is
> > > > > it?
> > > > >
> > > > > > Hmm, I just found a Cisco pdf describing how to set the lifetime on
> > > > > these switches, but is this also possible on Flextronics switches?
> > > >
> > > > What routing algorithm are you using ? Rather than play with those
> > > > switch values, if you are not using up/down, could you try that to see
> > > > if it helps with the congestion you are seeing ?
> > >
> > > I have now configured up/down but still get XmtDiscards, though only on
> > > one port.
> > >
> > > Error check on lid 205 (SW_pfs1_leaf2) port all:  FAILED
> > > #warn: counter XmtDiscards = 6213       (threshold 100) lid 205 port 1
> > > Error check on lid 205 (SW_pfs1_leaf2) port 1:  FAILED
> > > #warn: counter RcvSwRelayErrors = 1431  (threshold 100) lid 205 port 13
> > > Error check on lid 205 (SW_pfs1_leaf2) port 13:  FAILED
> >
> > Are you running IPoIB ? If so, SwRelayErrors are not necessarily
> > indicative of a "real" issue due to the fact that multicasts reflected
> > on the same port are mistakenly counted.
> 
> So far only Lustre has used IPoIB, for network initialization. Once it finds a
> working connection it switches to RDMA. I'm not sure what it does when there
> are problems, e.g. a server reboot; I guess it then uses IPoIB again.
> 
> Is there a way to find out if these RcvSwRelayErrors are due to multicast or 
> due to real problems?

While AFAIK there are no counters that break this down into its 3 causes,
one can analyze that switch for the other 2 causes. That's the best I'm
aware of that can be done.
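
For triaging a report like the one quoted above, the warning lines can be pulled out per counter and port. A minimal sketch, assuming the `#warn: counter NAME = VALUE (threshold T) lid L port P` layout shown in this thread (other versions of the checking script may print something different); `parse_warnings` is a made-up helper name, not part of any InfiniBand tool:

```python
import re

# Pattern follows the "#warn:" lines quoted in this thread (assumption:
# the layout is stable across the lines being parsed).
WARN_RE = re.compile(
    r"#warn: counter (?P<counter>\w+) = (?P<value>\d+)\s+"
    r"\(threshold (?P<threshold>\d+)\) lid (?P<lid>\d+) port (?P<port>\d+)"
)

def parse_warnings(text):
    """Return one dict per '#warn:' line found in text."""
    rows = []
    for line in text.splitlines():
        # search() rather than match(), so mail quoting such as
        # "> > > " in front of the line does not matter.
        m = WARN_RE.search(line)
        if m:
            rows.append({
                key: (val if key == "counter" else int(val))
                for key, val in m.groupdict().items()
            })
    return rows

# Sample copied from the report quoted in this thread.
SAMPLE = """\
Error check on lid 205 (SW_pfs1_leaf2) port all:  FAILED
#warn: counter XmtDiscards = 6213       (threshold 100) lid 205 port 1
Error check on lid 205 (SW_pfs1_leaf2) port 1:  FAILED
#warn: counter RcvSwRelayErrors = 1431  (threshold 100) lid 205 port 13
Error check on lid 205 (SW_pfs1_leaf2) port 13:  FAILED
"""

if __name__ == "__main__":
    for row in parse_warnings(SAMPLE):
        print(row)
```

Filtering the resulting rows by `counter` makes it easy to look at the RcvSwRelayErrors ports separately from the XmtDiscards ports.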

-- Hal

> > > I'm also not sure if up/down is the optimal algorithm for a fabric with
> > > only two switches.
> > >
> > > Since describing the connections in words is a bit difficult, I just
> > > upload a drawing here:
> > >
> > > http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/ib/Interswitch-cabling.pdf
> > >
> > > The root-guid for the up/down algorithm is leaf-5 of the small switch.
> > > But I'm still not sure about up/down at all. Doesn't up/down need at
> > > least 3 switches? Something like the ASCII graphic below?
> > >
> > >
> > >        root-switch
> > >      /            \
> > >     /              \
> > >  Sw-1 ------------ Sw-2
> >
> > Doesn't your chassis switch have many switches in it ? You did say it
> > was 144 ports so it's made up of a number of switches.
> 
> Yes, it's made up of a number of switches.
> 
> >
> > You may need to choose a "better" root than up/down automatically
> > determines.
> >
> 
> OpenSM isn't able to detect a root by itself at all. As said above I first 
> configured leaf-5 of the small switch (see the pdf file above), but now 
> switched it to leaf-6 guid. I have no idea which would be optimal for our 
> switches - I guess I have to create a drawing from the ibnetdiscover output 
> to figure this out.

Yes.
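
Once the links are written down, candidate roots can be compared by ranking every switch by hop distance from the candidate, which is the breadth-first ranking step that up/down routing builds on. A minimal sketch; the switch names and links below are hypothetical placeholders, not the fabric in this thread, and `rank_from_root` is an illustrative helper rather than an OpenSM function:

```python
from collections import deque

def rank_from_root(links, root):
    """Breadth-first search: hop count from root to every reachable switch."""
    rank = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in links.get(node, ()):
            if peer not in rank:
                rank[peer] = rank[node] + 1
                queue.append(peer)
    return rank

# Hypothetical two-level fabric: two spine switches, three leaf switches.
# Replace with the adjacency taken from your own topology drawing.
LINKS = {
    "spine1": ["leaf1", "leaf2", "leaf3"],
    "spine2": ["leaf1", "leaf2", "leaf3"],
    "leaf1": ["spine1", "spine2"],
    "leaf2": ["spine1", "spine2"],
    "leaf3": ["spine1", "spine2"],
}

if __name__ == "__main__":
    for candidate in sorted(LINKS):
        ranks = rank_from_root(LINKS, candidate)
        print(candidate, "max rank:", max(ranks.values()))
```

A lower maximum rank for a candidate means fewer levels between it and the farthest switch; whether that makes it the better root still depends on how the traffic is distributed.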

> I will also later on try to check with ibutils if it detects errors.

Sure; that would be good too.

-- Hal

> Thanks,
> Bernd
> 
> 



