[ewg] Re: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs query only single ports
Hal Rosenstock
hrosenstock at xsigo.com
Tue Dec 11 10:34:37 PST 2007
On Tue, 2007-12-11 at 16:46 +0000, Sasha Khapyorsky wrote:
> On 07:25 Tue 11 Dec , Hal Rosenstock wrote:
> > On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> > > On 06:57 Tue 11 Dec , Hal Rosenstock wrote:
> > > > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > > > For CAs, query performance counters only for single ports by LID and
> > > > > port number, and not the whole node with the 'all ports' option.
> > > >
> > > > Should the description also reference the bug # ?
> > >
> > > I will add.
> > >
> > > > Will a similar thing be done to the other diag scripts which have this
> > > > same issue (but haven't been reported yet) ?
> > >
> > > It is reasonable. I will try to check other scripts too.
> > >
> > > > Would it be better to fix this in the underlying tool used (perfquery)
> > > > and in that way address it for all the diag scripts ?
> > >
> > > I think perfquery could/should be improved as well, but it is not the
> > > same issue.
> >
> > Why not ?
> >
> > If perfquery paved over the lack of support for all ports, then all the
> > scripts would be fine as is, right ?
>
> Yes, but I think that it is more accurate to query CA ports and not just
> nodes (even if the 'all ports' option is supported).
Router ports need the same handling as CA ports.
> > > I think that in general, when the whole fabric is checked, it is more
> > > accurate to query endports by port and not by node: a multiport CA can
> > > have disconnected ports and/or ports which are connected to another
> > > subnet, in which case their counters are irrelevant to the check. Right?
> >
> > Yes, but doing it on a node basis cuts down on the number of queries.
>
> True, but doing the right thing is more important here than the number of
> queries IMO (BTW in practice the difference in the number of queries is not
> so significant: it is a matter of percent, not of multiples).
As long as switch ports aren't individually queried.
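
To make the policy discussed above concrete, here is a minimal sketch of how a checker might plan its counter queries: one 'all ports' query per switch, and one query per discovered port for CAs and routers. The Port and Node types and the plan_queries function are hypothetical names used only for illustration; they are not taken from ibcheckerrors or perfquery. The sketch assumes that each node's port list holds only the ports found while walking this subnet, and that port number 255 selects all ports.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: port number 255 selects "all ports" in a counters query.
ALL_PORTS = 255


@dataclass
class Port:
    lid: int   # LID through which this port is reached
    num: int   # port number on the node


@dataclass
class Node:
    kind: str          # "Switch", "CA" or "Router" (hypothetical labels)
    ports: List[Port]  # only ports discovered in this subnet


def plan_queries(nodes: List[Node]) -> List[Tuple[int, int]]:
    """Return the (lid, port) pairs to pass to a counter-query tool."""
    queries = []
    for node in nodes:
        if not node.ports:
            # Nothing was discovered on this node in this subnet.
            continue
        if node.kind == "Switch":
            # One query covers the whole switch; its ports are not
            # queried individually.
            queries.append((node.ports[0].lid, ALL_PORTS))
        else:
            # CA and router ports are queried one by one, so ports that
            # are disconnected or attached to another subnet are skipped
            # simply because they never appear in node.ports.
            queries.extend((p.lid, p.num) for p in node.ports)
    return queries
```

With this split, the extra queries come only from multiport CAs and routers, while the per-switch cost stays at one query.
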
> > One can always go back and dive down to the port level after seeing
> > which nodes are of interest.
>
> The problem is that one can get an invalid error report with such a script,
> for example when a CA has a "bad" port which is connected to another subnet.
It is a bad port; just not in this particular subnet.
-- Hal
> Sasha