[openib-general] [PATCH] osm: fix num of blocks of GUIDInfo GetTable query

Hal Rosenstock halr at voltaire.com
Sat Jun 10 14:07:33 PDT 2006


On Sat, 2006-06-10 at 17:02, Eitan Zahavi wrote:
> Hi Hal,
> 
> When is a complete fix expected?
> Meanwhile osmtest on a large enough cluster is not passing due to the huge
> number of GUID blocks...
> 
> If the full fix is not anticipated soon, can we have the simple fix applied
> first?

Sure. Let me know if this is also needed on the 1.0 branch.

-- Hal

> Eitan Zahavi
> Senior Engineering Director, Software Architect
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
> 
> 
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:halr at voltaire.com]
> > Sent: Saturday, June 10, 2006 9:11 PM
> > To: Eitan Zahavi
> > Cc: OPENIB
> > Subject: Re: [PATCH] osm: fix num of blocks of GUIDInfo GetTable query
> > 
> > Hi Eitan,
> > 
> > On Sat, 2006-06-10 at 13:12, Eitan Zahavi wrote:
> > > Hal Rosenstock wrote:
> > > > Hi Eitan,
> > > >
> > > > On Thu, 2006-06-08 at 07:24, Eitan Zahavi wrote:
> > > >
> > > >>Hi Hal
> > > >>
> > > >>I'm working on passing osmtest check. Found a bug in the new
> > > >>GUIDInfoRecord query: If you had a physical port with zero guid_cap
> > > >>the code would loop on blocks 0..255 instead of trying the next port.
> > > >
> > > >
> > > > OK; that's definitely a problem.
> > > >
> > > >
> > > >>I am still looking for why we might have a guid_cap == 0 on some
> > > >>ports.
> > > >
> > > >
> > > > PortInfo:GuidCap is not used for switch external ports.
> > > >
> > > >
> > > >>This patch resolves this new problem. osmtest passes on some arbitrary
> > > >>networks.
> > > >>
> > > >>Eitan
> > > >>
> > > >>Signed-off-by:  Eitan Zahavi <eitan at mellanox.co.il>
> > > >>
> > > >>Index: opensm/osm_sa_guidinfo_record.c
> > > >>===================================================================
> > > >>--- opensm/osm_sa_guidinfo_record.c	(revision 7703)
> > > >>+++ opensm/osm_sa_guidinfo_record.c	(working copy)
> > > >>@@ -255,6 +255,10 @@ __osm_sa_gir_create_gir(
> > > >>       continue;
> > > >>
> > > >>     p_pi = osm_physp_get_port_info_ptr( p_physp );
> > > >>+
> > > >>+    if ( p_pi->guid_cap == 0 )
> > > >>+      continue;
> > > >>+
> > > >
> > > >
> > > > I think the right fix is to detect switch external ports and use the
> > > > VLCap from port 0 rather than from the switch external port (unless that
> > > > concept is broken in which case it should return 0 records).
> > > I think switch external ports do not have any PortGUID assigned to them since
> > > they are not "end port" (i.e. addressable).
> > 
> > Right; that's what I said earlier in a different way (PortGUID is not
> > used for switch external ports).
> > 
> > > So I think this patch is good enough.
> > 
> > I think it's better (an improvement) but not a complete fix for this
> > issue.
> > 
> > > What if a port reports guid_cap == 0?
> > 
> > Is that legal ? Shouldn't any port where GUIDCap is valid have a non
> > zero GUIDCap ? On any port where GUIDCap is not used (e.g. invalid), it
> > should be ignored.
> > 
> > > (I understand it is illegal for addressable port
> > > but for the SM it is probably better not to assume all ports are legal...)
> > 
> > That's my point on what a complete fix for this would include.
> > 
> > -- Hal
> > 
> > > EZ
> > > >
> > > > -- Hal
> > > >
> > > >
> > > >>     num_blocks = p_pi->guid_cap / 8;
> > > >>     if ( p_pi->guid_cap % 8 )
> > > >>       num_blocks++;
> > > >>
> > >
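
For reference, the block-count arithmetic from the quoted patch can be
sketched in isolation. This is only a minimal standalone sketch, not the
OpenSM code: the function names, the GUIDS_PER_BLOCK constant and the plain
array of guid_cap values are hypothetical stand-ins for the p_pi->guid_cap
field of each physical port, and guid_cap is assumed to be an 8-bit value.
It shows the round-up division by 8 and the "skip ports with guid_cap == 0"
check that the simple fix adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the "8" used in the quoted patch
 * (num_blocks = guid_cap / 8, rounded up). */
#define GUIDS_PER_BLOCK 8u

/* Number of blocks needed to cover guid_cap GUID entries.
 * Equivalent to: n = cap / 8; if (cap % 8) n++; */
unsigned num_guid_blocks(uint8_t guid_cap)
{
    return (guid_cap + GUIDS_PER_BLOCK - 1u) / GUIDS_PER_BLOCK;
}

/* Sum the block counts over a set of ports, skipping any port that
 * reports guid_cap == 0, as the simple fix does with "continue".
 * guid_caps[] is an illustrative stand-in for the per-port PortInfo. */
unsigned total_guid_blocks(const uint8_t *guid_caps, size_t num_ports)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < num_ports; i++) {
        if (guid_caps[i] == 0)
            continue; /* nothing to report for this port */
        total += num_guid_blocks(guid_caps[i]);
    }
    return total;
}
```

Note that this only mirrors the simple fix; it does not attempt the fuller
handling of switch external ports discussed above.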




