[openib-general] opensm fails to bring up subnet..

Hal Rosenstock halr at voltaire.com
Fri Jun 3 12:05:14 PDT 2005


On Fri, 2005-06-03 at 15:03, Troy Benjegerdes wrote:
> On Fri, Jun 03, 2005 at 01:52:31PM -0400, Hal Rosenstock wrote:
> > Hi Troy,
> > 
> > On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote:
> > > I'm having intermittent problems with opensm. It seems that after a
> > > while IPoIB stops working, and if I restart opensm, it starts spitting
> > > out errors.
> > 
> > Please try the following workaround and let me know if this makes things
> > better.
> > 
> > -- Hal
> > 
> > Index: libvendor/osm_vendor_ibumad.c
> > ===================================================================
> > --- libvendor/osm_vendor_ibumad.c       (revision 2520)
> > +++ libvendor/osm_vendor_ibumad.c       (working copy)
> > @@ -402,7 +402,7 @@
> >  
> >         p_vend->p_log = p_log;
> >         p_vend->timeout = timeout;
> > -       p_vend->max_retries = OSM_DEFAULT_RETRY_COUNT;
> > +       p_vend->max_retries = 1;
> >  
> >         p_vend->umad_port_id = -1;
> >         p_vend->issmfd = -1;
> 
> No, it doesn't seem to help. To get anything to work at all, I seem to
> need to reload all the IB modules on every machine I want to use ipoib
> on.
> 
> There have been two times now I've been able to see about 4 ping
> packets, and then one of the arp entries seems to go away.
> 
> (On the sm machine, also the machine I am trying to ping)
> 10.40.5.213                      (incomplete)	ib0
> 
> (on another machine, trying to ping from..)
> 10.40.137.12		ether	00:00:04:04:FE:80:00C		ib0

That may be another issue. Are all your links active, and does OpenSM
appear to be behaving better now?
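
One way to check whether a link is active is to read the port state that
the kernel exposes through sysfs. The following is a minimal sketch, not
part of OpenSM: it assumes the state file lives at a path such as
/sys/class/infiniband/<device>/ports/<port>/state and holds a line like
"4: ACTIVE". The helper names parse_port_state and port_is_active are
hypothetical, and the device name in the usage note is only an example.

```c
#include <stdio.h>
#include <stdlib.h>

/* InfiniBand logical port states: 1 = DOWN, 2 = INIT, 3 = ARMED, 4 = ACTIVE. */
#define IB_PORT_STATE_ACTIVE 4

/*
 * Parse a sysfs-style port state line such as "4: ACTIVE\n".
 * Returns the leading numeric state, or -1 if the line does not
 * start with a number in the range 0..255.
 */
int parse_port_state(const char *line)
{
    char *end;
    long value;

    if (line == NULL)
        return -1;

    value = strtol(line, &end, 10);
    if (end == line || value < 0 || value > 255)
        return -1;

    return (int)value;
}

/*
 * Read the first line of the given state file (path layout is an
 * assumption, see above) and report whether the port is ACTIVE.
 * Returns 1 if active, 0 if in any other state, -1 on error.
 */
int port_is_active(const char *state_path)
{
    char buf[64];
    int state;
    FILE *f = fopen(state_path, "r");

    if (f == NULL)
        return -1;

    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    state = parse_port_state(buf);
    if (state < 0)
        return -1;

    return state == IB_PORT_STATE_ACTIVE;
}
```

Calling port_is_active("/sys/class/infiniband/mthca0/ports/1/state") on
each node would then return 1 only for a port that has reached ACTIVE. A
port stuck in INIT (state 2) usually means the SM has not finished
bringing it up.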

-- Hal




