[Openib-windows] Interoperability of the current stack and the Linux 1.8.X stack

Tzachi Dar tzachid at mellanox.co.il
Mon Jul 10 10:46:23 PDT 2006

> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com] 
> Sent: Monday, July 10, 2006 7:43 PM
> To: Fab Tillier
> Cc: Tzachi Dar; openib-windows at openib.org
> Subject: Re: [Openib-windows] Interoperability of the current 
> stack and the Linux 1.8.X stack
> 
> On Mon, 2006-07-10 at 09:57, Fabian Tillier wrote:
> > Hi Tzachi,
> > 
> > On 7/10/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> > >
> > > Hi Fab,
> > >
> > > One of our customers has seen an issue that prevented the new code
> > > from joining clusters where old Linux installations exist.
> > 
> > Which Linux stack is this?  Any chance the customer could upgrade to
> > something less broken?  More below.
> > 
> > > The problem is related to joining existing broadcast groups on IPoIB.
> > >
> > > As the problem is in the old code, I have prepared a fix that is 
> > > dependent on a registry key and will not be run by default.
> > 
> > Why not just fix the old code?
> > 
> > > As for the second issue (parameters for joining a group 
> > > mcast_req.member_rec.mtu = 0; mcast_req.member_rec.rate = 0; ).
> 
> Don't you mean the component mask bits for those are 0 rather 
> than their values being 0 ? There are also other conditions 
> under which multicast groups are created.

In the IBAL implementation, the only way I found to set the component
mask bits to 0 was to set the values themselves to 0.
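
To illustrate what I mean, here is a minimal sketch of an implementation
that derives the component mask from which fields are non-zero. This is
not the actual IBAL code: the struct, the mask bit names and the helper
are hypothetical and heavily simplified. Under that assumption, zeroing a
field is the only way for the caller to clear its mask bit:

```c
#include <stdint.h>

/* Hypothetical, simplified member record; NOT the IBAL ib_member_rec_t. */
typedef struct {
    uint16_t mlid;
    uint8_t  mtu;   /* selector in the top 2 bits, value in the low 6 bits */
    uint8_t  rate;  /* selector in the top 2 bits, value in the low 6 bits */
} member_rec_sketch_t;

/* Illustrative mask bits; the real SA component mask uses other positions. */
#define SKETCH_MASK_MLID (1u << 0)
#define SKETCH_MASK_MTU  (1u << 1)
#define SKETCH_MASK_RATE (1u << 2)

/* Build the component mask: a field is "specified" only if it is non-zero. */
uint32_t sketch_comp_mask(const member_rec_sketch_t *rec)
{
    uint32_t mask = 0;

    if (rec->mlid != 0)
        mask |= SKETCH_MASK_MLID;
    if (rec->mtu != 0)
        mask |= SKETCH_MASK_MTU;
    if (rec->rate != 0)
        mask |= SKETCH_MASK_RATE;

    return mask;
}
```

With a helper like this, copying the broadcast record into the join
request carries the MTU and rate bits along, and clearing mtu and rate
before the join is what removes them from the mask.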

> -- Hal
> 
> > > This is the way Linux gen 2 does it, so I suggest that we
> > > always do it this way.
> > >
> > > Thanks
> > > Tzachi
> > >
> > > Index: ipoib_port.c
> > > ===================================================================
> > > --- ipoib_port.c (revision 1524)
> > > +++ ipoib_port.c (working copy)
> > > @@ -4813,7 +4813,8 @@
> > >   IPOIB_ENTER( IPOIB_DBG_MCAST );
> > >
> > >   /* Check that the rate is realizable for our port. */
> > > - if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) )
> > > + if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) &&  
> > > + (g_ipoib.bypass_check_bcast_rate == 0))
> > 
> > I'm confused by this change.  Why would it not be correct to check 
> > that the local port rate is greater than or equal to the broadcast 
> > group's rate?  This doesn't even seem like a stack issue, but rather
> > an SM issue.
> > 
> > >   {
> > >    /*
> > >     * The MC group rate is higher than our port's rate.  Log an error
> > > @@ -4825,7 +4826,7 @@
> > >     EVENT_IPOIB_BCAST_RATE, 2,
> > >     (uint32_t)(p_member_rec->rate & 0x3F),
> > >     (uint32_t)p_port->ib_mgr.rate );
> > > -  return IB_ERROR;
> > > +   return IB_ERROR;
> > >   }
> > >
> > >   /* Join the broadcast group. */
> > > @@ -5226,6 +5227,8 @@
> > >   mcast_req.member_rec = p_port->ib_mgr.bcast_rec;
> > >   /* Clear fields that aren't specified in the join */
> > >   mcast_req.member_rec.mlid = 0;
> > > + mcast_req.member_rec.mtu = 0;
> > > + mcast_req.member_rec.rate = 0;
> > 
> > I don't see why this is needed.  The MTU and rate must be the same as
> > the broadcast group, so this would only be a problem if the broadcast
> > group was returned with invalid information.  Why is this a problem?
> > This again seems like an SM issue.
> > 
> > - Fab
> > 
> > _______________________________________________
> > openib-windows mailing list
> > openib-windows at openib.org
> > http://openib.org/mailman/listinfo/openib-windows
> > 
> 
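
For anyone reading the first hunk of the patch in isolation, the modified
rate check can be sketched as a standalone function. This is only an
illustration under stated assumptions: the helper name and its parameters
are hypothetical and do not exist in ipoib_port.c. The bypass_check
argument stands in for the registry-driven
g_ipoib.bypass_check_bcast_rate flag from the patch, and the comparison
is done on the raw encoded values, as in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The low 6 bits of the rate byte hold the rate value; the top 2 bits
 * hold the selector, which is masked off as in the patch (& 0x3F). */
#define SKETCH_RATE_VALUE_MASK 0x3Fu

/*
 * Hypothetical helper, not part of ipoib_port.c.
 * Returns true when the join should be refused because the broadcast
 * group's rate is higher than the local port's rate. When bypass_check
 * is non-zero the check is skipped and the join is never refused here.
 */
bool sketch_bcast_rate_too_high(uint8_t port_rate,
                                uint8_t group_rate_field,
                                uint32_t bypass_check)
{
    uint8_t group_rate = (uint8_t)(group_rate_field & SKETCH_RATE_VALUE_MASK);

    if (bypass_check != 0)
        return false;

    return port_rate < group_rate;
}
```

With the flag left at its default of 0 the behavior is the same as the
existing check, so only installations that set the registry key are
affected.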