[Openib-windows] Interoperability of the current stack and the Linux 1.8.X stack
Tzachi Dar
tzachid at mellanox.co.il
Mon Jul 10 10:46:23 PDT 2006
> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com]
> Sent: Monday, July 10, 2006 7:43 PM
> To: Fab Tillier
> Cc: Tzachi Dar; openib-windows at openib.org
> Subject: Re: [Openib-windows] Interoperability of the current
> stack and the Linux 1.8.X stack
>
> On Mon, 2006-07-10 at 09:57, Fabian Tillier wrote:
> > Hi Tzachi,
> >
> > On 7/10/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> > >
> > > Hi Fab,
> > >
> > > > One of our customers has seen an issue that prevents the new code
> > > > from joining clusters where old Linux installations exist.
> >
> > Which Linux stack is this? Any chance the customer could upgrade to
> > something less broken? More below.
> >
> > > The problem is related to joining existing broadcast groups on IPoIB.
> > >
> > > As the problem is in the old code, I have prepared a fix that is
> > > dependent on a registry key and will not be run by default.
> >
> > Why not just fix the old code?
> >
> > > As for the second issue (parameters for joining a group
> > > mcast_req.member_rec.mtu = 0; mcast_req.member_rec.rate = 0; ).
>
> Don't you mean the component mask bits for those are 0 rather
> than their values being 0 ? There are also other conditions
> under which multicast groups are created.
In the IBAL implementation, the only way I found to set the component
masks to 0 was to set the values themselves to 0.
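The behaviour described above can be sketched roughly as follows. This is
an illustration only, not the IBAL code: the struct, the function name and
the mask bit positions are hypothetical and chosen for the example. It
assumes the join path derives each component-mask bit from whether the
corresponding field is non-zero, so zeroing mtu and rate leaves their bits
clear.

```c
#include <stdint.h>

/* Hypothetical stand-in for the join-request fields discussed here. */
typedef struct sketch_member_rec {
    uint16_t mlid;
    uint8_t  mtu;
    uint8_t  rate;
} sketch_member_rec_t;

/* Placeholder bit positions, NOT the values used by IBAL or the IB spec. */
#define SKETCH_MASK_MLID (1u << 0)
#define SKETCH_MASK_MTU  (1u << 1)
#define SKETCH_MASK_RATE (1u << 2)

/*
 * Assumed behaviour: a component-mask bit is set only when the
 * matching field carries a non-zero value.
 */
static uint32_t sketch_comp_mask(const sketch_member_rec_t *rec)
{
    uint32_t mask = 0;

    if (rec->mlid != 0)
        mask |= SKETCH_MASK_MLID;
    if (rec->mtu != 0)
        mask |= SKETCH_MASK_MTU;
    if (rec->rate != 0)
        mask |= SKETCH_MASK_RATE;

    return mask;
}
```

Under that assumption, clearing mlid, mtu and rate before the join yields
an empty mask for those three components.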
> -- Hal
>
> > > This is how Linux gen 2 does it, so I suggest that we always do
> > > it this way.
> > >
> > > Thanks
> > > Tzachi
> > >
> > > Index: ipoib_port.c
> > >
> ===================================================================
> > > --- ipoib_port.c (revision 1524)
> > > +++ ipoib_port.c (working copy)
> > > @@ -4813,7 +4813,8 @@
> > > IPOIB_ENTER( IPOIB_DBG_MCAST );
> > >
> > > /* Check that the rate is realizable for our port. */
> > > - if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) )
> > > + if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) &&
> > > + (g_ipoib.bypass_check_bcast_rate == 0))
> >
> > I'm confused by this change. Why would it not be correct to check
> > that the local port rate is greater than or equal to the broadcast
> > group's rate? This doesn't even seem like a stack issue, but rather
> > an SM issue.
> >
> > > {
> > > /*
> > > > * The MC group rate is higher than our port's rate. Log an error
> > > > @@ -4825,7 +4826,7 @@
> > > EVENT_IPOIB_BCAST_RATE, 2,
> > > (uint32_t)(p_member_rec->rate & 0x3F),
> > > (uint32_t)p_port->ib_mgr.rate );
> > > - return IB_ERROR;
> > > + return IB_ERROR;
> > > }
> > >
> > > /* Join the broadcast group. */
> > > @@ -5226,6 +5227,8 @@
> > > mcast_req.member_rec = p_port->ib_mgr.bcast_rec;
> > > /* Clear fields that aren't specified in the join */
> > > mcast_req.member_rec.mlid = 0;
> > > + mcast_req.member_rec.mtu = 0;
> > > + mcast_req.member_rec.rate = 0;
> >
> > I don't see why this is needed. The MTU and rate must be the same as
> > the broadcast group, so this would only be a problem if the broadcast
> > group was returned with invalid information. Why is this a problem?
> > This again seems like an SM issue.
> >
> > - Fab
> >
> > _______________________________________________
> > openib-windows mailing list
> > openib-windows at openib.org
> > http://openib.org/mailman/listinfo/openib-windows
> >
>
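For readers following the first hunk of the quoted patch, the condition it
changes can be looked at in isolation. The sketch below is not taken from
ipoib_port.c: the function and macro names are hypothetical, and the only
things borrowed from the patch are the 0x3F mask on the rate field and the
idea of a flag that skips the check when non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* The patch masks the member record's rate field with 0x3F. */
#define SKETCH_RATE_MASK 0x3F

/*
 * Returns true when the join would be refused: the port rate is below
 * the group's rate value and the bypass flag is not set. The bypass
 * flag plays the role of g_ipoib.bypass_check_bcast_rate in the patch.
 */
static bool sketch_rate_blocks_join(uint8_t port_rate,
                                    uint8_t member_rate_field,
                                    int bypass_check)
{
    uint8_t group_rate = (uint8_t)(member_rate_field & SKETCH_RATE_MASK);

    return (port_rate < group_rate) && (bypass_check == 0);
}
```

With the flag left at 0, the check behaves as it did before the patch;
setting it to a non-zero value lets the join proceed even when the group
rate value exceeds the port's.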