[Openib-windows] Interoperability of the current stack and the Linux 1.8.X stack
Erez Cohen
erezc at mellanox.co.il
Mon Jul 10 08:26:02 PDT 2006
Hi Fab,
I think all gen1 and probably early gen2 releases had these issues. Since
they were released some time ago, there is not much we can do to modify or
fix them. I'm not sure about other SMs.
The reason people are still using these releases is OS support and
stability. New releases do not support "old" OSes and kernels.
Since this fix is not active by default and requires activation via a
registry key, I think it is not that risky, and it will allow users to work
with the Windows stack alongside the Linux stack.
Best regards,
Erez Cohen
Field Application and Support Engineer
Mellanox Technologies Ltd.
Tel : + 972 - 4 - 9097200 ext 378
Cell : + 972 - 54 - 5468801
Fax : + 972 - 4 - 9593245
www.mellanox.com
-----Original Message-----
From: openib-windows-bounces at openib.org
[mailto:openib-windows-bounces at openib.org] On Behalf Of Fabian Tillier
Sent: Monday, July 10, 2006 4:58 PM
To: Tzachi Dar
Cc: openib-windows at openib.org
Subject: Re: [Openib-windows] Interoperability of the current stack and
the Linux 1.8.X stack
Hi Tzachi,
On 7/10/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
>
> Hi Fab,
>
> One of our customers has seen an issue that prevented the new code from
> joining clusters where old Linux installations exist.
Which Linux stack is this? Any chance the customer could upgrade to
something less broken? More below.
> The problem is related to joining existing broadcast groups on
> IPOIB.
>
> As the problem is in the old code, I have prepared a fix that is
> dependent on a registry key and will not be run by default.
Why not just fix the old code?
> As for the second issue (the parameters for joining a group:
> mcast_req.member_rec.mtu = 0; mcast_req.member_rec.rate = 0;),
> this is what Linux gen2 does, so I suggest that we always
> do it this way.
>
> Thanks
> Tzachi
>
> Index: ipoib_port.c
> ===================================================================
> --- ipoib_port.c (revision 1524)
> +++ ipoib_port.c (working copy)
> @@ -4813,7 +4813,8 @@
> IPOIB_ENTER( IPOIB_DBG_MCAST );
>
> /* Check that the rate is realizable for our port. */
> - if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) )
> + if( p_port->ib_mgr.rate < (p_member_rec->rate & 0x3F) &&
> + (g_ipoib.bypass_check_bcast_rate == 0))
I'm confused by this change. Why would it not be correct to check that
the local port rate is greater than or equal to the broadcast group's
rate? This doesn't even seem like a stack issue, but rather an SM
issue.
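For readers following the hunk, here is a minimal standalone sketch of the check as the patch would leave it. The struct and function names are hypothetical stand-ins, not the real ipoib_port.c types; only the 0x3F mask, the numeric comparison, and the bypass flag name come from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state touched by the patch. */
typedef struct {
    uint8_t port_rate;               /* plays the role of p_port->ib_mgr.rate */
    int     bypass_check_bcast_rate; /* plays the role of g_ipoib.bypass_check_bcast_rate */
} rate_check_cfg_t;

/*
 * Returns true when the join may proceed.
 * With the bypass flag at 0 (the default in the patch) this is the
 * original check: the port rate must not be below the group rate.
 * With the flag nonzero the comparison is skipped entirely.
 */
static bool bcast_rate_ok(const rate_check_cfg_t *cfg, uint8_t member_rec_rate)
{
    assert(cfg != NULL);

    /* Same mask as the diff: keep only the low six bits of the field. */
    uint8_t group_rate = (uint8_t)(member_rec_rate & 0x3F);

    if (cfg->bypass_check_bcast_rate != 0)
        return true;

    return cfg->port_rate >= group_rate;
}
```

With the flag left at 0 the behavior is unchanged from revision 1524, which is why the change only matters on fabrics where the broadcast group reports a rate the port considers too high.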
> {
> /*
> * The MC group rate is higher than our port's rate. Log an error
> @@ -4825,7 +4826,7 @@
> EVENT_IPOIB_BCAST_RATE, 2,
> (uint32_t)(p_member_rec->rate & 0x3F),
> (uint32_t)p_port->ib_mgr.rate );
> - return IB_ERROR;
> + return IB_ERROR;
> }
>
> /* Join the broadcast group. */
> @@ -5226,6 +5227,8 @@
> mcast_req.member_rec = p_port->ib_mgr.bcast_rec;
> /* Clear fields that aren't specified in the join */
> mcast_req.member_rec.mlid = 0;
> + mcast_req.member_rec.mtu = 0;
> + mcast_req.member_rec.rate = 0;
I don't see why this is needed. The MTU and rate must be the same as
the broadcast group, so this would only be a problem if the broadcast
group was returned with invalid information. Why is this a problem?
This again seems like an SM issue.
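To make the second hunk concrete, here is a rough sketch of what the join request looks like after the proposed change. The record type is a hypothetical, trimmed-down stand-in (the real member record has more fields), and the helper name is made up for illustration; only the three cleared fields come from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced member record; not the real IBAL structure. */
typedef struct {
    uint16_t mlid;
    uint8_t  mtu;
    uint8_t  rate;
    uint32_t qkey;
} member_rec_sketch_t;

/*
 * Build the record sent in the join request from the broadcast group
 * record obtained earlier. mlid was already cleared before the patch;
 * the patch additionally clears mtu and rate. Other fields are copied.
 */
static member_rec_sketch_t build_join_rec(const member_rec_sketch_t *bcast_rec)
{
    assert(bcast_rec != NULL);

    member_rec_sketch_t req = *bcast_rec;
    req.mlid = 0;
    req.mtu  = 0;  /* added by the patch */
    req.rate = 0;  /* added by the patch */
    return req;
}
```

If the broadcast record already carries the group's true MTU and rate, copying them and clearing them should be equivalent from the group's point of view, which is the substance of the question above.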
- Fab