[Openib-windows] Voltaire IP Router interoperability patch

Guy Corem guyc at voltaire.com
Thu Jun 8 01:48:05 PDT 2006


Hi Fab,

The Voltaire router has two QPs.
One QP is used for the slow path and for communication with the router's
IPoIB interface.
The other QP is used for the fast path, i.e. when an IB node is talking
with an IP node (or vice versa).

Since the endpoints are keyed by GID and not by GID+QP (or IP),
the original IPoIB code keeps switching between the fast path QP and the
slow path QP.

The slow path QP is used to send ARP replies (and requests),
but the QPN carried in the ARP reply is that of the fast path QP.
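
To make the mismatch concrete, here is a minimal sketch of comparing the
QPN advertised in an ARP with the QPN the packet was actually sent from.
It assumes the standard 20-byte IPoIB hardware address layout (one
reserved/flags byte, a 24-bit QPN in network byte order, then the 16-byte
GID). The function names are made up for the example and are not taken
from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define IPOIB_HW_ADDR_LEN 20  /* 1 reserved/flags byte + 3 QPN bytes + 16 GID bytes */

/* Extract the 24-bit QPN from an IPoIB hardware address (network byte order). */
static uint32_t ipoib_hw_addr_qpn(const uint8_t hw[IPOIB_HW_ADDR_LEN])
{
    return ((uint32_t)hw[1] << 16) | ((uint32_t)hw[2] << 8) | (uint32_t)hw[3];
}

/*
 * Hypothetical helper: does the QPN advertised in the ARP payload match
 * the source QPN reported for the received packet (e.g. by the work
 * completion)? For the router case described above this is false,
 * because the ARP is sent from the slow path QP but advertises the
 * fast path QPN.
 */
static bool arp_qpn_matches_sender(const uint8_t hw[IPOIB_HW_ADDR_LEN],
                                   uint32_t wc_src_qpn)
{
    return ipoib_hw_addr_qpn(hw) == (wc_src_qpn & 0x00FFFFFFu);
}
```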

The code is constructed in such a way that the endpoint QPN is changed
from the initial slow path QPN to the fast path QPN, but never the
other way.
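
The one-way transition can be sketched roughly as follows. This is an
illustration only, not the patch code: the endpoint structure, the
is_router flag and the function name are assumptions, and the sketch
assumes the router endpoint simply adopts the QPN advertised in the ARP
and ignores the QP the ARP was sent from.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical endpoint record; the real driver structure differs. */
typedef struct endpoint {
    uint32_t qpn;        /* QPN currently used to reach this endpoint */
    bool     is_router;  /* assumed flag: GID belongs to the Voltaire router */
} endpoint_t;

/*
 * Update the endpoint QPN on receipt of an ARP.
 *   wc_qpn:  QPN the packet was actually sent from (slow path for the router)
 *   arp_qpn: QPN advertised in the ARP source hardware address (fast path)
 * Returns true if the endpoint QPN changed.
 */
static bool endpoint_update_qpn(endpoint_t *ep, uint32_t wc_qpn, uint32_t arp_qpn)
{
    /* Non-router endpoints follow the sending QP; the router follows the
     * advertised QPN, so a later packet from the slow path QP never
     * moves the endpoint back. */
    uint32_t wanted = ep->is_router ? arp_qpn : wc_qpn;

    if (ep->qpn == wanted)
        return false;

    ep->qpn = wanted;
    return true;
}
```

With this shape, a router endpoint that starts on the slow path QPN moves
to the fast path QPN on the first ARP and then stays there.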

It's a hack, and a better solution would be to reference endpoints using
the GID+QP pair. I've tried to localize the changes specifically to
Voltaire's router, and to keep the original code path otherwise intact.
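
For reference, keying by the GID+QP pair could look something like the
sketch below. The key type and comparison function are hypothetical and
only show the idea: two endpoints with the same GID but different QPNs
(such as the router's slow path and fast path QPs) become distinct
entries.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical composite key: 16-byte GID plus 24-bit QPN. */
typedef struct ep_key {
    uint8_t  gid[16];
    uint32_t qpn;      /* only the low 24 bits are significant */
} ep_key_t;

/*
 * Three-way comparison suitable for an ordered map or tree:
 * compare the GID first, then the QPN.
 * Returns <0, 0 or >0.
 */
static int ep_key_cmp(const ep_key_t *a, const ep_key_t *b)
{
    int c = memcmp(a->gid, b->gid, sizeof a->gid);
    if (c != 0)
        return c;

    uint32_t qa = a->qpn & 0x00FFFFFFu;
    uint32_t qb = b->qpn & 0x00FFFFFFu;
    return (qa > qb) - (qa < qb);
}
```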

If you need any more explanation, I'll be happy to provide it.

BTW:
In theory, the router's IPoIB interface should still be reachable, because
there is tunneling from the router's fast path QP to the slow path QP
(according to the destination IP). I say in theory because, in
practice, there is currently a bug in the router that prevents it. So
indeed, the current solution prevents communication with the router's
IPoIB interface once communication with an IP host has started.

Guy

-----Original Message-----
From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] On Behalf Of Fabian Tillier
Sent: Thursday, June 01, 2006 1:17 AM
To: Guy Corem
Cc: openib-windows at openib.org
Subject: Re: [Openib-windows] Voltaire IP Router interoperability patch

Hi Guy,

On 5/29/06, Guy Corem <guyc at voltaire.com> wrote:
>
> Hi Fab,
>
> The following patch solves an interoperability issue with the Voltaire
> Ethernet to InfiniBand gateway.
>
> Please apply.

I'm trying to understand what the patch is trying to do.

When performing the endpoint lookup, it seems that the QPN is ignored
(that is, it is not updated in the endpoint if the WC shows a
different QPN than the current one for that endpoint).

Then when receiving an ARP, the code flushes the endpoints as it used
to if the source GID and LID don't match.  The change revolves around
the QPN checks, and this is where I got confused.

The original code checked that the remote QPN in the WC matches the
QPN in the endpoint.  You added a check that for a Voltaire router
GID, both the QPN in the WC and the QPN in the endpoint match the QPN
in the IPoIB source hardware address's QPN field.  I don't understand
this.

Are there cases where the hardware address in the ARP has a different
QPN than the sending QP?  It seems this check is stricter than the
original check.  Is this the right check to do for all endpoints (that
is, check for consistency between the sending QP and the ARP payload)?

Can you explain a bit the behavior of the Voltaire router with respect
to ARPs and QPNs?

Thanks,

- Fab



