[openib-general] [RFC] IB address translation using ARP

Mon Oct 10 07:56:49 PDT 2005

Hi Tom,

On Sun, 2005-10-09 at 13:10, Tom Tucker wrote: 
> On Sun, 2005-10-09 at 07:57 -0700, Sean Hefty wrote:
> > >It is theoretically possible to support all this on an IPoIB based
> > >network. Multiple subnets, multiple routes to remote peers, ICMP
> > >redirect, multiple IP addresses for each physical interface, yada yada
> > >yada. But IMHO, the only way to do this would be to tie directly into
> > >the existing routing,  ARP, ICMP, etc... subsystems in Linux. Otherwise
> > >you'll end up recreating a gigantic (and I mean GIGANTIC) amount of
> > 
> > The current implementation ties into the standard Linux ARP tables.  If
> > connections were made over TCP/IP, using IPoIB, then I don't think that there
> > would be any issues.  The issues only arise because of the desire to use TCP/IP
> > network addresses over a non-TCP/IP network.
> > 
> > >code. This belief is why I've been a proponent of mapping GIDs to one
> > >and only one IP address and treating it for management purposes as the
> > >equivalent of an IP address. Without this, the whole mechanism for
> > >determining routes, etc.. breaks down. If you treat the GID like a MAC
> > >address -- it breaks, because a MAC address can have multiple IP
> > >addresses -- the observation that led to the conclusion that ATS was
> > >broken in the first place.
> > 
> > We should be able to handle the case where a GID has multiple IP addresses bound
> > to it.  But even if we added a 1:1 restriction, the connection over IB issue
> > still exists.
> 
> I agree, except for RARP.

Not sure what you mean by "except for RARP". Can you elaborate?

[snip...]

> > I
> > don't view a GID as an IP address because we're not sending and receiving IP
> > packets on the GID.  IPoIB treats GIDs as only part of a MAC address, which I
> > think is the proper view. 
> >
> > Anyway, returning to the original problem of connecting to an IB gateway if
> > given a destination IP address on a different subnet...  I'm slowly convincing
> > myself that either the CMA or AT should do this.  (I believe that the ib_addr
> > code will do this now, but still wasn't sure that we wanted it to.)
> > 
> 
> IMHO, you need a service separate from the CMA to do address
> translation. My (iWARP's) rationale for this is that there are two
> clients of the service, the CM and IP. For CM, you need it to elect a
> route and thereby a local interface. For IP you need it because routes
> change and ARP entries time out. 
> 
> BTW, can you educate me ... is the following what you're thinking:
> 
> On the client side...
> 
> - route is discovered by looking at the Linux routing table
> - local interface is IPoIB (looks at rdma_ptr embedded in netdev struct)
> - send ARP AT message over local IB interface

It's just a normal IPoIB ARP to the destination IP address initiated by
AT. (With ATS, it could have been an SA Get ServiceRecord as an
alternative).
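
As a rough illustration of the client-side steps quoted above (route lookup,
check that the chosen interface is IPoIB, then ARP for the destination), here
is a minimal sketch. It is not the CMA or ib_addr code: the Route class, the
rdma_device field (a stand-in for the rdma_ptr mentioned above), the function
names, and the sample addresses and device names are all made up for the
example.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    network: ipaddress.IPv4Network
    ifname: str
    # Hypothetical stand-in for the rdma_ptr embedded in the netdev struct;
    # None means the interface has no RDMA device behind it.
    rdma_device: Optional[str]


def select_route(routes: List[Route], dest: str) -> Optional[Route]:
    """Pick the most specific (longest-prefix) route that covers dest."""
    addr = ipaddress.IPv4Address(dest)
    candidates = [r for r in routes if addr in r.network]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.network.prefixlen)


def plan_arp(routes: List[Route], dest: str) -> Tuple[str, str]:
    """Return (interface, ARP target) for an address translation request."""
    route = select_route(routes, dest)
    if route is None:
        raise LookupError(f"no route to {dest}")
    if route.rdma_device is None:
        raise ValueError(f"{route.ifname} is not an RDMA-capable interface")
    # Per the description above: a normal ARP for the destination IP address,
    # sent on the interface the routing table selected.
    return route.ifname, dest


# Illustrative table: a default route on Ethernet and one IPoIB subnet.
ROUTES = [
    Route(ipaddress.IPv4Network("0.0.0.0/0"), "eth0", None),
    Route(ipaddress.IPv4Network("192.168.10.0/24"), "ib0", "mthca0"),
]
```

With this table, plan_arp(ROUTES, "192.168.10.7") selects ib0, while an
address only reachable through eth0 is rejected because no RDMA device sits
behind that interface.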

I think the current CMA code handles the client case above and the server
case, but not the (bridging) gateway case below.

> At the gateway...bridging to IP

> - ARP AT query received on IB interface
> - Lookup route to destination IP address in gateway's route table.
> - If next hop's Ethernet address is already known, it is returned
                  ^^^^^^^^
                  hardware (may not be ethernet)

> - Otherwise, local interface identified is IPoEthernet
> - New ARP query goes out on the local interface from the route
> - When response comes back, answer is returned.
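
The gateway flow quoted above could be sketched roughly as follows. This is a
synchronous simplification under stated assumptions: real ARP is asynchronous
and entries time out, and the GatewayResolver class and the two callbacks
(next_hop_for, send_arp) are hypothetical names, not existing kernel or
OpenIB interfaces.

```python
from typing import Callable, Dict, Optional, Tuple

# (interface name, next-hop IP) -> hardware address, which may or may not
# be an Ethernet address.
NeighborKey = Tuple[str, str]


class GatewayResolver:
    def __init__(
        self,
        next_hop_for: Callable[[str], NeighborKey],
        send_arp: Callable[[str, str], Optional[bytes]],
    ) -> None:
        self._next_hop_for = next_hop_for  # route table lookup (assumed)
        self._send_arp = send_arp          # issues an ARP query (assumed)
        self._cache: Dict[NeighborKey, bytes] = {}

    def resolve(self, dest_ip: str) -> Optional[bytes]:
        """Answer an AT query for dest_ip received on the IB interface."""
        key = self._next_hop_for(dest_ip)
        hw = self._cache.get(key)
        if hw is not None:
            return hw                      # next hop already known
        ifname, next_hop = key
        hw = self._send_arp(ifname, next_hop)  # new query on egress interface
        if hw is not None:
            self._cache[key] = hw          # remember the answer
        return hw


# Usage with stub callbacks; the addresses are illustrative only.
queries = []

def fake_route(dest_ip: str) -> NeighborKey:
    return ("eth0", "10.1.1.1")

def fake_arp(ifname: str, ip: str) -> Optional[bytes]:
    queries.append((ifname, ip))
    return bytes.fromhex("0002c9010203")

resolver = GatewayResolver(fake_route, fake_arp)
first = resolver.resolve("10.2.0.5")
second = resolver.resolve("10.2.0.5")
```

The second call is answered from the cache, so only one query goes out on the
egress interface.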

> At the gateway...bridging to IPoIB
> 
> - ARP AT message received on IB interface, delivered to AT
> - Lookup route to destination IP address in gateway's route table
> - If next hop's Ethernet address is already known, it is returned
> - otherwise, local interface identified in route is IPoIB
> - New ARP AT query goes out on the local interface
> - When response comes back, answer is returned.
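
When the gateway bridges to IPoIB, the answer it returns is an IPoIB hardware
address rather than an Ethernet one, and the GID is only part of it. A small
sketch of the 20-byte layout IPoIB uses (one reserved/flags byte, a 24-bit
QPN, then the 16-byte GID) is below. The function names are made up for the
example, and GIDs are parsed with the IPv6 address class purely for
convenience since both are 128-bit values.

```python
import ipaddress
import struct
from typing import Tuple

IPOIB_HW_ADDR_LEN = 20  # 1 flags byte + 3 QPN bytes + 16 GID bytes


def pack_ipoib_hwaddr(qpn: int, gid: str, flags: int = 0) -> bytes:
    """Build the 20-byte IPoIB link-layer address from a QPN and a GID."""
    if not 0 <= qpn < (1 << 24):
        raise ValueError("QPN must fit in 24 bits")
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in 8 bits")
    gid_bytes = ipaddress.IPv6Address(gid).packed  # 16 bytes, network order
    return struct.pack("!I", (flags << 24) | qpn) + gid_bytes


def unpack_ipoib_hwaddr(raw: bytes) -> Tuple[int, int, bytes]:
    """Split a 20-byte IPoIB address into (flags, qpn, gid_bytes)."""
    if len(raw) != IPOIB_HW_ADDR_LEN:
        raise ValueError("IPoIB hardware address must be 20 bytes")
    (word,) = struct.unpack("!I", raw[:4])
    return word >> 24, word & 0xFFFFFF, raw[4:]


# Illustrative values only.
example = pack_ipoib_hwaddr(0x000404, "fe80::2:c901:203:405")
```

Because several IP addresses can map to the same (QPN, GID) pair, this has
the one-to-many shape of a MAC address rather than of an IP address.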

-- Hal