[openib-general] Re: [PATCH] [ib_addr] generalize address to RDMA device translation

Tom Tucker tom at opengridcomputing.com
Tue Jan 3 12:24:57 PST 2006


On Tue, 2006-01-03 at 12:05 -0800, Sean Hefty wrote:
> Tom Tucker wrote:
> > ARP Resolve
> > 
> > The iWARP side needs to be able to resolve an IP address to an Ethernet
> > address. Today this is not done for iWARP and it works because the
> > AMSO1100 does this itself in the hardware. Other iWARP devices probably
> > don't. This means that the logic in ib_at needs to be extended on the
> > iWARP side to call neigh_event_send (instead of arp_send) to resolve an
> > IP to an Ethernet address.  The current method of calling arp_send
> > directly and "sniffing" for arp replies is probably not the best way to
> > go long term. It would be better to register for neighbor update events
> > (new mechanism) and be notified when the neighbor entry gets resolved.
> > This is better for two reasons: 1) it doesn't duplicate code already in
> > Linux, and 2) unlike IB, Ethernet MAC addresses may change for the next
> > hop while the connection is still active. The provider needs to know
> > this so its hardware ARP tables can be updated.
> 
> To be clear, the CMA uses ib_addr, and not ib_at, which is a different module.

Absolutely. I was dumping a bunch of loosely related concerns...
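
For illustration only, here is a minimal userspace sketch of the "register
for neighbor update events" idea from the quoted text above. It is not
kernel code and uses no existing kernel interface. Every name in it
(neigh_notifier, neigh_register, neigh_publish, hw_arp_entry,
provider_on_neigh_update) is hypothetical, and the one-entry ARP table is
an assumption made to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_LISTENERS 4

/* Hypothetical event payload: an IPv4 address and its resolved MAC. */
struct neigh_update {
    uint32_t ip;
    uint8_t mac[MAC_LEN];
};

typedef void (*neigh_cb)(const struct neigh_update *ev, void *ctx);

/* Hypothetical fixed-size listener list (stand-in for a notifier chain). */
struct neigh_notifier {
    neigh_cb cb[MAX_LISTENERS];
    void *ctx[MAX_LISTENERS];
    size_t count;
};

/* Register a listener; returns 0 on success, -1 if the list is full. */
static int neigh_register(struct neigh_notifier *n, neigh_cb cb, void *ctx)
{
    assert(n != NULL && cb != NULL);
    if (n->count == MAX_LISTENERS)
        return -1;
    n->cb[n->count] = cb;
    n->ctx[n->count] = ctx;
    n->count++;
    return 0;
}

/* Called by the simulated neighbor layer when an entry resolves or changes. */
static void neigh_publish(const struct neigh_notifier *n,
                          const struct neigh_update *ev)
{
    for (size_t i = 0; i < n->count; i++)
        n->cb[i](ev, n->ctx[i]);
}

/* Stand-in for a provider's hardware ARP table; one entry for brevity. */
struct hw_arp_entry {
    uint32_t ip;
    uint8_t mac[MAC_LEN];
    int valid;
};

/* Provider callback: refresh the entry only if the update matches its IP. */
static void provider_on_neigh_update(const struct neigh_update *ev, void *ctx)
{
    struct hw_arp_entry *e = ctx;

    if (e->ip != ev->ip)
        return;
    memcpy(e->mac, ev->mac, MAC_LEN);
    e->valid = 1;
}
```

The point of the shape is that the provider never sends or parses ARP
itself. It registers once and is called both on first resolution and on
any later change, which covers the re-arp case as well.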

> 
> I'm not sure I understand what's wrong with sniffing arp replies.  There's very 
> little code (about a dozen lines) in ib_addr to handle arps.  It also seems that 
> it's just as unlikely that the mapping from an IP address to a hardware address 
> will change for Ethernet as it is for IB.

Agreed -- It is unlikely. The more common case is a re-arp when the arp
entry times out (typically 15 minutes).


> Are you trying to deal with a destination IP address of a connection that is not 
> on the local subnet?  If this is the case, then this seems like a separate issue 
> from address resolution.

Yes, and no. The IP address being resolved is the peer if it is on the
same subnet. If it is not, then the IP address being resolved is for the
next hop.
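
As a rough sketch of that rule, assuming IPv4 and a single default
gateway: the struct route_info type and the ipv4() and addr_to_resolve()
helpers below are hypothetical names made up for this example, not
anything in ib_addr or the kernel routing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route description; all addresses are host-order IPv4. */
struct route_info {
    uint32_t local_addr;  /* address of the local interface */
    uint32_t netmask;     /* subnet mask of the local interface */
    uint32_t gateway;     /* next-hop router for off-subnet peers */
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Pick the IP address whose hardware address has to be resolved:
 * the peer itself when it is on the local subnet, otherwise the next hop.
 */
static uint32_t addr_to_resolve(const struct route_info *rt, uint32_t peer)
{
    assert(rt != NULL);
    if ((peer & rt->netmask) == (rt->local_addr & rt->netmask))
        return peer;
    return rt->gateway;
}
```

With a local address of 192.168.1.10/24 and a gateway of 192.168.1.1, a
peer at 192.168.1.77 is resolved directly, while a peer at 10.0.0.5
resolves to the gateway address instead.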

> 
> > ROUTE Changes
> > 
> > Two obvious cases, 1) the next hop changes due to normal network least-
> > cost routing, and 2) the user changes a route manually. Both events
> > would require the iWARP provider to be notified (via an event again) and
> > update its hardware.
> 
> Maybe this can be included as part of some sort of automatic "failover"? 
> Otherwise, I'm not sure how this functionality maps to IB.  It's not a big deal 
> if it doesn't, but it'd be nice to keep similarities where possible.

> > PathMTU
> > 
> > The new route to the remote peer has a hop with a smaller MTU than we're
> > currently using. Ouch! All my packets are going to be dropped until I
> > reduce my path MTU. The provider can't know unless he is either
> > filtering all ICMP traffic himself ("evil") or is notified via an event
> > ("nice"). 
> > 
> > So all this said, my little brain had imagined this logic going in and
> > around the ib_at module in a wonderfully crafted bit of algorithmic art
> > -- once I figured out how to do it all ;-)
> > 
> > It sounds like you're beating the same bushes. How would you like to
> > proceed?
> 
> I'd like to define a set of changes to ib_addr and the rdma_cm that makes it 
> easier to support multiple RDMA devices, then evolve the codebase from there. 
> My hope is to keep the network addressing ugliness in ib_addr.
> 
> The changes to the ib_addr interface are based on trying to determine what might 
> help support iWarp after looking at your patch.  If the changes appear to be a 
> step in the right direction, then I will commit them.  The essence of the change 
> is that ib_addr leaves the interpretation of the addresses up to the caller, 
> which may still be a good thing even if it doesn't directly make supporting 
> iWarp any easier.

My 2 cents is that it's a good thing. Sorry to throw 10 lbs of @#^$ in
with this bag... I was core dumping.

> 
> - Sean
