[ofa-general] patch to ib_addr for sending arps
Jason Gunthorpe
jgunthorpe at obsidianresearch.com
Mon Jul 13 13:20:06 PDT 2009
On Mon, Jul 13, 2009 at 10:14:05AM -0700, leo.tominna at oracle.com wrote:
> Hi Jason,
>
> Thanks for clearing up the use case. In that case doing ip_dev_find to set oif
> would be wrong since it would not work correctly in the case the same IP is
> associated with two devices. By just setting s_addr before calling
> ip_route_output_key in addr_send_arp, that should take care of it (the initial
> patch sent).
>
> From what I can tell, this just fixes the policy routing case, without
> affecting/addressing configurations that are using default routing. I need to
> see why RDS/IB gets stuck in this case. My guess is that hardware addresses
> don't get resolved correctly (as expected), and two sides of an IB connection
> trip over a mismatch in what hardware a peer thinks it's using.
>
> But that is another issue that can be fixed independently. I'll add some
> prints to see what might be happening.
So, I think they might be related; at least, the current arrangement
seems strange to my eyes.
There should be only one route lookup and it should not be in the
send_arp function.
The ip_dev_find (and related) in addr_resolve_local is the main
culprit. As far as I can see there should be one call to the route
function (ip_route_output_key?), and that result should replace
ip_dev_find; the dst, fl, etc. should be passed down to send_arp
and the other places that are calling ip_route_output_key. (This is
the original problem I noted in that old thread.)
Multiple lookups like this seem like they should be racy against table
updates.
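To illustrate the "look up once, pass the result down" shape, here is a
minimal user-space sketch. It is a toy model, not kernel code: every name
in it (toy_route, toy_route_result, toy_route_lookup, toy_bind_dev,
toy_send_arp_dev, toy_resolve) is hypothetical, and the exact-match table
only stands in for the real routing lookup. The point is that the bind
step and the arp step both read the same cached result, so a table update
between them cannot make them pick different devices.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a routing table entry (hypothetical, not a kernel type). */
struct toy_route {
    unsigned int dst;  /* destination address, exact match only */
    unsigned int src;  /* preferred source address */
    int          oif;  /* output interface index */
};

/* Result of the single lookup; handed to every later step. */
struct toy_route_result {
    unsigned int src;
    int          oif;
};

/* The one and only lookup. Returns 0 on success, -1 if nothing matches. */
static int toy_route_lookup(const struct toy_route *tbl, size_t n,
                            unsigned int dst, struct toy_route_result *res)
{
    assert(res != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].dst == dst) {
            res->src = tbl[i].src;
            res->oif = tbl[i].oif;
            return 0;
        }
    }
    return -1;
}

/* Bind step: takes the device from the cached result, no second lookup. */
static int toy_bind_dev(const struct toy_route_result *res)
{
    return res->oif;
}

/* ARP step: takes the device from the same cached result. */
static int toy_send_arp_dev(const struct toy_route_result *res)
{
    return res->oif;
}

/* Resolve: one lookup, then both steps consume that one result. */
int toy_resolve(const struct toy_route *tbl, size_t n, unsigned int dst,
                int *bind_dev, int *arp_dev)
{
    struct toy_route_result res;

    if (toy_route_lookup(tbl, n, dst, &res) != 0)
        return -1;

    *bind_dev = toy_bind_dev(&res);
    /* A table update here cannot affect the ARP step: it never re-reads tbl. */
    *arp_dev = toy_send_arp_dev(&res);
    return 0;
}
```

With a table holding one entry for the destination, toy_resolve reports
the same interface index for both steps. If each step did its own lookup
instead, the two answers would depend on whatever the table held at each
moment.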
So your patch makes the arp part use a potentially different device than
the bind part, which is no good. (Yes?)
Really, the best thing to do here is carefully look at the tcp and udp
call paths and duplicate their function calls into the route
module. It should be the same logic flow.
Jason