[ofa-general] patch to ib_addr for sending arps

leo.tominna at oracle.com leo.tominna at oracle.com
Mon Jul 13 13:35:08 PDT 2009


Right, there should only be one route lookup call.  And send_arp
should match what TCP/UDP are doing.  I'm pretty sure they don't use
neigh_event_send the way ib_addr does, or if they do, they are not using
ip_route_output_key to get the neighbor entry.  I could not find the
code that generates ARPs for these protocols.

addr_resolve_local should do the same, as you said.  I didn't exercise
that code when testing, though, since I was testing against a single,
stable remote node's IP.  I'll see if there's a better fix, then.
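
To make the race concern concrete, here is a toy userspace model in
Python.  It is only a sketch and not kernel code: RouteTable,
resolve_two_lookups, resolve_one_lookup and the ib0/ib1 device names
are all made up for illustration and do not correspond to anything in
ib_addr or the routing code.  It assumes the "bind" step and the "arp"
step each need an output device for the same destination, and that the
table can change between them.

```python
# Toy model only: none of these names exist in the kernel or in ib_addr.

class RouteTable:
    """Hypothetical routing table: destination IP -> output device name."""

    def __init__(self, routes):
        self._routes = dict(routes)

    def lookup(self, dst_ip):
        return self._routes[dst_ip]

    def update(self, dst_ip, dev):
        self._routes[dst_ip] = dev


def resolve_two_lookups(table, dst_ip, between=None):
    """Bind and ARP each do their own lookup.

    `between` simulates a table update landing in the gap between them.
    Returns (device used for bind, device used for ARP).
    """
    bind_dev = table.lookup(dst_ip)
    if between is not None:
        between()
    arp_dev = table.lookup(dst_ip)
    return bind_dev, arp_dev


def resolve_one_lookup(table, dst_ip, between=None):
    """One lookup; the result is passed down to both bind and ARP."""
    route_dev = table.lookup(dst_ip)
    if between is not None:
        between()
    return route_dev, route_dev


if __name__ == "__main__":
    dst = "192.168.0.2"

    table = RouteTable({dst: "ib0"})
    print(resolve_two_lookups(table, dst, lambda: table.update(dst, "ib1")))

    table = RouteTable({dst: "ib0"})
    print(resolve_one_lookup(table, dst, lambda: table.update(dst, "ib1")))
```

With two lookups, a table update in the gap leaves the bind and ARP
steps on different devices.  With one lookup whose result is passed
down, both steps agree by construction, even if the route is stale by
the time it is used.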

Thanks for your help.

Leo Tominna

On 7/13/2009 1:20 PM, Jason Gunthorpe wrote:
> On Mon, Jul 13, 2009 at 10:14:05AM -0700, leo.tominna at oracle.com wrote:
>   
>> Hi Jason,
>>
>> Thanks for clearing up the use case.  In that case doing ip_dev_find to set oif
>> would be wrong since it would not work correctly in the case the same IP is
>> associated with two devices. By just setting s_addr before calling
>> ip_route_output_key in addr_send_arp, that should take care of it (the initial
>> patch sent).
>>
>> From what I can tell, this just fixes the policy routing case, without
>> affecting/addressing configurations that are using default routing.  I need to
>> see why RDS/IB gets stuck in this case.  My guess is that hardware addresses
>> don't get resolved correctly (as expected), and two sides of an IB connection
>> trip over a mismatch in what hardware a peer thinks it's using.
>>
>> But that is another issue that can be fixed independently.  I'll add some
>> prints to see what might be happening.
>>     
>
> So, I think they might be related. At least, the current arrangement
> seems strange to my eyes.
>
> There should be only one route lookup and it should not be in the
> send_arp function.
>
> The ip_dev_find (and related) in addr_resolve_local is the main
> culprit. As far as I can see there should be one call to the route
> function (ip_route_output_key??) and that result should replace
> ip_dev_find and the dst and fl, etc should be passed down to send_arp
> and the other places that are calling ip_route_output_key. (This is
> the original problem I noted in that old thread)
>
> Multiple lookups like this seem like they should be racy against table
> updates.
>
> So your patch makes the arp part use a potentially different device than
> the bind part, which is no good. (yes?)
>
> Really, the best thing to do here is carefully look at the tcp and udp
> call paths and duplicate their function calls into the route
> module. It should be the same logic flow.
>
> Jason
>   