Re: [ofw] [IPoIB] Problem with "Avoid the SM" patch

Hal Rosenstock hal.rosenstock at gmail.com
Thu Sep 18 11:35:10 PDT 2008


On Thu, Sep 18, 2008 at 12:03 PM, Fab Tillier
<ftillier at windows.microsoft.com> wrote:
>>> I understood that Fab checked this issue (with 10 retries of a 1 second TO)
>>> and found that it didn't help there. Another thing to try is increasing
>>> the TO to 5 sec and sending fewer retries
>>
>> I think some exponential backoff strategy with some randomization
>> might be better.
>
> The problem with this is that the layers above IPoIB (namely the network stack generating ARP requests and expecting ARP responses) don't have visibility into this backoff strategy, and will give up on an ARP request if the response doesn't come back in time.  The response could be delayed for a long time if the SM isn't responding to queries in a timely manner, since IPoIB needs to resolve the path in order to send the unicast response.  I don't know the timeout for an ARP response, but I'd be surprised if it was 10 seconds, let alone whatever you would get with exponential backoff.
>
> I initially tried exponential backoff to resolve the problem I was seeing with these MPI apps, and it didn't work because of this.  That's when I set out on a path to take the SM out of the equation as much as possible.

Yes, I think we went through this before :-(

10 seconds is a long time to wait for the SM to respond anyhow.
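
As a rough illustration of the trade-off above (not code from the ofw
tree), here is a Python sketch of exponential backoff with randomization
that also adds up the worst-case total wait. The function names, the
doubling factor, the +/-50% jitter and the retry count are all assumptions
picked for illustration; the real ARP timeout of the network stack is not
known in this thread.

```python
import random


def backoff_delays(base=1.0, factor=2.0, retries=5, jitter=0.5, rng=None):
    """Return one timeout (in seconds) per attempt.

    Hypothetical helper: attempt i gets a nominal timeout of
    base * factor**i, randomized uniformly within +/- `jitter`
    (a fraction of the nominal value) so that many nodes do not
    retry against the SA at the same instant.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        nominal = base * factor ** attempt
        delays.append(nominal * rng.uniform(1.0 - jitter, 1.0 + jitter))
    return delays


def worst_case_wait(base=1.0, factor=2.0, retries=5, jitter=0.5):
    """Upper bound on the total time spent waiting if every attempt times out."""
    return sum(base * factor ** a * (1.0 + jitter) for a in range(retries))


if __name__ == "__main__":
    # Fixed scheme mentioned in the thread: 10 retries of 1 second each.
    fixed_total = 10 * 1.0
    print("fixed retries, total wait (s):", fixed_total)
    print("backoff, worst-case total wait (s):", worst_case_wait())
    print("one sampled backoff schedule (s):", backoff_delays())
```

With these assumed parameters, five attempts can already wait longer in
total than the 10 x 1 second scheme, which is the concern raised above:
the stack waiting on the ARP response knows nothing about this schedule.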

I understand why ofw is bypassing the SA but in the long term this is a
_bad_ thing IMHO. I think the real solution needs to be in increasing
the SA scalability. I don't think there's been any serious effort on
that. I'd also be concerned about whether all SM flavors in some
configuration need to support a certain sustained (and peak)
transaction rate.

-- Hal

>
> -Fab
>


