[openib-general] [PATCH] IB_CM: Limit the MRA timeout
Sean Hefty
mshefty at ichips.intel.com
Wed Oct 4 10:43:03 PDT 2006
Michael S. Tsirkin wrote:
>>There are several timeout values transferred and used by the cm, most notably the
>>remote cm response timeout and packet life time. Does it make more sense to
>>have a single, generic timeout maximum instead?
>
> Hmm. I'm not sure - we are working around an actual broken implementation here -
> what do you think?
I wasn't sure either. The MRA timeout is a combination of the packet life time
+ service timeout, which made me bring this up. The patch only handles the
service timeout portion, so we end up in the same situation if a large packet
life time is ever used.
>>Would it make more sense to
>>enable the maximum(s) by default, since we're dependent upon values received
>>over the network?
>
> I think it would.
So do I.
The CM has checks to bring out-of-range values into range, but at the maximum,
we get a timeout of about 2.5 hours. Multiply that by 15 retries, and the cm
can literally spend all day retrying a request.
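For reference, the arithmetic behind those numbers can be sketched as follows. IB encodes these timeouts as an exponent n meaning 4.096 us * 2^n; the helper names here are made up for illustration and are not the ones used in the cm code.

```c
#include <stdint.h>

/* IB timeout encoding: duration = 4.096 us * 2^n, with n at most 31.
 * Illustrative helper, not taken from the cm source. */
static uint64_t ib_timeout_to_ns(unsigned int n)
{
    return UINT64_C(4096) << n;   /* 4096 ns == 4.096 us */
}

/* Worst-case total wait, in seconds, if every retry runs to its timeout. */
static double ib_total_wait_sec(unsigned int n, unsigned int retries)
{
    return (double)ib_timeout_to_ns(n) * retries / 1e9;
}
```

With n = 31 a single timeout comes to roughly 8796 seconds (a bit under 2.5 hours), and 15 retries push the total past 36 hours.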
I was considering dropping the default maximum down to around 4-8 seconds, which
with retries still gives us about a minute to time out a request. The default
maximum would apply to local and remote cm timeouts, packet life time, and
service timeout, but could be overridden by the user. (Basically, with Ishai's
patch: rename mra_timeout_limit to timeout_limit, set to a default of 20, and
replace occurrences of '31' in the code with timeout_limit.)
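A rough sketch of the idea, not the actual patch: clamp each exponent to a single limit before converting it, so that both parts of the MRA wait are bounded. The variable timeout_limit stands in for the proposed parameter; the function names and the simple sum of the two parts are assumptions made for illustration.

```c
#include <stdint.h>

/* Stand-in for the proposed tunable; 20 gives about 4.3 seconds. */
static unsigned int timeout_limit = 20;

/* Bring an exponent received over the network under the limit. */
static unsigned int clamp_timeout(unsigned int n)
{
    return n > timeout_limit ? timeout_limit : n;
}

/* 4.096 us * 2^n, expressed in nanoseconds. */
static uint64_t exp_to_ns(unsigned int n)
{
    return UINT64_C(4096) << n;
}

/* MRA wait modeled as service timeout plus packet life time,
 * with both exponents clamped (assumed form, for illustration). */
static uint64_t mra_wait_ns(unsigned int service_timeout,
                            unsigned int packet_life_time)
{
    return exp_to_ns(clamp_timeout(service_timeout)) +
           exp_to_ns(clamp_timeout(packet_life_time));
}
```

With a limit of 20, each clamped value is at most about 4.3 seconds, so 15 retries of a single timeout come to roughly 64 seconds.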
- Sean