[openib-general] [PATCH] IB_CM: Limit the MRA timeout

Sean Hefty mshefty at ichips.intel.com
Wed Oct 4 10:43:03 PDT 2006


Michael S. Tsirkin wrote:
>>There are several timeout values transferred and used by the cm, most notably the 
>>remote cm response timeout and packet life time.  Does it make more sense to 
>>have a single, generic timeout maximum instead?
> 
> Hmm. I'm not sure - we are working around an actual broken implementation here -
> what do you think?

I wasn't sure either.  The MRA timeout is the sum of the packet life time and 
the service timeout, which is what made me bring this up.  The patch only limits 
the service timeout portion, so we end up in the same situation if a large packet 
life time is ever used.

>>Would it make more sense to 
>>enable the maximum(s) by default, since we're dependent upon values received 
>>over the network?
> 
> I think it would.

So do I.

The CM has checks to bring out-of-range values into range, but at the maximum, 
we get a timeout of about 2.5 hours.  Multiply that by 15 retries, and the CM 
can literally spend all day retrying a request.
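
For reference, a rough sketch of the arithmetic behind those numbers.  It 
assumes the usual IB encoding, where a timeout field is a 5-bit exponent and the 
duration is 4.096 usec * 2^value.  The function names are made up for 
illustration and are not taken from the cm code.

```c
#include <stdint.h>

/* Assumption: timeout fields are 5-bit exponents, and the encoded
 * duration is 4.096 usec * 2^value, i.e. 4096 ns shifted left by value. */
#define IB_TIMEOUT_FIELD_MAX 31u

/* Hypothetical helper: encoded timeout value -> nanoseconds. */
static uint64_t ib_timeout_ns(unsigned int value)
{
	/* Bring out-of-range values into range, as the CM checks do. */
	if (value > IB_TIMEOUT_FIELD_MAX)
		value = IB_TIMEOUT_FIELD_MAX;
	return (uint64_t)4096 << value;
}

/* Hypothetical helper: total time spent if every retry times out. */
static uint64_t ib_retry_total_ns(unsigned int value, unsigned int retries)
{
	return ib_timeout_ns(value) * retries;
}
```

With value 31 this comes to 2^43 ns, or roughly 8796 seconds per attempt, and 
15 attempts add up to about 36 hours.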

I was considering dropping the default maximum down to around 4-8 seconds, which 
with retries still gives us about a minute to time out a request.  The default 
maximum would apply to the local and remote cm timeouts, the packet life time, and 
the service timeout, but could be overridden by the user.  (Basically, with Ishai's 
patch: rename mra_timeout_limit to timeout_limit, set it to a default of 20, and 
replace occurrences of '31' in the code with timeout_limit.)
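
Something along these lines is what I have in mind.  This is only a sketch, 
not the actual patch: timeout_limit is the renamed parameter from above, the 
helper names are hypothetical, and the 4.096 usec * 2^value encoding is 
assumed.  The MRA calculation is simplified to the sum of the two limited 
components.

```c
#include <stdint.h>

/* Proposed default from this mail; the user could override it. */
static unsigned int timeout_limit = 20;

/* Hypothetical helper: limit an encoded timeout received over the network. */
static unsigned int cm_limit_timeout(unsigned int value)
{
	return value > timeout_limit ? timeout_limit : value;
}

/* Assumed encoding: 4.096 usec * 2^value, computed in nanoseconds. */
static uint64_t cm_limited_timeout_ns(unsigned int value)
{
	return (uint64_t)4096 << cm_limit_timeout(value);
}

/* Simplified MRA timeout: packet life time + service timeout, both limited,
 * so a large packet life time cannot bypass the limit. */
static uint64_t cm_limited_mra_ns(unsigned int packet_life_time,
				  unsigned int service_timeout)
{
	return cm_limited_timeout_ns(packet_life_time) +
	       cm_limited_timeout_ns(service_timeout);
}

/* Total time spent if every retry of a request times out. */
static uint64_t cm_limited_retry_total_ns(unsigned int value,
					  unsigned int retries)
{
	return cm_limited_timeout_ns(value) * retries;
}
```

With the limit at 20, a received value of 31 is treated as 2^32 ns, or about 
4.3 seconds, and 15 retries take about 64 seconds.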

- Sean



