[openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

James Lentini jlentini at netapp.com
Mon Jun 13 15:33:56 PDT 2005



On Mon, 13 Jun 2005, Hal Rosenstock wrote:

halr> On Wed, 2005-06-08 at 17:53, James Lentini wrote: 
halr> > On Wed, 8 Jun 2005, Hal Rosenstock wrote:
halr> > 
halr> > halr> On Wed, 2005-06-08 at 11:44, James Lentini wrote:
halr> > halr> > We interpreted the above to mean "give the connection protocol as 
halr> > halr> > much time as it needs to establish a connection, but don't mask 
halr> > halr> > errors (no path to the remote node, etc.)". For that reason we changed 
halr> > halr> > the variable name to DAT_TIMEOUT_MAX.
halr> > halr> 
halr> > halr> But if the REQ is lost, the timeout is really really long (longer than
halr> > halr> most will wait for an error). 
halr> > 
halr> > If a user doesn't want to wait DAT_TIMEOUT_MAX time, it can pass a 
halr> > smaller amount of time to dat_ep_connect. Does this satisfy your 
halr> > requirements?
halr> 
halr> Is it intended that the only way out is via user intervention (e.g.
halr> ctl-C) ? If one connection attempt (REQ) is made and it is lost, then
halr> there is no chance of it completing and the user needs to intervene. 

Why does the user need to intervene? Did I misunderstand the CM 
API? 

When dapl_ep_connect() is called with a timeout value of 
DAT_TIMEOUT_MAX, DAPL passes ib_send_cm_req the value 0x1F in the 
ib_cm_req_param structure's remote_cm_response_timeout field. My 
understanding was that this is the maximum timeout and that once it 
expires the CM will inform the user that the REQ timed out.
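
For reference, my reading of the IB spec is that the CM timeout fields 
are 5-bit exponents encoding 4.096 us * 2^n, so 0x1F works out to 
roughly 2.4 hours rather than forever. A minimal sketch of the 
arithmetic (cm_timeout_ns is a made-up helper name for illustration, 
not a CM API call):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a 5-bit CM timeout exponent into
 * nanoseconds, assuming the IB encoding of 4.096 us * 2^exp.
 * 4.096 us is 4096 ns, so the result is simply 4096 << exp.
 */
static uint64_t cm_timeout_ns(unsigned int exp)
{
    assert(exp <= 0x1F);        /* the field is only 5 bits wide */
    return 4096ULL << exp;
}
```

With exp = 0x1F this gives 2^43 ns, i.e. about 8796 seconds.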

halr> If that is the intended behavior, we are there. (This (lost REQ) 
halr> can occur when the timeout is non-infinite too).

We didn't intend for the active side to wait forever if a REQ was 
lost.

halr> 
halr> An alternative (as Sean suggested) is to continually retry (at a
halr> periodicity below the supplied timeout) until the time period specified
halr> expires. That seems to be better (at least to me and Sean) in terms of
halr> handling the lost REQ case. As retries is not part of the API for
halr> connect, I would presume the implementor is free to do what they want under
halr> the covers of dapl_ib_connect.

You're correct.
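
As a rough sketch of what that could look like under dapl_ib_connect: 
split the caller's timeout across a fixed number of REQ attempts and 
pick the largest per-attempt exponent that still fits. The helper name 
and the idea of a fixed attempt count are assumptions for illustration, 
not existing kDAPL code, and the 4.096 us * 2^n encoding is my reading 
of the IB spec:

```c
#include <assert.h>
#include <stdint.h>

#define CM_TIMEOUT_EXP_MAX 0x1F    /* largest value of the 5-bit field */

/*
 * Hypothetical helper: given the caller's total timeout in ns and the
 * number of REQ attempts we are willing to make, return the largest
 * exponent such that attempts * (4096 ns << exp) <= total_ns.
 * Returns 0 (the smallest encodable timeout) if even that does not fit.
 */
static unsigned int pick_attempt_exp(uint64_t total_ns, unsigned int attempts)
{
    unsigned int exp = 0;
    uint64_t budget;

    assert(attempts > 0);
    budget = total_ns / attempts;    /* time available per attempt */

    /* Grow the exponent while the next step still fits the budget. */
    while (exp < CM_TIMEOUT_EXP_MAX && (4096ULL << (exp + 1)) <= budget)
        exp++;

    return exp;
}
```

For a 10 second timeout and 4 attempts this picks exponent 19 (about 
2.1 seconds per attempt), so a single lost REQ costs a couple of 
seconds instead of the whole timeout. The result would go in 
remote_cm_response_timeout, with the retransmissions done either by 
the CM or by DAPL itself.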

halr> 
halr> -- Hal
halr> 


