[openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
James Lentini
jlentini at netapp.com
Mon Jun 13 15:33:56 PDT 2005
On Mon, 13 Jun 2005, Hal Rosenstock wrote:
halr> On Wed, 2005-06-08 at 17:53, James Lentini wrote:
halr> > On Wed, 8 Jun 2005, Hal Rosenstock wrote:
halr> >
halr> > halr> On Wed, 2005-06-08 at 11:44, James Lentini wrote:
halr> > halr> > We interpreted the above to mean "give the connection protocol as
halr> > halr> > much time as it needs to establish a connection, but don't mask
halr> > halr> > errors (no path to the remote node, etc.)". For that reason we changed
halr> > halr> > the variable name to DAT_TIMEOUT_MAX.
halr> > halr>
halr> > halr> But if the REQ is lost, the timeout is really really long (longer than
halr> > halr> most will wait for an error).
halr> >
halr> > If a user doesn't want to wait DAT_TIMEOUT_MAX time, it can pass a
halr> > smaller amount of time to dat_ep_connect. Does this satisfy your
halr> > requirements?
halr>
halr> Is it intended that the only way out is via user intervention (e.g.
halr> Ctrl-C)? If one connection attempt (REQ) is made and it is lost, then
halr> there is no chance of it completing and the user needs to intervene.
Why does the user need to intervene? Did I misunderstand the CM
API?
When dapl_ep_connect() is called with a timeout value of
DAT_TIMEOUT_MAX, DAPL passes ib_send_cm_req the value 0x1F in the
ib_cm_req_param structure's remote_cm_response_timeout field. My
understanding was that this is the maximum timeout and that once it
expires the CM will inform the user that the REQ timed out.
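For reference, the IB CM timeout fields are 5-bit exponents, with the
time given by 4.096 us * 2^value, so 0x1F is long but finite. A minimal
sketch of that conversion follows; the helper name cm_timeout_seconds
is hypothetical and not part of the CM API:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IB CM timeout exponent to seconds.
 * The encoding is 4.096 us * 2^exp, with exp limited to 5 bits.
 * (Hypothetical helper for illustration only.)
 */
static double cm_timeout_seconds(uint8_t exp)
{
    assert(exp <= 0x1F);               /* field is 5 bits wide */
    return 4.096e-6 * (double)(1ULL << exp);
}
```

Under this encoding 0x1F works out to about 8796 seconds, roughly 2.4
hours, while an exponent of 20 is a little over 4 seconds.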
halr> If that is the intended behavior, we are there. (This (lost REQ)
halr> can even occur when the timeout is not infinite.)
We didn't intend for the active side to wait forever if a REQ was
lost.
halr>
halr> An alternative (as Sean suggested) is to continually retry (at a
halr> periodicity below the supplied timeout) until the time period specified
halr> expires. That seems to be better (at least to me and Sean) in terms of
halr> handling the lost REQ case. As retries is not part of the API for
halr> connect, I would presume the implementor is free to do what they want under
halr> the covers of dapl_ib_connect.
You're correct.
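A rough sketch of the retry approach described above, assuming a
hypothetical callback that sends one REQ and waits for a reply. None of
these names (send_req_fn, connect_with_retries, lossy_send) come from
DAPL or the CM; they only illustrate retrying at an interval shorter
than the user's timeout until the whole period has been used up:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical callback: send one REQ and wait up to wait_ms for a reply.
 * Returns 0 if a reply arrived, nonzero if this attempt timed out.
 */
typedef int (*send_req_fn)(uint32_t wait_ms, void *ctx);

/*
 * Retry the REQ every interval_ms until total_ms has elapsed.
 * Returns 0 on success, -1 once the whole period has expired.
 * *attempts reports how many REQs were sent.
 */
static int connect_with_retries(send_req_fn send_req, void *ctx,
                                uint32_t total_ms, uint32_t interval_ms,
                                unsigned *attempts)
{
    uint32_t elapsed = 0;

    assert(interval_ms > 0);
    *attempts = 0;
    while (elapsed < total_ms) {
        uint32_t wait = total_ms - elapsed;

        if (wait > interval_ms)
            wait = interval_ms;        /* never wait past the deadline */
        (*attempts)++;
        if (send_req(wait, ctx) == 0)
            return 0;                  /* reply received */
        elapsed += wait;               /* assume the attempt used its full wait */
    }
    return -1;                         /* user's timeout expired */
}

/* Example stub: drops the first *ctx REQs, then reports success. */
static int lossy_send(uint32_t wait_ms, void *ctx)
{
    unsigned *drops = ctx;

    (void)wait_ms;
    if (*drops > 0) {
        (*drops)--;
        return -1;
    }
    return 0;
}
```

With this shape a single lost REQ costs one interval rather than the
entire timeout, and the caller still gets an error once the period it
asked for has passed.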
halr>
halr> -- Hal
halr>