[openib-general] [PATCH] Initial CM implementation
Sean Hefty
sean.hefty at intel.com
Mon Jan 17 14:48:40 PST 2005
>Just a quick read through, more comments later:
>
> int local_cm_response_timeout:5;
> int flow_control:1;
>
>These should be "unsigned," not "int." A signed 1-bit int doesn't
>make much sense, and I think you'll probably run into sign trouble if
>someone passes a local_cm_response_timeout of 20 or something for the
>5-bit field.
I'll change these to unsigned.
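A minimal sketch of the sign problem described above. The struct names and helper functions are hypothetical; only the two field names and their widths come from the quoted code. The sketch spells out `signed int` because the signedness of a plain `int` bit-field is implementation-defined in C, and the exact value read back from an out-of-range signed store is implementation-defined as well (typically -12 for a stored 20 on two's-complement compilers).

```c
#include <assert.h>

/* Hypothetical structs for illustration only. */
struct params_signed {
	signed int local_cm_response_timeout:5;	/* holds -16..15 */
	signed int flow_control:1;		/* holds -1..0 */
};

struct params_unsigned {
	unsigned int local_cm_response_timeout:5;	/* holds 0..31 */
	unsigned int flow_control:1;			/* holds 0..1 */
};

/* Store a timeout in the signed 5-bit field and read it back. */
static int store_signed_timeout(int timeout)
{
	struct params_signed p = { 0 };

	p.local_cm_response_timeout = timeout;
	return p.local_cm_response_timeout;
}

/* Store a flag in the signed 1-bit field and read it back. */
static int store_signed_flow_control(int flag)
{
	struct params_signed p = { 0 };

	p.flow_control = flag;
	return p.flow_control;
}

/* Same operations on the unsigned fields. */
static int store_unsigned_timeout(int timeout)
{
	struct params_unsigned p = { 0 };

	p.local_cm_response_timeout = timeout;
	return p.local_cm_response_timeout;
}

static int store_unsigned_flow_control(int flag)
{
	struct params_unsigned p = { 0 };

	p.flow_control = flag;
	return p.flow_control;
}

/* The unsigned fields round-trip the values mentioned in the thread. */
static void check_unsigned_roundtrip(void)
{
	assert(store_unsigned_timeout(20) == 20);
	assert(store_unsigned_flow_control(1) == 1);
}
```

With the signed fields, a timeout of 20 cannot be read back as 20 and a flag of 1 cannot be read back as 1, so a caller comparing against the value it passed in would see a mismatch.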
>In cm_send_handler(), you have:
>
> case IB_WC_RESP_TIMEOUT_ERR:
> cm_process_send_timeout(msg);
> break;
>
>but can this ever happen? I thought that the MAD layer always treated
>CM MADs as unsolicited which means that responses are not matched with
>requests so timeouts don't happen. Or am I misunderstanding the MAD
>layer semantics?
My intent is to use the MAD layer's timeout and retry code. For example, the
CM sends a REQ, and the MAD layer retries the REQ on the CM's behalf. Once
a REP is received, the CM cancels the REQ. So the CM does the response
matching, but the timeouts and retries are handled by the MAD code.
There's a potential race between receiving a response and receiving a
timeout. I have a way to handle it, but I need to go back and see if
I set the right fields to do so.
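One possible way to tolerate the response/timeout race is to let the connection state decide which event wins. The sketch below is a simplified model and not the approach referred to above, which the thread does not describe. All names (`cm_conn`, `cm_send_req`, `cm_rep_received`, `cm_send_timeout`) are hypothetical and do not correspond to the real MAD or CM interfaces. A real implementation would also need locking around the state checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical connection states for this model. */
enum cm_state { CM_IDLE, CM_REQ_SENT, CM_REP_RCVD, CM_TIMED_OUT };

struct cm_conn {
	enum cm_state state;
	bool cancel_requested;	/* stand-in for asking the MAD layer to cancel */
};

/* The CM hands a REQ to the MAD layer, which owns the retries. */
static void cm_send_req(struct cm_conn *c)
{
	assert(c->state == CM_IDLE);
	c->state = CM_REQ_SENT;
	c->cancel_requested = false;
}

/* The CM matched a REP to its REQ: record it and request a cancel. */
static void cm_rep_received(struct cm_conn *c)
{
	if (c->state != CM_REQ_SENT)
		return;		/* duplicate REP, or REP after a timeout */
	c->state = CM_REP_RCVD;
	c->cancel_requested = true;
}

/*
 * Send completion reporting a response timeout. Returns true if the
 * timeout was acted on, false if it lost the race with a REP and is
 * ignored as stale.
 */
static bool cm_send_timeout(struct cm_conn *c)
{
	if (c->state != CM_REQ_SENT)
		return false;
	c->state = CM_TIMED_OUT;
	return true;
}
```

Whichever event arrives first moves the connection out of `CM_REQ_SENT`, so the later event sees a different state and is dropped.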
>I see a lot of setting of state to TIMEWAIT, but I don't see where the
>TIMEWAIT timeout happens.
TIMEWAIT is a big todo. My intent is to use the same work queue that
receive handling uses.
Thanks for the comments.
- Sean