[openib-general] [PATCH] Initial CM implementation

Sean Hefty sean.hefty at intel.com
Mon Jan 17 14:48:40 PST 2005


>Just a quick read through, more comments later:
>
>	int	local_cm_response_timeout:5;
>	int	flow_control:1;
>
>These should be "unsigned," not "int."  A signed 1-bit int doesn't
>make much sense, and I think you'll probably run into sign trouble if
>someone passes a local_cm_response_timeout of 20 or something for the
>5-bit field.

I'll change these to unsigned.
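
To make the sign problem concrete, here is a minimal sketch.  Only the
two field names come from the patch; the struct and function names are
made up for illustration.  A signed 5-bit field can only hold -16..15,
so storing 20 is an out-of-range conversion whose result is
implementation-defined (it commonly reads back negative), while the
unsigned field holds 0..31 and reads back 20.

```c
/* Hypothetical structs: only the field names are taken from the patch. */
struct cm_param_signed {
	signed int local_cm_response_timeout:5;	/* holds -16..15 */
	signed int flow_control:1;		/* holds -1..0, never 1 */
};

struct cm_param_unsigned {
	unsigned int local_cm_response_timeout:5;	/* holds 0..31 */
	unsigned int flow_control:1;			/* holds 0..1 */
};

/* Store a timeout in the unsigned 5-bit field and read it back. */
unsigned int store_timeout_unsigned(unsigned int timeout)
{
	struct cm_param_unsigned p = { 0, 0 };

	p.local_cm_response_timeout = timeout;
	return p.local_cm_response_timeout;
}

/*
 * Same for the signed field.  Values outside -16..15 do not fit; the
 * result of storing them is implementation-defined.
 */
int store_timeout_signed(int timeout)
{
	struct cm_param_signed p = { 0, 0 };

	p.local_cm_response_timeout = timeout;
	return p.local_cm_response_timeout;
}

/* Store a boolean flag in the unsigned 1-bit field and read it back. */
unsigned int store_flag_unsigned(unsigned int flag)
{
	struct cm_param_unsigned p = { 0, 0 };

	p.flow_control = flag;
	return p.flow_control;
}
```

With the unsigned layout, a caller passing 20 for the timeout or 1 for
flow_control gets the same value back when the field is read.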

>In cm_send_handler(), you have:
>
>	case IB_WC_RESP_TIMEOUT_ERR:
>		cm_process_send_timeout(msg);
>		break;
>
>but can this ever happen?  I thought that the MAD layer always treated
>CM MADs as unsolicited which means that responses are not matched with
>requests so timeouts don't happen.  Or am I misunderstanding the MAD
>layer semantics?

My intent is to use the MAD layer's timeout and retry code.  For
example, the CM sends a REQ and the MAD layer retries it on the CM's
behalf.  Once a REP is received, the CM cancels the outstanding REQ
send.  So the CM does the response matching, but the timeouts and
retries are handled by the MAD code.

There's a potential race between receiving a response and receiving a
timeout.  I have a way to handle it, but I need to go back and see if
I set the right fields to do so.
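
One common way to resolve this kind of race is to let whichever event
arrives first change the connection state and have the loser see the
new state and do nothing.  The sketch below is a self-contained
illustration of that idea only.  Every name in it is hypothetical, it
does not use the MAD layer API, and it is not necessarily the approach
the CM code takes.  Real code would also need a lock around the state
checks.

```c
#include <stdbool.h>

/* Hypothetical states and types; not the actual CM or MAD layer code. */
enum req_state {
	REQ_IDLE,
	REQ_SENT,	/* REQ outstanding, retries owned by the send layer */
	REQ_REP_RCVD,	/* REP matched, outstanding send cancelled */
	REQ_TIMED_OUT	/* retries exhausted with no REP */
};

struct cm_conn {
	enum req_state state;
	unsigned int local_comm_id;	/* used to match a REP to its REQ */
};

/* Record that a REQ has been handed to the send layer. */
void conn_send_req(struct cm_conn *conn, unsigned int comm_id)
{
	conn->local_comm_id = comm_id;
	conn->state = REQ_SENT;
}

/*
 * Response matching is done here, by the CM.  Returns true if the REP
 * matched an outstanding REQ; the caller would then cancel the send.
 */
bool conn_handle_rep(struct cm_conn *conn, unsigned int comm_id)
{
	if (conn->state != REQ_SENT || conn->local_comm_id != comm_id)
		return false;	/* stale, duplicate, or unrelated REP */

	conn->state = REQ_REP_RCVD;
	return true;
}

/*
 * Called when the send layer reports a response timeout.  Returns true
 * if the timeout was acted on, false if a REP already won the race.
 */
bool conn_handle_send_timeout(struct cm_conn *conn)
{
	if (conn->state != REQ_SENT)
		return false;	/* REP arrived first; ignore the timeout */

	conn->state = REQ_TIMED_OUT;
	return true;
}
```

If the REP is processed first, a later timeout completion for the same
REQ is dropped; if the timeout is processed first, a late REP no longer
matches.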

>I see a lot of setting of state to TIMEWAIT, but I don't see where the
>TIMEWAIT timeout happens.

TIMEWAIT handling is still a big TODO.  My intent is to run the
TIMEWAIT timeout on the same work queue that receive handling uses.
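
As a rough illustration of what a queue-driven TIMEWAIT timeout could
look like, here is a small userspace sketch.  It is not kernel code and
does not use the kernel workqueue API; the queue, the tick counter, and
all names are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical types; a stand-in for a real work queue. */
enum tw_state { TW_ESTABLISHED, TW_TIMEWAIT, TW_IDLE };

struct tw_conn {
	enum tw_state state;
	unsigned long timewait_expires;	/* tick at which TIMEWAIT ends */
};

#define TW_QUEUE_MAX 8

struct tw_queue {
	struct tw_conn *items[TW_QUEUE_MAX];
	size_t count;
};

/* Move a connection to TIMEWAIT and queue it.  Returns -1 if full. */
int tw_enter(struct tw_queue *q, struct tw_conn *conn,
	     unsigned long now, unsigned long duration)
{
	if (q->count == TW_QUEUE_MAX)
		return -1;

	conn->state = TW_TIMEWAIT;
	conn->timewait_expires = now + duration;
	q->items[q->count++] = conn;
	return 0;
}

/*
 * Work handler: release every connection whose TIMEWAIT period has
 * expired and keep the rest queued.  Returns the number released.
 */
size_t tw_process(struct tw_queue *q, unsigned long now)
{
	size_t i, kept = 0, released = 0;

	for (i = 0; i < q->count; i++) {
		struct tw_conn *conn = q->items[i];

		if (now >= conn->timewait_expires) {
			conn->state = TW_IDLE;
			released++;
		} else {
			q->items[kept++] = conn;
		}
	}
	q->count = kept;
	return released;
}
```

The same handler that drains receive work could call something like
tw_process() on each pass, so no separate timer thread is needed.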

Thanks for the comments.

- Sean



