[openib-general] Re: [PATCH] [CM] 3/6 SDP updates to bind cm_id's to a device
Sean Hefty
mshefty at ichips.intel.com
Wed Sep 7 09:44:54 PDT 2005
Michael S. Tsirkin wrote:
> Could you please elaborate on why is an event driven model more reliable than
> the state driven one?
> It certainly seems to require more code: isnt cm_id->state set by CM to
> a valid value? It seems that cm needs to track connection state
> anyway as per "12.9.2 Invalid State Input Handling", to know which messages
> are legal. Is this done? If so why isnt it a good idea to reuse that state
> in SDP?
The state of the cm_id is controlled by the CM and can change at any time as a
result of processing a received MAD. It's only exposed for debug purposes.
>>+ result = ib_cm_listen(hca->listen_id,
>>+ cpu_to_be64(SDP_MSG_SERVICE_ID_VALUE),
>>+ cpu_to_be64(SDP_MSG_SERVICE_ID_MASK));
>>+ if (result) {
>>+ sdp_warn("Error <%d> listening for SDP connections", result);
>>+ goto error;
>>+ }
>>+
>> ib_set_client_data(device, &sdp_client, hca);
>>
>> return;
>
>
> Can a listen event arrive before we call ib_set_client_data?
Yes. We may need to swap those two statements.
>> error:
>>+ if (!IS_ERR(hca->listen_id))
>>+ ib_destroy_cm_id(hca->listen_id);
>>+
>> list_for_each_entry_safe(port, tmp, &hca->port_list, list) {
>> list_del(&port->list);
>> kfree(port);
>>@@ -1838,6 +1856,9 @@ static void sdp_device_remove_one(struct
>> return;
>> }
>>
>>+ if (!IS_ERR(hca->listen_id))
>>+ ib_destroy_cm_id(hca->listen_id);
>>+
>> list_for_each_entry_safe(port, tmp, &hca->port_list, list) {
>> list_del(&port->list);
>> kfree(port);
>
>
> I'd prefer separate labels for two errors, instead of testing IS_ERR twice.
So do I, but I was trying to match the coding style used throughout the SDP
code. Fixing the error handling seems like another set of changes to me.
>>+ if (event->event != IB_CM_REQ_RECEIVED) {
>>+ conn = sdp_conn_table_lookup(hashent);
>>+ if (conn)
>>+ sdp_conn_lock(conn);
>>+ else
>> return -EINVAL;
>>- }
>>+ /* Can this fail? Why not just set context = conn? */
>>+ }
>
>
> Regarding the question: the first thing we do in sdp_conn_put
> is sdp_conn_table_remove, so I think this lookup can fail.
I should have removed that comment. I'm not overly familiar with the SDP code,
but it seems wrong to need to verify that a requested callback can be executed.
By returning -EINVAL above, the cm_id associated with the callback will be
destroyed. This indicates to me that either that cm_id will be destroyed twice
by SDP (once when the SDP conn object is destroyed and again here), or that SDP
does not usually destroy cm_id's as part of its normal cleanup. I'm also
wondering about possible race conditions that could occur between calling
table_lookup and conn_lock, but I don't know the code well enough to say if one
exists.
>>- switch (cm_id->state) {
>>- case IB_CM_REQ_RCVD:
>>+ switch (event->event) {
>>+ case IB_CM_REQ_RECEIVED:
>> result = sdp_cm_req_handler(cm_id, event);
>> break;
>>- case IB_CM_REP_RCVD:
>>+ case IB_CM_REP_RECEIVED:
>> result = sdp_cm_rep_handler(cm_id, event, conn);
>> break;
>>- case IB_CM_IDLE:
>>+ case IB_CM_REQ_ERROR:
>>+ case IB_CM_REP_ERROR:
>>+ case IB_CM_REJ_RECEIVED:
>>+ case IB_CM_TIMEWAIT_EXIT:
>> result = sdp_cm_idle(cm_id, event, conn);
>> break;
>>- case IB_CM_ESTABLISHED:
>>+ case IB_CM_RTU_RECEIVED:
>>+ case IB_CM_USER_ESTABLISHED:
>> result = sdp_cm_established(cm_id, event, conn);
>> break;
>>- case IB_CM_DREQ_RCVD:
>>+ case IB_CM_DREQ_RECEIVED:
>> result = sdp_cm_dreq_rcvd(cm_id, event, conn);
>> if (result)
>> break;
>> /* fall through on success to handle state transition */
>>- case IB_CM_TIMEWAIT:
>>+ case IB_CM_DREQ_ERROR:
>>+ case IB_CM_DREP_RECEIVED:
>> result = sdp_cm_timewait(cm_id, event, conn);
>> break;
>> default:
>>- sdp_dbg_warn(conn, "Unexpected CM state <%d>", cm_id->state);
>>+ sdp_dbg_warn(conn, "Unhandled CM event <%d>", event->event);
>> result = -EINVAL;
>> }
>> /*
>
> Seems more code. Could you please elaborate on why is this more correct?
See comment above. Using the cm_id state is incorrect as it can change while a
client's callback is running.
- Sean
More information about the general
mailing list