[openib-general] [PATCH 3/3] iWARP CM

Tom Tucker tom at opengridcomputing.com
Thu Mar 9 12:55:06 PST 2006


On Thu, 2006-03-09 at 12:33 -0800, Sean Hefty wrote:
> >Well, it depends on our philosophy. If kernel users are trusted -- no.
> >If not, then we can either check them here and fail gracefully, or bug-
> >check the app later. Currently, _we_ are the app, so maybe we should
> >dump-em.
> 
> The philosophy is that kernel users are trusted.  A bug check lets us discover
> the problem earlier.

I agree. I have removed all these checks.... 
> 
>  > +		if (ret == 0) {
> >> > +			struct iw_cm_event event;
> >> > +			event.event = IW_CM_EVENT_LLP_DISCONNECT;
> >> > +			event.provider_id = cm_id->provider_id;
> >> > +			event.status = 0;
> >> > +			event.local_addr = cm_id->local_addr;
> >> > +			event.remote_addr = cm_id->remote_addr;
> >> > +			event.private_data = 0;
> >> > +			event.private_data_len = 0;
> >> > +			cm_event_handler(cm_id, &event);
> >>
> >> See comments in iw_cm_accept().
> >>
> >
> >> > +		}
> >> > +		cm_id->state = IW_CM_STATE_IDLE;
> >>
> >> Why is state set to CLOSING above, then set it to IDLE?  There may be a valid
> >> reason, I'm just trying to understand why.
> >>
> >
> >While in the client callback handler and through potential downcalls, I
> >wanted to have a separate state. IDLE means it can be reused. I thought
> >we should not be able to do so until after we've popped back up through
> >this call chain.... that was the idea anyway.
> 
Downcalls from the user and upcalls from the provider need to be handled
correctly if the both both peers try to disconnect concurrently or if
the app calls disconnect or any other function from the disconnect
handler. 

In this particular case, the close is synchronous, i.e. _after_ the call
to disconnect returns 0, the app can assume that the cm_id is in the
IDLE state and can be reused. 

When I resubmit, it will be clear -- hopefully. I also have a state
diagram I should probably put out there somewhere.

> If I understood correctly, cm_event_handler invokes a callback via another
> thread.  The state is being changed in a separate thread here.  We don't know
> what state the cm_id will be when the user's callback is invoked.  How can a
> user tell when the cm_id can be reused?
> 
> >> > +	switch (work->event.event) {
> >> > +	case IW_CM_EVENT_CONNECT_REQUEST:
> >> > +		rc = cm_conn_req_handler(work);
> >> > +		break;
> >> > +	case IW_CM_EVENT_CONNECT_REPLY:
> >> > +		rc = cm_conn_rep_handler(work);
> >> > +		break;
> >> > +	case IW_CM_EVENT_ESTABLISHED:
> >> > +		rc = cm_conn_est_handler(work);
> >> > +		break;
> >> > +	case IW_CM_EVENT_LLP_DISCONNECT:
> >> > +	case IW_CM_EVENT_LLP_TIMEOUT:
> >> > +	case IW_CM_EVENT_LLP_RESET:
> >> > +	case IW_CM_EVENT_CLOSE:
> >> > +		rc = cm_disconnect_handler(work);
> >> > +		break;
> >> > +	}
> >> > +
> >> > +	/* The reference was done in cm_event_handler. */
> >> > +	iwcm_deref_id(cm_id_priv);
> >>
> >> The cm_*_handlers need to release the reference in order to process
> >destruction.
> >>   That or use their return code to determine if the reference should be
> >> released, or if cleanup needs to be invoked.
> >What I'm going to do is have the return value specify whether or not the
> >cm_id should be destroyed. Also, the kfree of the work is being moved
> >here.
> 
> Okay - note that cm_conn_req_handler() needs to result in destroying the newly
> created cm_id, and not the one stored in work->cm_id, which should be the
> listen.

Yes. this is true. In fact, cm_conn_req_handler should never return a
non-zero value because the listener should not be destroyed due to an
error on the child.
> 
> - Sean




More information about the general mailing list