[openib-general] Re: [PATCH 3/3] iWARP CM - iWARPConnectionManager

Tom Tucker tom at opengridcomputing.com
Wed Mar 22 17:18:17 PST 2006


On Wed, 2006-03-22 at 16:35 -0800, Sean Hefty wrote:
> >> Tom, can you post more info on the various events and their relationship
> >> to the cm_id states?  Maybe that will help?
> >[] around a string represent client calls
> ><> around a string represent provider events
> 
> Thanks - just to be clear, my concern is that neither a client nor the iwcm try
> to access memory that has been freed.  I'm trying to limit a state/event model
> discussion to this scope.
> 
> For the client, this means that a callback is never invoked with a context that
> the user has freed.  For the user to know when they can free their context, my
> recommendation was to block iw_destroy_cm_id() until all outstanding callbacks
> had completed, and no new callbacks would be invoked.
> 
> For the iwcm, I was suggesting to use the same model, but I'm fine with an
> alternate approach, as long as it's relatively simple to verify its correctness.

It is the same model for the client, i.e. iw_destroy_cm_id blocks. The
special case is if the callback handler returns !0.

> 
> >IDLE [iw_cm_connect] -->
> >	CONN_SENT <CONNECT_REPLY(accept)> -->
> >		ESTABLISHED
> >
> >IDLE [iw_cm_listen] -->
> >	LISTENING <CONNECT_REQUEST> -->
> >		new_endpoint in CONN_RECV
> >
> >CONN_RECV [iw_cm_accept] -->
> >	CONN_RECV <ESTABLISHED**> --> ESTABLISHED
> >
> >CONN_RECV [iw_cm_reject] -->
> >	IDLE
> >
> >ESTABLISHED [iw_cm_destroy] --> DESTROYING
> >
> >ESTABLISHED <DISCONNECT>--> CLOSING		// normal close
> >
> >CLOSING     <CLOSE>	--> IDLE		// abortive close
> >
> >
> >** On iWARP there is no ESTABLISHED event in the provider. This
> >   event is generated by the IW CM to provide a vehicle for
> >   delivering the passive side connect complete event
> >   to the app via a callback.
> 
> Is it possible to receive either a DISCONNECT or CLOSE event between calling
> accept() and queuing the ESTABLISHED event?  It seems that it is.

No. Either 'event' in the adapter (receipt of FIN or RST) would result
in a subsequent failure of the accept or reject request and not the
delivery of an event through the provider because the connection is not
yet ESTABLISHED. There is, however, logic in the destroy path to deal
with the case of the client callback returning !0 without calling accept
or reject. However, this could be considered an application error...
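
For reference, the transitions quoted above can be written down as a
lookup table. This is only a sketch for discussion: the enum names, the
demo_next_state() helper and the -1 "not expected here" result are
made up for illustration and are not the iwcm code. The listener's own
state after <CONNECT_REQUEST> is not spelled out in the table, so that
row is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; states and inputs mirror the quoted table. */
enum demo_state {
	ST_IDLE, ST_LISTENING, ST_CONN_SENT, ST_CONN_RECV,
	ST_ESTABLISHED, ST_CLOSING, ST_DESTROYING
};

enum demo_input {
	CALL_CONNECT, CALL_LISTEN, CALL_ACCEPT, CALL_REJECT, CALL_DESTROY, /* [client calls] */
	EV_CONNECT_REPLY, EV_ESTABLISHED, EV_DISCONNECT, EV_CLOSE          /* <provider events> */
};

struct demo_transition {
	enum demo_state from;
	enum demo_input in;
	enum demo_state to;
};

static const struct demo_transition demo_table[] = {
	{ ST_IDLE,        CALL_CONNECT,     ST_CONN_SENT   },
	{ ST_CONN_SENT,   EV_CONNECT_REPLY, ST_ESTABLISHED }, /* reply carrying "accept" */
	{ ST_IDLE,        CALL_LISTEN,      ST_LISTENING   },
	{ ST_CONN_RECV,   CALL_ACCEPT,      ST_CONN_RECV   }, /* waits for ESTABLISHED */
	{ ST_CONN_RECV,   EV_ESTABLISHED,   ST_ESTABLISHED }, /* generated by the IW CM */
	{ ST_CONN_RECV,   CALL_REJECT,      ST_IDLE        },
	{ ST_ESTABLISHED, CALL_DESTROY,     ST_DESTROYING  },
	{ ST_ESTABLISHED, EV_DISCONNECT,    ST_CLOSING     },
	{ ST_CLOSING,     EV_CLOSE,         ST_IDLE        },
};

/* Return the next state, or -1 if the input is not in the table for this state. */
int demo_next_state(int state, int in)
{
	size_t i;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if ((int)demo_table[i].from == state && (int)demo_table[i].in == in)
			return (int)demo_table[i].to;
	}
	return -1;
}

/* Walk the passive side: accept leaves the endpoint in CONN_RECV
 * until the ESTABLISHED event is delivered. */
void demo_passive_walk(void)
{
	int s = ST_CONN_RECV;

	s = demo_next_state(s, CALL_ACCEPT);
	assert(s == ST_CONN_RECV);
	s = demo_next_state(s, EV_ESTABLISHED);
	assert(s == ST_ESTABLISHED);
}
```

In this model a DISCONNECT or CLOSE arriving in CONN_RECV simply has no
row, which matches the point above that such a condition shows up as a
failed accept or reject rather than as an event.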

> >I didn't think that IB worked this way either since the user can
> >'deallocate' the context by returning a non-zero value from a callback.
> 
> This results in calling the standard ib_destroy_cm_id() routine after returning
> from the callback.  The user is unable to call this routine directly from their
> callback because a reference is held on the cm_id while in the callback, which
> would result in deadlock.  The alternative is for the user to spawn a thread to
> call destroy.
> 
> >One more important thing to note is that the CLOSE event
> >holds the last reference on the cm_id and therefore a destroy
> >initiated in the event thread cannot wait until the refcount
> >goes to zero because the event that has this reference may
> >not have occurred yet and will deadlock the event thread. The
> >purpose of the destroy_flags in the cm_id_priv is to note
> >whether the client thread or the event thread initiated the
> >destruction of the cm_id. If it was the client thread, then
> >the CLOSE event will remove the last reference and wake up the
> >client thread that will then kfree the cm_id. If it was the
> >event thread (via a non-zero return value from a user callback),
> >then the CLOSE event will remove the last reference and kfree
> >the cm_id synchronously.
> 
> This is one of the areas that concerns me.  The thread acquiring the reference
> until the CLOSE event can occur is separate from the thread releasing it.  I'm
> unsure that the synchronization is there to ensure that the acquire will always
> occur before the release.

Actually, I think the event thread both acquires the reference and
releases it. The reference is acquired in one of cm_conn_est_handler or
cm_conn_rep_handler depending on whether it is the active or passive
side. The reference is released in cm_close_handler...all of which are
called on the event thread.

But...that said...I think the combination of event ordering and spin
locks makes this a coincidence and not a functional requirement. 
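
To make the destroy_flags idea concrete, here is a small single-threaded
model of "who frees the cm_id when CLOSE drops the last reference". All
names (demo_cm_id, demo_close_event, the DEMO_* values) are hypothetical
and the plain int refcount is an assumption that only holds because
nothing here runs concurrently; it is not the code from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Who started the teardown; stands in for the destroy flags. */
enum demo_destroyer { DEMO_DESTROY_NONE, DEMO_DESTROY_CLIENT, DEMO_DESTROY_EVENT };

/* What the CLOSE handling decided. */
enum demo_close_result {
	DEMO_KEPT,        /* references remain or no destroy is pending; id still valid */
	DEMO_WAKE_CLIENT, /* last reference gone; the blocked client thread frees the id */
	DEMO_FREED        /* last reference gone; id was freed here, do not touch it */
};

struct demo_cm_id {
	int refcount;                  /* plain int: this sketch is single-threaded */
	enum demo_destroyer destroyer;
};

/* Allocate an id holding one reference, standing in for the
 * reference that is kept until the CLOSE event arrives. */
struct demo_cm_id *demo_alloc(void)
{
	struct demo_cm_id *id = calloc(1, sizeof(*id));

	if (id)
		id->refcount = 1;
	return id;
}

/* Model of CLOSE handling: drop one reference, then decide who frees. */
enum demo_close_result demo_close_event(struct demo_cm_id *id)
{
	assert(id->refcount > 0);
	if (--id->refcount > 0)
		return DEMO_KEPT;

	switch (id->destroyer) {
	case DEMO_DESTROY_EVENT:
		/* Destroy came from the event thread (callback returned
		 * non-zero): nobody is waiting, so free synchronously. */
		free(id);
		return DEMO_FREED;
	case DEMO_DESTROY_CLIENT:
		/* Destroy came from the client thread, which is blocked
		 * waiting for this and will free the id itself. */
		return DEMO_WAKE_CLIENT;
	default:
		return DEMO_KEPT;
	}
}
```

The caller may only look at the id again when the result is not
DEMO_FREED. Real code would presumably need an atomic count and a wait
primitive in place of the return value; the sketch only shows the
decision that the flag encodes.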

> 
> - Sean



