[openib-general] Re: [PATCH 3/3] iWARP CM
Caitlin Bestler
caitlinb at broadcom.com
Thu Mar 16 10:16:17 PST 2006
openib-general-bounces at openib.org wrote:
> Tom Tucker wrote:
>> The iWARP CM prevents this from happening by having a state
>> (DESTROYING) that prevents events from being delivered after
>> iw_destroy_cm_id has returned. This approach avoids having either the
>> kernel application thread or the event thread blocked while waiting
>> for the last reference to be released.
>
> You need to consider destroy being called by a separate
> thread from the one processing events. An event can be
> generated, queued, and just about to callback the user when
> the user calls destroy. Place the event thread at the top of
> the user's callback routine. There's no way to halt the
> execution of the callback at this point. Now let the thread
> calling destroy execute and return to the user. The callback
> code is still running, but the user is not even aware at this point.
>
>> Unlike the IB side, the iWARP side has orderly shutdown semantics
>> that can delay the delivery of the CLOSE event for minutes. With this
>> implementation, life goes on and the object simply stays around until
>> the last reference is removed.
>
> Even in IB, there's a CM object that hangs around after the
> user has called destroy, and it has returned. This is fine;
> the user is unaware of this object.
>
>> Please look at the handling of events in cm_event_handler. If the
>> state is DESTROYING, events are not queued for delivery. This handles
>> events that are generated by the provider after iw_destroy_cm_id has
>> returned.
>
> The problem is when the user calls destroy at the same time
> that an event is being generated. If the event gets there
> first, a callback will run. Destroy does not wait for that
> callback to complete before returning. Hopefully, I've
> explained the situation a little better.
>
I agree that the protocol difference does nothing to address
the problem of an unreaped event that was generated by/for a
now-deceased object. The same problem exists for QPs.

There are at least two approaches to this: check for deleted
objects when reaping the event (and discard any event associated
with a deleted object), or defer the final deletion until all
completions *have* been reaped.
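
As a rough illustration of how the two approaches can be combined, here
is a minimal single-threaded sketch in C. Every name in it (cm_obj,
event_post, event_reap, obj_destroy, and so on) is hypothetical and is
not taken from the iWARP CM patch or any verbs API. It assumes each
queued event holds a reference on its object, and it leaves out the
locking a real implementation would need around the state and the
reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the iWARP CM API. */
enum obj_state { OBJ_ACTIVE, OBJ_DESTROYING };

struct cm_obj {
    enum obj_state state;
    int refcount;   /* one for the owner, plus one per unreaped event */
    int delivered;  /* number of events actually delivered */
    bool *freed;    /* demo only: set when the last reference drops */
};

struct event {
    struct cm_obj *obj;
};

static struct cm_obj *obj_create(bool *freed_flag)
{
    struct cm_obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->state = OBJ_ACTIVE;
    o->refcount = 1;            /* the owner's reference */
    o->delivered = 0;
    o->freed = freed_flag;
    *freed_flag = false;
    return o;
}

/* Drop one reference; the memory goes away only with the last one. */
static void obj_put(struct cm_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        *o->freed = true;
        free(o);
    }
}

/* Queue an event: it pins the object until the event is reaped. */
static struct event event_post(struct cm_obj *o)
{
    struct event ev = { .obj = o };

    o->refcount++;
    return ev;
}

/* Owner-side destroy: mark the object and return without waiting. */
static void obj_destroy(struct cm_obj *o)
{
    o->state = OBJ_DESTROYING;
    obj_put(o);                 /* drop the owner's reference */
}

/*
 * Reap one event. Approach 1: discard it if the object was destroyed.
 * Approach 2: the final free is deferred until this last put.
 * Returns true if the event was delivered, false if it was discarded.
 */
static bool event_reap(struct event *ev)
{
    struct cm_obj *o = ev->obj;
    bool deliver = (o->state == OBJ_ACTIVE);

    if (deliver)
        o->delivered++;         /* stands in for the user's callback */
    ev->obj = NULL;
    obj_put(o);
    return deliver;
}
```

In this sketch an event posted before obj_destroy() and reaped after it
is dropped rather than delivered, and the object is freed only when that
last event is reaped. It does not by itself cover a callback that is
already running when destroy is called; for that case destroy would have
to wait for the callback, or the callback would have to be prepared to
run after destroy has returned.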