[openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

Tue Aug 29 06:09:08 PDT 2006

Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> I believe that this tracking is done, and is reported to the user by the 
> timewait exit event.  QP transitions are the responsibility of the user.
> 
> This is related to a problem that Arlin and I have been discussing.  There's 
> nothing that the CM does to prevent the QP from being destroyed, especially for 
> a usermode application.  The CM invokes a callback once a connection exits 
> timewait, indicating to the user that the QP may now be destroyed.  But if an 
> application crashes, uverbs automatically destroys the QP.
> 
> We may need better coordination between the CM and verbs wrt timewait to handle 
> userspace QPs, but this depends on this change.

And userspace is not the only consumer affected: the CMA is also missing
timewait handling, and it is quite hard to fit it in there.

Here's an idea:
how about we move all timewait handling into the low-level driver,
starting a timer automatically when the QP is destroyed?

At least in mthca, this makes sense: an actual QP in the reset state
takes up resources, while all we need for a QP in timewait is a slot
in the QP table, to prevent the QP number from being reused.

So we'll be saving a lot of memory, and all ULPs will be much simpler,
since they'll be able to just destroy the QP and forget about it,
without the headache of TIMEWAIT_EXIT events.

This is actually very easy to implement: all we need is a per-device list of
QPNs to free, and a work structure to schedule delayed work on QP destroy.
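
To make this concrete, here is a minimal userspace sketch in C of the
bookkeeping described above. It is a model, not mthca code: every name
in it (tw_device, tw_qp_destroy, tw_reap, QPN_TABLE_SIZE) is
hypothetical, time is passed in as a plain tick counter, and a single
per-device timewait duration is assumed. In the kernel, tw_reap() would
presumably be the handler of delayed work scheduled on QP destroy (for
example with schedule_delayed_work()).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define QPN_TABLE_SIZE 64            /* hypothetical table size */

enum qpn_state { QPN_FREE = 0, QPN_ACTIVE, QPN_TIMEWAIT };

/* One QPN waiting out its timewait period. */
struct tw_entry {
    uint32_t qpn;
    uint64_t expires;                /* tick at which the QPN may be reused */
    struct tw_entry *next;
};

/* Per-device state: the QP table slots plus the list of QPNs to free. */
struct tw_device {
    enum qpn_state state[QPN_TABLE_SIZE];
    struct tw_entry *head, *tail;    /* FIFO; one fixed delay keeps it sorted */
    uint64_t timewait;               /* assumed per-device timewait, in ticks */
};

void tw_device_init(struct tw_device *dev, uint64_t timewait)
{
    memset(dev, 0, sizeof(*dev));
    dev->timewait = timewait;
}

/* Allocate the lowest free QPN; returns -1 if the table is full. */
int tw_qp_alloc(struct tw_device *dev)
{
    for (int i = 0; i < QPN_TABLE_SIZE; i++) {
        if (dev->state[i] == QPN_FREE) {
            dev->state[i] = QPN_ACTIVE;
            return i;
        }
    }
    return -1;
}

/*
 * Destroy a QP. The heavyweight QP resources (not modelled here) would be
 * released immediately; only the table slot is kept until timewait ends.
 * Returns 0 on success, -1 on a bad QPN or allocation failure.
 */
int tw_qp_destroy(struct tw_device *dev, uint32_t qpn, uint64_t now)
{
    struct tw_entry *e;

    if (qpn >= QPN_TABLE_SIZE || dev->state[qpn] != QPN_ACTIVE)
        return -1;
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->qpn = qpn;
    e->expires = now + dev->timewait;
    e->next = NULL;
    if (dev->tail)
        dev->tail->next = e;
    else
        dev->head = e;
    dev->tail = e;
    dev->state[qpn] = QPN_TIMEWAIT;
    return 0;
}

/* Delayed-work body: release every QPN whose timewait has expired. */
unsigned tw_reap(struct tw_device *dev, uint64_t now)
{
    unsigned freed = 0;

    while (dev->head && dev->head->expires <= now) {
        struct tw_entry *e = dev->head;

        dev->head = e->next;
        if (!dev->head)
            dev->tail = NULL;
        assert(dev->state[e->qpn] == QPN_TIMEWAIT);
        dev->state[e->qpn] = QPN_FREE;
        free(e);
        freed++;
    }
    return freed;
}
```

With this shape, a ULP only ever calls the destroy path; the QPN stays
reserved until tw_reap() runs past its expiry, so no TIMEWAIT_EXIT
handling is needed by the caller. Real timewait durations depend on the
connection, so a per-entry delay (and a sorted list) might be needed
instead of the single per-device value assumed here.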

Roland, what do you say?

-- 
MST