[ofa-general] Re: iwarp-specific async events

Steve Wise swise at opengridcomputing.com
Wed Apr 30 11:28:55 PDT 2008


Roland Dreier wrote:
>  > I'm looking for a good way to trigger iwarp QP flushing on a normal
>  > disconnect for user mode QPs.  The async event notification provider
>  > ops function is one way I can do it easily with the current
>  > infrastructure, if we add some new event types.   For example, if a
>  > fatal error occurs on a QP which causes the connection to be aborted,
>  > then the kernel driver will mark the user qp as "in error" and post a
>  > FATAL_QP event.  When the app reaps that event, the libcxgb3 async
>  > event ops function will flush the user's qp.  However for a normal non
>  > fatal close, no async event is posted.  But one should be.  The iWARP
>  > verbs specify many async event types that I think we need to add at
>  > some point.  Case in point:
>  > 
>  > LLP Close Complete  (qp event) - The TCP connection completed and no
>  > SQ WQEs were flushed (normal close)
>
> Yeah, it makes sense just to add any iWARP events that make sense and
> don't fit the existing set of IB events.  We already have IB-specific
> stuff for path migration etc.
>
>  > There is a whole slew of other events.  The above event, however, is
>  > key in that libcxgb3 could trigger a qp flush when this event is
>  > reaped by the application.  Currently, the flushing of the QP is only
>  > triggered by fatal connection errors as described above and/or if the
>  > application tries to post on a QP that has been marked in error by the
>  > kernel.   However, if the app does neither, then the flush never
>  > happens.  
>
> On the other hand, how does cxgb3 know when an application has reaped
> the event?  Do we need to add code to the uverbs module to know when an
> async event has reached userspace?
>
>   
I meant libcxgb3, not the kernel modules. The kernel driver knows the 
connection went down and that the qp needs flushing; it is the one that 
posted the async event. The driver just needs a way to kick the library 
into doing the flush, because the kernel driver cannot touch the user 
structs (without painful synchronization). So the library will discover 
this when the app reaps the async event, via the context ops async_event 
function that libcxgb3 registers.

Steve.
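
The flow described above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not libcxgb3 or libibverbs code:
every type and function name (the sketch_* identifiers, the
LLP_CLOSE_COMPLETE event value, the WQE counters) is hypothetical and
stands in for the real structures. The idea it shows is that the library
registers an async_event hook on the context, the hook runs when the
application reaps an event, and the hook performs the user-space flush for
both the fatal case and the proposed normal-close case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; LLP_CLOSE_COMPLETE models the proposed
 * iWARP "normal close" event, which does not exist in this form today. */
enum sketch_event_type {
    SKETCH_EVENT_QP_FATAL,
    SKETCH_EVENT_LLP_CLOSE_COMPLETE,
    SKETCH_EVENT_OTHER
};

/* Hypothetical user-space view of a QP owned by the provider library. */
struct sketch_qp {
    bool flushed;            /* true once the library has flushed the QP */
    int  pending_wqes;       /* work requests still sitting on the queues */
    int  flush_completions;  /* completions generated by flushing */
};

struct sketch_async_event {
    enum sketch_event_type type;
    struct sketch_qp *qp;
};

/* Hypothetical context with a provider hook, loosely modelled on the
 * "context ops async_event function" mentioned in the text. */
struct sketch_context {
    void (*async_event)(struct sketch_async_event *ev);
};

/* Library-side flush: complete every pending WQE with a flush status.
 * Here that is reduced to moving a counter. Safe to call repeatedly. */
static void sketch_flush_qp(struct sketch_qp *qp)
{
    assert(qp != NULL);
    if (qp->flushed)
        return;
    qp->flush_completions += qp->pending_wqes;
    qp->pending_wqes = 0;
    qp->flushed = true;
}

/* Provider hook: flush on a fatal error and on a normal close alike. */
static void sketch_provider_async_event(struct sketch_async_event *ev)
{
    switch (ev->type) {
    case SKETCH_EVENT_QP_FATAL:
    case SKETCH_EVENT_LLP_CLOSE_COMPLETE:
        if (ev->qp)
            sketch_flush_qp(ev->qp);
        break;
    default:
        break; /* other events need no flush */
    }
}

/* Stand-in for the point where the application reaps an async event:
 * the generic library calls the provider hook if one is registered. */
static void sketch_reap_async_event(struct sketch_context *ctx,
                                    struct sketch_async_event *ev)
{
    if (ctx->async_event)
        ctx->async_event(ev);
}
```

With this arrangement the kernel side only has to post the event; all
manipulation of the user QP structures stays in the library, in the
context of the application thread that reaps the event.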


