[ofa-general] iwarp-specific async events
Steve Wise
swise at opengridcomputing.com
Wed Apr 30 09:45:55 PDT 2008
Hey Roland,
I'm looking for a good way to trigger iwarp QP flushing on a normal
disconnect for user mode QPs. The async event notification provider ops
function is one way I can do it easily with the current
infrastructure, if we add some new event types. For example, if a
fatal error occurs on a QP which causes the connection to be aborted,
then the kernel driver will mark the user qp as "in error" and post a
FATAL_QP event. When the app reaps that event, the libcxgb3 async event
ops function will flush the user's qp. However, for a normal, non-fatal
close, no async event is posted. But one should be. The iWARP verbs
specify many async event types that I think we need to add at some
point. Case in point:
LLP Close Complete (qp event) - The TCP connection completed and no SQ
WQEs were flushed (normal close)
There is a whole slew of other events. The above event, however, is key
in that libcxgb3 could trigger a qp flush when this event is reaped by
the application. Currently, the flushing of the QP is only triggered by
fatal connection errors as described above and/or if the application
tries to post on a QP that has been marked in error by the kernel.
However, if the app does neither, then the flush never happens.
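
To make the flush triggers above concrete, here is a rough, self-contained model in plain C. It is not libcxgb3 or libibverbs code: every name (sketch_qp, sketch_async_event, sketch_post_send, the SKETCH_EVENT_* values) is made up for illustration, and it assumes the kernel has already marked the QP "in error" on a normal close as well as on a fatal one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types. SKETCH_EVENT_LLP_CLOSE_COMPLETE stands in
 * for the proposed new iWARP event; neither name is a real verbs enum. */
enum sketch_event {
    SKETCH_EVENT_QP_FATAL,
    SKETCH_EVENT_LLP_CLOSE_COMPLETE
};

/* Hypothetical user-mode view of a QP. */
struct sketch_qp {
    bool in_error; /* assumed to be set by the kernel driver */
    bool flushed;  /* set once the user-mode library has flushed */
};

/* Flush the QP at most once, and only if it is marked in error. */
static void sketch_flush_qp(struct sketch_qp *qp)
{
    assert(qp != NULL);
    if (qp->in_error && !qp->flushed)
        qp->flushed = true;
}

/* Stand-in for the provider's async event ops function, run when the
 * application reaps an event. Today only the fatal case exists; the
 * close-complete case is the proposed addition. */
static void sketch_async_event(struct sketch_qp *qp, enum sketch_event ev)
{
    switch (ev) {
    case SKETCH_EVENT_QP_FATAL:
    case SKETCH_EVENT_LLP_CLOSE_COMPLETE:
        sketch_flush_qp(qp);
        break;
    }
}

/* Stand-in for the post path: posting on a QP in error triggers the
 * flush and fails. Returns 0 on success, -1 if the QP is in error. */
static int sketch_post_send(struct sketch_qp *qp)
{
    assert(qp != NULL);
    if (qp->in_error) {
        sketch_flush_qp(qp);
        return -1;
    }
    return 0;
}
```

In this model, a QP that is in error but never sees a reaped event or a post stays unflushed, which is the gap described above.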
There are other ways to tackle this cxgb3 problem:
- enabling the providers to get a callback on rdma-cm event reaping. So
reaping the DISCONNECTED event would cause the qp to be flushed.
- I could hack this into the cxgb3 provider kernel driver so it can mark
a user mode CQ with state that tells it to go flush any QPs it owns that
are in error. Thus the next time the application polls, the poll logic
would go flush any qps in error.
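
For comparison, the second alternative (flagging the CQ so the next poll flushes) could look roughly like the following. Again this is only a sketch with hypothetical names (polled_qp, polled_cq, flush_pending, polled_cq_poll), not the cxgb3 data structures, and it assumes the CQ can enumerate the QPs attached to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POLLED_MAX_QPS 4

/* Hypothetical user-mode view of a QP. */
struct polled_qp {
    bool in_error; /* assumed to be set by the kernel driver */
    bool flushed;  /* set once the user-mode library has flushed */
};

/* Hypothetical user-mode view of a CQ and the QPs attached to it. */
struct polled_cq {
    bool flush_pending; /* assumed to be set by the kernel driver */
    struct polled_qp *qps[POLLED_MAX_QPS];
    size_t nqps;
};

/* Stand-in for the flush check at the top of the poll path. If the
 * kernel flagged the CQ, flush every attached QP that is in error and
 * clear the flag. Returns the number of QPs flushed by this call. */
static size_t polled_cq_poll(struct polled_cq *cq)
{
    size_t flushed = 0;

    assert(cq != NULL && cq->nqps <= POLLED_MAX_QPS);
    if (!cq->flush_pending)
        return 0;

    for (size_t i = 0; i < cq->nqps; i++) {
        struct polled_qp *qp = cq->qps[i];

        if (qp->in_error && !qp->flushed) {
            qp->flushed = true;
            flushed++;
        }
    }
    cq->flush_pending = false;
    return flushed;
}
```

The cost of this approach is an extra check on every poll, plus shared state between the kernel driver and the user-mode CQ.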
I'm opting for the simplest change, which I think is adding new async
events and changing the iwarp driver to post them at the right times.
Thoughts?
Thanks,
Steve.