[openib-general] design for communication established affiliated asynchronous event handling
Or Gerlitz
ogerlitz at voltaire.com
Thu Jun 15 01:26:22 PDT 2006
Sean Hefty wrote:
> James Lentini wrote:
>> The IBTA spec (volume 1, version 1.2) describes a communication
>> established affiliated asynchronous event.
>> We've seen this event delivered to our NFS-RDMA server and aren't sure
>> what to do with it.
> This event is delivered to the verbs consumer, since it occurs on the QP. It's
> expected that the consumer will call ib_cm_establish. Although, I would guess
> that you can probably ignore the event, under the assumption that the RTU will
> eventually be received by the local CM.
Sean,
The CMA/verbs consumer can't just ignore the event: its QP is still in
the RTR state, so an attempt to send a reply to the received message
would fail.
On the other hand, it can't call ib_cm_establish, since the CMA does not
expose an API for that. Nor can the CM register a callback to get this
event and emulate RTU reception, since the CMA is the one that creates
the QP, and the CMA consumer is the one that provides the qp_init_attr
along with the event handler.
I suggest the following design: the CMA would replace the event handler
provided in the qp_init_attr struct with a callback of its own and
keep the original handler/context in a private structure.
On delivery of an IB_EVENT_COMM_EST event, the CMA would call down to
the CM to emulate RTU reception (ib_cm_establish) and then call up to
the consumer's original handler. Typical CMA consumers would just ignore
this event, I think.
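
To make the handler-swapping idea concrete, here is a minimal userspace
sketch. Every name in it (mock_event, mock_qp_init_attr, mock_cma_id,
mock_cm_establish and so on) is a hypothetical stand-in, not the real
verbs/CMA definitions, and it assumes the CMA may also take over the
context pointer that is passed to the handler, which may not hold for
the real qp_context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real verbs API. */
enum mock_event_type { MOCK_EVENT_COMM_EST, MOCK_EVENT_QP_FATAL };

struct mock_event {
    enum mock_event_type type;
};

typedef void (*mock_event_handler)(struct mock_event *ev, void *context);

struct mock_qp_init_attr {
    mock_event_handler event_handler;
    void *qp_context;
};

/* Private per-ID state where the CMA keeps the consumer's handler. */
struct mock_cma_id {
    mock_event_handler user_handler;
    void *user_context;
    int establish_calls; /* number of emulated RTU receptions */
};

/* Stand-in for calling down to the CM (ib_cm_establish in the real stack). */
static void mock_cm_establish(struct mock_cma_id *id)
{
    id->establish_calls++;
}

/* The CMA's own handler: emulate the RTU on COMM_EST, then call up. */
static void mock_cma_qp_event(struct mock_event *ev, void *context)
{
    struct mock_cma_id *id = context;

    if (ev->type == MOCK_EVENT_COMM_EST)
        mock_cm_establish(id);
    if (id->user_handler)
        id->user_handler(ev, id->user_context);
}

/* At QP creation: save the consumer's handler and install the CMA's. */
static void mock_cma_wrap_handler(struct mock_cma_id *id,
                                  struct mock_qp_init_attr *attr)
{
    assert(id != NULL && attr != NULL);
    id->user_handler = attr->event_handler;
    id->user_context = attr->qp_context;
    id->establish_calls = 0;
    attr->event_handler = mock_cma_qp_event;
    attr->qp_context = id;
}

/* Example consumer handler that only counts the events it is shown. */
static void mock_consumer_handler(struct mock_event *ev, void *context)
{
    (void)ev;
    (*(int *)context)++;
}
```

After mock_cma_wrap_handler() runs, a COMM_EST event delivered through
attr.event_handler first reaches the CMA, which calls down to the CM
stand-in, and only then reaches mock_consumer_handler.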
The CM should allow ib_cm_establish to be called in the context in
which the event handler runs (or defer the handling to a higher
context). The CM must also ignore the actual RTU if it arrives later
than, or in parallel with, the call to ib_cm_establish.
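
One way to picture the "ignore the late RTU" requirement is a state
transition that can succeed only once, whichever of the two paths gets
there first. The sketch below is a userspace illustration using C11
atomics; the state names and functions are hypothetical and say nothing
about how the real CM does (or should do) its locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection states; the real CM has many more. */
enum mock_cm_state { MOCK_CM_REP_SENT, MOCK_CM_ESTABLISHED };

struct mock_cm_id {
    _Atomic int state;
    int established_events; /* times the consumer was told "established" */
};

static void mock_cm_id_init(struct mock_cm_id *id)
{
    assert(id != NULL);
    atomic_init(&id->state, MOCK_CM_REP_SENT);
    id->established_events = 0;
}

/* Move REP_SENT -> ESTABLISHED exactly once; later callers get false. */
static bool mock_cm_try_establish(struct mock_cm_id *id)
{
    int expected = MOCK_CM_REP_SENT;

    if (!atomic_compare_exchange_strong(&id->state, &expected,
                                        MOCK_CM_ESTABLISHED))
        return false; /* the other path already won; ignore this one */
    id->established_events++; /* only the single winner gets here */
    return true;
}

/* Path 1: the COMM_EST event handler emulates RTU reception. */
static bool mock_cm_establish(struct mock_cm_id *id)
{
    return mock_cm_try_establish(id);
}

/* Path 2: the real RTU arrives from the wire. */
static bool mock_cm_handle_rtu(struct mock_cm_id *id)
{
    return mock_cm_try_establish(id);
}
```

Whichever of mock_cm_establish() and mock_cm_handle_rtu() runs second
returns false and has no further effect, so the consumer is told about
the established connection only once.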
With this design the verbs consumer is guaranteed to always get
RDMA_CM_EVENT_ESTABLISHED, whether the RTU is just late or never
arrives. It can still get CQ RX completion(s) before the CMA
established event, though; in that case it can queue these completions
for the short window before the established event arrives and then
process them.
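
The consumer-side queuing could look roughly like the sketch below. It
is a single-threaded userspace illustration with hypothetical names
(mock_conn, mock_wc, MOCK_MAX_PENDING); a real consumer would need its
own locking and would size the queue from its receive queue depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_PENDING 16 /* arbitrary bound chosen for the sketch */

/* Hypothetical stand-in for a work completion. */
struct mock_wc {
    unsigned long wr_id;
};

struct mock_conn {
    bool established;
    struct mock_wc pending[MOCK_MAX_PENDING];
    size_t npending;
    size_t nprocessed;        /* completions actually handled so far */
    unsigned long last_wr_id; /* wr_id of the last handled completion */
};

/* Placeholder for the consumer's real receive processing. */
static void mock_process(struct mock_conn *c, const struct mock_wc *wc)
{
    c->nprocessed++;
    c->last_wr_id = wc->wr_id;
}

/* RX completion path: handle now if established, otherwise queue it.
 * Returns false only if the pending queue is full. */
static bool mock_on_rx_completion(struct mock_conn *c,
                                  const struct mock_wc *wc)
{
    if (c->established) {
        mock_process(c, wc);
        return true;
    }
    if (c->npending == MOCK_MAX_PENDING)
        return false;
    c->pending[c->npending++] = *wc;
    return true;
}

/* Established event path: mark the connection, then drain in order. */
static void mock_on_established(struct mock_conn *c)
{
    assert(c != NULL);
    c->established = true;
    for (size_t i = 0; i < c->npending; i++)
        mock_process(c, &c->pending[i]);
    c->npending = 0;
}
```

Completions that arrive early are kept in arrival order and handled as
soon as mock_on_established() runs; anything arriving afterwards is
processed immediately.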
A similar design was implemented in the Voltaire gen1 stack, and it
works in production with the iSER target and the VIBNAL (CFS Lustre NAL
for Voltaire gen1 IB) server side.
Does anyone know in what context (hard_irq, soft_irq, thread) the
event handlers are called?
Or.