[openib-general][PATCH][kdapl]: FMR and EVD patch
James Lentini
jlentini at netapp.com
Mon Aug 29 09:51:51 PDT 2005
Hi Guy,
I agree with you on the problems poised by the current interface. I
hope we can find a solution that fixes the problem. Note that the same
problem must be handled by a ULP using the native verbs.
I still think that there may be a race condition with this patch.
Here's the scenario I'm concerned about:
- Receive an evd upcall
- Disable evd upcall policy
- Wakeup polling thread
- Dequeue all events
- Enable evd upcall policy by:
1. Call dapl_evd_modify_upcall() to enable the evd upcall
2. Obtain the EVD spin lock via spin_lock_irqsave, thus
disabling local interrupts
3. Check that the EVD's ring buffer is empty (there are no DAPL
"software" events)
4. A DTO completion occurs on the EVD's CQ
5. Enable the CQ's upcall via ib_req_notify_cq()
If I understand you correctly, you are asserting that event #4, the
CQ's DTO completion, cannot occur because the local interrupts are
disabled by spin_lock_irqsave(). Have I understood you correctly?
My belief is that the completion will occur on the card regardless of
the interrupt state. Can you provide me with a reference that
guarantees this will not happen?
james
On Thu, 18 Aug 2005, Guy German wrote:
> Hi James,
>
> I will try to explain the reason behind this patch:
>
> In IB, a normal working flow, for a consumer, is:
> - Receive a CQ notification callback
> - Wakeup polling thread
> - Poll for completion (empty the queue)
> - Request completion notification
>
> There is no problem here.
>
> In kdapl, however, the consumer will keep getting upcalls, until he
> sets the upcall policy to disable. So a working flow will be:
> - Receive an evd upcall
> - Disable evd upcall policy
> - Wakeup polling thread
> - Dequeue all evds
> - Enable evd upcall policy
>
> There is a race here: A completion can come after the last dequeu
> and before the Enabling. The provider wont call for the consumer
> (policy is disabled) and the consumer would not dequeu any more
> because he knows the queue is empty.
>
> I think it is a very bad idea, to solve this race by adding another
> evd_dequeue after you enable the upcall policy. If you do that you
> would have a polling thread (because while you dequeue one
> completion you can have many more following) and at the same time
> you will receive upcall from the dapl provider. Beside the fact that
> this is an expensive and unnecessary context switch you have an
> upcall and a thread racing. You will have a situation that the
> upcall has an event at hand and the thread has an event, both not
> handled yet - you will have to queue them again internally or
> something to keep the order. And I think that is only a partial list
> of the problems in this case.
>
> SO
>
> My suggestion is simple, it solves the race, it saves the
> unnecessary context switch and it spares the complexity from the
> consumer side. The solution is to notify the consumer when he tries
> to enable upcall policy, that the queue is actually not empty, and
> force him to continue polling (in the same thread context he is
> now). dat_evd_modify_upcall is guarded by a spin_lock_irqsave, when
> it checks the queue and so the race would not occur.
>
> BTW,
> Im not sure if it is still the case, but I think that one of the
> ulps in openib, did not use a kernel thread for dequeu-ing. This is
> a very bad design, as the upcall can be polling for *long* periods
> of time, in a tasklet/interrupt context.
>
> Thats it
> Sorry for the long mail I hope It was not to blur
>
> Guy.
>
>
>
>
>
> -----Original Message-----
> From: James Lentini [mailto:jlentini at netapp.com]
> Sent: Thu 8/18/2005 10:28 PM
> To: Guy German
> Cc: Openib
> Subject: Re: [openib-general][PATCH][kdapl]: FMR and EVD patch
>
>
>
> Hi Guy,
>
> The one piece of this patch that remains unaccepted is:
>
> Index: ib/dapl_evd.c
> ===================================================================
> --- ib/dapl_evd.c (revision 3136)
> +++ ib/dapl_evd.c (working copy)
> @@ -1028,6 +1028,7 @@
> {
> struct dapl_evd *evd;
> int status = 0;
> + int pending_events;
>
> evd = (struct dapl_evd *)evd_handle;
> dapl_dbg_log (DAPL_DBG_TYPE_API, "%s: (evd=%p, upcall_policy=%d)\n",
> @@ -1035,14 +1036,25 @@
>
> spin_lock_irqsave(&evd->common.lock, evd->common.flags);
> if ((upcall_policy != DAT_UPCALL_TEARDOWN) &&
> - (upcall_policy != DAT_UPCALL_DISABLE) &&
> - (evd->evd_flags & DAT_EVD_DTO_FLAG)) {
> - status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> - if (status) {
> - printk(KERN_ERR "%s: dapls_ib_completion_notify failed "
> - "(status=0x%x)\n",__func__, status);
> + (upcall_policy != DAT_UPCALL_DISABLE)) {
> + pending_events = dapl_rbuf_count(&evd->pending_event_queue);
> + if (pending_events) {
> + dapl_dbg_log(DAPL_DBG_TYPE_WARN,
> + "%s: (evd %p) there are still %d pending "
> + "events in the queue - policy stays disabled\n",
> + __func__, evd_handle, pending_events);
> + status = -EBUSY;
> goto bail;
> }
> + if (evd->evd_flags & DAT_EVD_DTO_FLAG) {
> + status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> + if (status) {
> + printk(KERN_ERR "%s: dapls_ib_completion_notify"
> + " failed (status=0x%x) \n",__func__,
> + status);
> + goto bail;
> + }
> + }
> }
> evd->upcall_policy = upcall_policy;
> evd->upcall = *upcall;
>
> The IB analog to this function, ib_req_notify_cq(), does not require
> that the CQ be empty. The kDAPL specification does not define an empty
> EVD as a requirement for modifying the upcall and previous
> implementations of the API have not made this requirement.
>
More information about the general
mailing list