[openib-general][PATCH][kdapl]: FMR and EVD patch
James Lentini
jlentini at netapp.com
Tue Aug 30 07:40:24 PDT 2005
I'm not comfortable with a solution that relies on vendor specific
behavior for such a critical mechanism.
Given that the OpenIB ib_req_notify_cq() verb conforms to the IBTA
spec's semantics
http://www.mail-archive.com/openib-general%40openib.org/msg08935.html
and that the development of three other low-level OpenIB drivers
(Pathscale, IBM, Ammasso) have been announced, I believe relying on
this behavior would be a mistake.
What if dapl_evd_modify_upcall() worked as follows
dapl_evd_modify_upcall
lock evd with spin_lock_irqsave
if CQ upcalls need to be enabled
ib_req_notify_cq
setup the evd upcall
unlock evd with spin_unlock_irqrestore
if ib_peek_cq reports unreaped work completions
call dapl_evd_dto_callback
I realize that the call to dapl_evd_dto_callback() will potentially be
racing with a CQ upcall, but I believe that the logic in
dapl_evd_dto_callback() handles that correctly.
james
On Mon, 29 Aug 2005, Guy German wrote:
> James Lentini wrote:
> > I agree with you on the problems poised by the current interface. I
> > hope we can find a solution that fixes the problem.
> > Note that the same problem must be handled by a ULP using the native verbs.
>
> I don't think we have the same problem in the verbs.
> In the currently Mellanox hw (which is AFAIK the only available hw in openib)
> there is no race at all (because of the proprietary, more considerate,
> completion notification implementation).
>
> - Receive a CQ notification callback
> - Wakeup polling thread
> - Poll for completion (empty the queue)
> - Request completion notification
> [you will get a completion notification even for old completions on the queue]
> - exit thread
>
> In the case of other, more harsh ib compliant future hw implementation
> Request completion Notification extended verb could encapsulate:
> - request CQ notification
> - if cq !empty request CQ notification _again_
> (note that you are not *polling* the cq just checking the queue.
> This is different then draining the evd "one more time")
>
> And the race is solved.
> Indeed, it is not as efficient as sparing the context switch
> (to interrupt and back to thread) altogether.
>
> >I still think that there may be a race condition with this patch.
> >Here's the scenario I'm concerned about:
> > - Receive an evd upcall
> > - Disable evd upcall policy
> > - Wakeup polling thread
> > - Dequeue all events
> > - Enable evd upcall policy by:
> > 1. Call dapl_evd_modify_upcall() to enable the evd upcall
> > 2. Obtain the EVD spin lock via spin_lock_irqsave, thus
> > disabling local interrupts
> > 3. Check that the EVD's ring buffer is empty (there are no DAPL
> > "software" events)
> > 4. A DTO completion occurs on the EVD's CQ
> > 5. Enable the CQ's upcall via ib_req_notify_cq()
> >
> >If I understand you correctly, you are asserting that event #4, the
> >CQ's DTO completion, cannot occur because the local interrupts are
> >disabled by spin_lock_irqsave(). Have I understood you correctly?
>
> Not quite. The *consumers upcall* would not be called, due to the irq disable.
> The race would not occur, OTOH, because the Mellanox hw will initiate a
> completion notification even if the completions in the cq arrived before
> the notification request.
> If you want to be more ib compliant, for future possible implementations,
> you can apply the extended-notify-routine (as mentioned above).
>
> > My belief is that the completion will occur on the card
> > regardless of the interrupt state.
>
> True, but the consumer will be notified only as soon as the irq
> is enabled again
>
> > Can you provide me with a reference that guarantees this
> > will not happen?
>
> Im not saying that it wont ;) but I don't think there will be a race...
>
> Guy
>
More information about the general
mailing list