[openib-general] Completion callback /teardown race

Rimmer, Todd trimmer at silverstorm.com
Tue Sep 19 14:47:19 PDT 2006


> From: Roland Dreier
> Sent: Tuesday, September 19, 2006 5:17 PM
> To: Eric Barton
> Cc: openib-general at openib.org
> Subject: Re: [openib-general] Completion callback /teardown race
> 
> 
> I'll have more to say on this in the context of IPoIB and NAPI
> shortly, since I've been thinking about this issue myself.
> 
> The ipath driver implements only the weaker semantics guaranteed by
> the IBA spec -- ie an event is generated if a completion is added
> after the request for notification.  And I don't know what ehca and
> amso1100 implement to be honest.
> 
> (The Mellanox semantics are conforming though, since it's not
> well-defined exactly when a completion is added to a CQ if no one
> looks...)

An approach we implemented a few years ago in our proprietary stack was
a new verb (in addition to poll_cq and notify_req): poll_and_notify (we
called it iba_poll_and_rearm).

This verb always did a poll_cq, but if the CQ was drained it then did a
rearm of the CQ.  The return value from the call indicated what the next
step for the caller should be:
- SUCCESS - call poll_and_notify again (CQE returned)
- COMPLETED - nothing to do after this CQE (CQE returned, rearmed, no
need to poll anymore)
- POLL_NEEDED - loop on poll (CQE returned, rearmed, need to poll_cq
until empty)
- NOT_DONE - nothing more to do, no CQE (no CQE returned, rearmed, CQ
still empty, no need to poll anymore)
- error (invalid call, etc)

The callback would loop on poll_and_notify as long as SUCCESS was
returned.  After that, if POLL_NEEDED had been returned, it would loop
on poll_cq until the CQ was empty.
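
As a rough sketch of that callback loop (this is not the code from our
stack: the enum, the mock CQ that merely counts entries, and all of the
function names below are made up for illustration), assuming a verb with
the return values listed above:

```c
#include <assert.h>
#include <stddef.h>

/* Return codes named in the message; the enum and its values are assumed. */
enum par_status {
    PAR_SUCCESS,      /* CQE returned, CQ not yet drained: call again */
    PAR_COMPLETED,    /* CQE returned, CQ rearmed, nothing left to poll */
    PAR_POLL_NEEDED,  /* CQE returned, CQ rearmed, caller must poll until empty */
    PAR_NOT_DONE,     /* no CQE returned, CQ rearmed and still empty */
    PAR_ERROR         /* invalid call */
};

/* Hypothetical stand-in for a CQ: counts entries instead of storing them. */
struct mock_cq {
    int pending;  /* CQEs currently queued */
    int late;     /* CQEs that slip in around the rearm without raising an event */
    int armed;    /* nonzero once notification has been requested */
};

/* Mock poll_cq: returns 1 and consumes one CQE, or 0 if the CQ is empty. */
static int mock_poll_cq(struct mock_cq *cq)
{
    assert(cq != NULL);
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/* Mock poll_and_rearm: polls once and rearms only when the CQ is drained. */
static enum par_status mock_poll_and_rearm(struct mock_cq *cq)
{
    int got;

    if (cq == NULL)
        return PAR_ERROR;
    got = mock_poll_cq(cq);
    if (cq->pending > 0)
        return PAR_SUCCESS;   /* more queued: no rearm yet */
    cq->armed = 1;            /* CQ drained: rearm */
    if (!got)
        return PAR_NOT_DONE;
    if (cq->late > 0) {       /* weaker-semantics HCA: entries raced the rearm */
        cq->pending = cq->late;
        cq->late = 0;
        return PAR_POLL_NEEDED;
    }
    return PAR_COMPLETED;
}

/* Callback loop as described above: returns CQEs handled, or -1 on error. */
static int cq_callback(struct mock_cq *cq)
{
    enum par_status st;
    int handled = 0;

    do {
        st = mock_poll_and_rearm(cq);
        if (st == PAR_ERROR)
            return -1;
        if (st != PAR_NOT_DONE)
            handled++;        /* every other status carries one CQE */
    } while (st == PAR_SUCCESS);

    if (st == PAR_POLL_NEEDED)        /* only on HCAs with weaker semantics */
        while (mock_poll_cq(cq))
            handled++;
    return handled;
}
```

In this mock, a CQ with late == 0 stands in for an HCA with the Mellanox
semantics, so the trailing poll_cq loop never runs; a nonzero late models
an HCA that only guarantees the weaker IBA behaviour.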

This approach provided 2 advantages:
1. For performance, an extra 1-2 calls into the HCA driver per callback
were avoided.  The win here was saving some spin locks (with high CQE
rate consumers like IPoIB this was noticeable).
2. On HCAs such as Mellanox, POLL_NEEDED was never returned and the
caller never did unnecessary polls.  However, the caller and API were
also able to handle HCAs which did not have the Mellanox semantics.

Todd Rimmer


     



