[openib-general] Completion callback /teardown race

Rimmer, Todd trimmer at silverstorm.com
Thu Sep 21 08:10:09 PDT 2006


> From: Michael S. Tsirkin [mailto:mst at mellanox.co.il]
> Sent: Wednesday, September 20, 2006 1:14 AM
> To: Tillier, Fabian
> Cc: Rimmer, Todd; openib-general at openib.org
> Subject: Re: Completion callback /teardown race
> 
> Quoting r. Fabian Tillier <ftillier at silverstorm.com>:
> > > There are some differences in HCA behaviour with regard to
> > > ib_req_notify_cq.  Mellanox HCAs will provide a callback/interrupt if
> > > the CQ is not empty at this point (in which case the poll_cq's after
> > > the notify are optional).
> > >
> > > However the behaviour defined in the IBTA spec indicates that
> > > ib_req_notify_cq will cause a callback/interrupt only on the next CQE
> > > which arrives, hence to be portable the poll_cq loop after
> > > ib_req_notify_cq is necessary to cover any CQEs which arrived between
> > > the prior poll and the ib_req_notify_cq.
> >
> > I remember a while ago a mention that the behavior of the Mellanox
> > HCAs could be controlled in the firmware, so that they would follow
> > the IBTA spec defined behavior.
> 
> There's a mistake here. Mellanox HCAs will generate an event upon
> ib_req_notify_cq only if new completions have arrived after the previous
> event has been reported.
> 
> AFAIK this is IBTA spec compliant.

I agree the Mellanox HCA is spec compliant.

The difference between HCAs is how they handle the situation:

CQE arrives
HCA generates event/callback
poll CQ, remove CQE
poll CQ, detect CQ is empty
CQE arrives
ib_req_notify_cq

At this point a Mellanox HCA will generate an event (as Michael
indicates, an unprocessed CQE has arrived since the previous event).

Many other HCAs will not generate an event in this situation; instead
they generate one only when a CQE arrives after the ib_req_notify_cq.

Hence to support other HCAs, ULPs should poll the CQ after the
ib_req_notify_cq.

On any HCA model, ULPs should be prepared for a callback where the CQ is
empty.  There are situations in either approach which can introduce an
extra callback after the CQ has been emptied.
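
To make the sequence above concrete, here is a small self-contained
simulation in C.  It is only a sketch: struct sim_cq, the two hca_model
values and every sim_* function are hypothetical names invented for
illustration, not the real verbs API, and the two models are a
simplification of the behaviours described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified models of the two behaviours discussed. */
enum hca_model {
    EVENT_IF_UNREPORTED_CQE, /* arming fires if a CQE arrived since the last event */
    EVENT_ON_NEXT_CQE_ONLY   /* arming fires only for a CQE that arrives later */
};

struct sim_cq {
    enum hca_model model;
    int  pending;         /* CQEs currently sitting in the CQ */
    bool armed;           /* notification has been requested */
    bool new_since_event; /* a CQE arrived after the last reported event */
    int  events;          /* number of events/callbacks generated */
};

struct outcome {
    int processed; /* CQEs removed by polling in the first callback */
    int events;    /* total events generated */
    int pending;   /* CQEs left in the CQ at the end */
};

static void sim_fire_event(struct sim_cq *cq)
{
    cq->events++;
    cq->armed = false;
    cq->new_since_event = false;
}

static void sim_cqe_arrives(struct sim_cq *cq)
{
    cq->pending++;
    cq->new_since_event = true;
    if (cq->armed)
        sim_fire_event(cq);
}

/* Stand-in for ib_req_notify_cq in this model. */
static void sim_req_notify(struct sim_cq *cq)
{
    cq->armed = true;
    if (cq->model == EVENT_IF_UNREPORTED_CQE && cq->new_since_event)
        sim_fire_event(cq);
}

/* Stand-in for a poll loop: remove CQEs until the CQ is empty. */
static int sim_drain(struct sim_cq *cq)
{
    int n = 0;

    assert(cq->pending >= 0);
    while (cq->pending > 0) {
        cq->pending--;
        n++;
    }
    return n;
}

/* Replays the sequence from this mail for one model. */
struct outcome run_scenario(enum hca_model model, bool repoll_after_notify)
{
    struct sim_cq cq = { .model = model, .armed = true };
    struct outcome out = { 0, 0, 0 };

    sim_cqe_arrives(&cq);               /* CQE arrives, event generated */
    out.processed += sim_drain(&cq);    /* poll until the CQ is empty */
    sim_cqe_arrives(&cq);               /* CQE arrives before re-arming */
    sim_req_notify(&cq);                /* request notification */
    if (repoll_after_notify)
        out.processed += sim_drain(&cq); /* portable: poll again */

    out.events = cq.events;
    out.pending = cq.pending;
    return out;
}
```

In this model, run_scenario(EVENT_ON_NEXT_CQE_ONLY, false) ends with one
CQE pending and only the original event, so nothing prompts the ULP to
process it until some later CQE arrives.  With the re-poll enabled both
models end with an empty CQ, and the EVENT_IF_UNREPORTED_CQE model
reports a second event whose callback will find the CQ already empty.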

Todd Rimmer



