[ofa-general] Re: [PATCH for-2.6.21] mthca: QP reset race fixup

Wed Mar 21 01:12:33 PDT 2007

> Quoting Michael S. Tsirkin <mst at dev.mellanox.co.il>:
> Subject: Re: [PATCH for-2.6.21] mthca: QP reset race fixup
> 
> > Quoting Michael S. Tsirkin <mst at dev.mellanox.co.il>:
> > Subject: Re: [PATCH for-2.6.21] mthca: QP reset race fixup
> > 
> > > Quoting Roland Dreier <rdreier at cisco.com>:
> > > Subject: Re: [PATCH for-2.6.21] mthca: QP reset race fixup
> > > 
> > >  > This fixes openfabrics bugzilla 394:
> > >  > - Use common EQ for command interface and async events
> > >  > - Clean CQ after moving QP to reset
> > > 
> > > This is a little terse -- an ideal changelog entry would explain what
> > > the bug is, what is being changed to fix it, and why that fixes the issue.
> > > I'll try to fix it up...
> > > 
> > >  > This also fixes a potential crash in ipoib cm:
> > >  > - sync with completion event ISR after QP is reset
> > >  >   to prevent ULP from getting and using QP pointer and/or WRID
> > >  >   after they are freed.
> > > 
> > >  > We can rip the lines that sync with MTHCA_EQ_COMP out if you think
> > >  > the issue needs to be dealt with in some other way - and I agree
> > >  > this is only good for ULPs that do all their polling
> > >  > inside the ISR, but at least this covers all in-kernel code.
> > > 
> > > I don't really like this change, although maybe it's the right thing
> > > to do.  But can you explain what IPoIB CM is doing that would cause it
> > > to run into trouble?  I'd like to see if there's a better solution.
> > > It just seems strange to me to add the assumption that destroying a QP
> > > makes sure that all running CQ callbacks are done.
> > 
> > Look at ipoib_cm_stale_task:
> > +               ib_destroy_cm_id(p->id);
> > +               ib_destroy_qp(p->qp);
> > 
> > and then
> > +       if (!likely(wr_id & IPOIB_CM_RX_UPDATE_MASK)) {
> > +               p = wc->qp->qp_context;
> > 
> > This wc->qp->qp_context might use QP after free.
> > 
> > > 
> > > If we change to NAPI (so that CQs are polled asynchronously) does that
> > > readd the same bug?
> > 
> > Hmm. Yes.
> > In hindsight, it was probably better to put qp_context directly in ib_wc
> > instead of the qp pointer.
> > 
> > Then ipoib could set some flag in the structure pointed to by qp_context.
> > 
> > My guess this would be too big a change for 2.6.21. What do you think?
> 
> To clarify: syncing with the completion IRQ will be needed anyway in
> for non-NAPI mode.
> 
> For NAPI we will be able to sync with NAPI after we set the flag
> 	in qp_context.

To sync with NAPI, we can take the poll_lock.
I imagine something like this before destroying QP

	spin_lock(&dev->npinfo->poll_lock);
	p->dead = 1;
	spin_unlock(&dev->npinfo->poll_lock);

and after poll cq:
	if (!likely(wr_id & IPOIB_CM_RX_UPDATE_MASK)) {
		p = wc->qp_context;
		if (!p->dead &&
		    time_after_eq(jiffies, p->jiffies + IPOIB_CM_RX_UPDATE_TIME)) {
		}
	}

-- 
MST