[ofa-general] [Bug 508] IPoIB CM multicast is hogging interrupts

Mon Apr 30 12:33:42 PDT 2007

"Michael S. Tsirkin" <mst at dev.mellanox.co.il> wrote on 04/30/2007 11:51:18 
AM:

> Quoting Pradeep Satyanarayana <pradeep at us.ibm.com>:
> Subject: Re: [ofa-general] [Bug 508] IPoIB CM multicast is hogging interrupts
> 
> > I think this trick I just came up with is a simple way to prevent
> > IPoIB TX from hogging interrupts, even without NAPI. And it might be a
> > better way to solve the problem for IPoIB CM TX than using a common cq
> > as my previous patch did.
> > >  static void ipoib_cm_tx_completion(struct ib_cq *cq, void *tx_ptr)
> > >  {
> > >     struct ipoib_cm_tx *tx = tx_ptr;
> > > -   int n, i;
> > > +   int n, i, cnt = 0;
> > > 
> > >     ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
> > >     do {
> > >        n = ib_poll_cq(cq, IPOIB_NUM_WC, tx->ibwc);
> > > +      cnt += n;
> > >        for (i = 0; i < n; ++i)
> > >           ipoib_cm_handle_tx_wc(tx->dev, tx, tx->ibwc + i);
> > > -   } while (n == IPOIB_NUM_WC);
> > > +   } while (n == IPOIB_NUM_WC && cnt < ipoib_sendq_size);
> > >  }
> > 
> > This change might exit tx_completion sooner; how does that prevent
> > hogging interrupts (without NAPI)? I am not clear about that.
> 
> By returning from interrupt handler after a finite number of
> completions, rather than polling CQ potentially indefinitely.
> 

Ok, this was intended to mean hogging the CPU while in interrupt context. This
will surely help that case. In fact, systems that support round-robin
distribution of interrupts across CPUs will likely see a good distribution of
CPU load.
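
To make the bounded loop concrete, here is a minimal user-space sketch of the
same control flow. It is not the IPoIB code: struct fake_cq, fake_poll_cq(),
drain_bounded() and NUM_WC are hypothetical stand-ins for the CQ, ib_poll_cq(),
the completion handler and IPOIB_NUM_WC, and the budget argument plays the role
of ipoib_sendq_size in the patch.

```c
#include <stddef.h>

#define NUM_WC 4    /* hypothetical batch size, stands in for IPOIB_NUM_WC */

/* Hypothetical stand-in for a completion queue: only counts pending entries. */
struct fake_cq {
    int pending;
};

/* Stand-in for ib_poll_cq(): removes up to max entries, returns how many. */
static int fake_poll_cq(struct fake_cq *cq, int max)
{
    int n = cq->pending < max ? cq->pending : max;

    cq->pending -= n;
    return n;
}

/*
 * Bounded drain: keep polling while full batches come back, but stop once
 * at least budget completions have been handled in this invocation.
 * Returns the number of completions handled.
 */
static int drain_bounded(struct fake_cq *cq, int budget)
{
    int n, cnt = 0;

    do {
        n = fake_poll_cq(cq, NUM_WC);
        cnt += n;
        /* a real handler would process the n completions here */
    } while (n == NUM_WC && cnt < budget);

    return cnt;
}
```

With 100 entries pending and a budget of 16, one call handles 16 entries and
returns, leaving 84 behind. Because the check happens once per batch, a budget
that is not a multiple of NUM_WC can be overshot by up to NUM_WC - 1 entries.
How the leftover entries get picked up later is outside this sketch.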

Additionally, this might also solve the soft lockup problem that was discussed
here previously (except for the issue with the lo device).

Pradeep
pradeep at us.ibm.com