[openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()
Shirley Ma
xma at us.ibm.com
Wed Nov 15 10:13:15 PST 2006
Roland Dreier <rdreier at cisco.com> wrote on 11/14/2006 03:18:23 PM:
> Shirley> The rotting packet situation consistently happens for
> Shirley> ehca driver. The napi could poll forever with your
> Shirley> original patch. That's the reason I defer the rotting
> Shirley> packet process in next napi poll.
>
> Hmm, I don't see it. In my latest patch, the poll routine does:
>
> repoll:
>         done = 0;
>         empty = 0;
>
>         while (max) {
>                 t = min(IPOIB_NUM_WC, max);
>                 n = ib_poll_cq(priv->cq, t, priv->ibwc);
>
>                 for (i = 0; i < n; ++i) {
>                         if (priv->ibwc[i].wr_id & IPOIB_OP_RECV) {
>                                 ++done;
>                                 --max;
>                                 ipoib_ib_handle_rx_wc(dev, priv->ibwc + i);
>                         } else
>                                 ipoib_ib_handle_tx_wc(dev, priv->ibwc + i);
>                 }
>
>                 if (n != t) {
>                         empty = 1;
>                         break;
>                 }
>         }
>
>         dev->quota -= done;
>         *budget -= done;
>
>         if (empty) {
>                 netif_rx_complete(dev);
>                 if (unlikely(ib_req_notify_cq(priv->cq,
>                                               IB_CQ_NEXT_COMP |
>                                               IB_CQ_REPORT_MISSED_EVENTS)) &&
>                     netif_rx_reschedule(dev, 0))
>                         goto repoll;
>
>                 return 0;
>         }
>
>         return 1;
>
> so every receive completion will count against the limit set by the
> variable max. The only way I could see the driver staying in the poll
> routine for a long time would be if it was only processing send
> completions, but even that doesn't actually seem bad: the driver is
> making progress handling completions.
What I have found in the ehca driver is that n != t doesn't mean the CQ is
empty. If I poll again, there are still some completions left in the CQ.
IB_CQ_REPORT_MISSED_EVENTS reports 1 most of the time, so exiting the NAPI
poll ends up relying on netif_rx_reschedule() returning 0. That might be why
the driver stays in the poll routine for a long time. I will rerun my test
using n != 0 (keep polling until ib_poll_cq() returns nothing) to see whether
it makes any difference here.
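Roughly, the change I have in mind is the following (just a sketch of the
exit condition, not a tested patch; the completion handling stays exactly as
in your code above):

        while (max) {
                t = min(IPOIB_NUM_WC, max);
                n = ib_poll_cq(priv->cq, t, priv->ibwc);

                /* ... handle the n completions exactly as above ... */

                if (n == 0) {
                        /* assume the CQ is drained only when a poll
                         * returns nothing at all, not when n != t */
                        empty = 1;
                        break;
                }
        }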
>
> Shirley> It does help the performance from 1XXMb/s to 7XXMb/s, but
> Shirley> not as expected 3XXXMb/s.
>
> Is that 3xxx Mb/sec the performance you see without the NAPI patch?
Without the NAPI patch, in my test environment ehca gets around 2800 to
3000 Mb/s of throughput.
> Shirley> With the defer rotting packet process patch, I can see
> Shirley> packets out of order problem in TCP layer. Is it
> Shirley> possible there is a race somewhere causing two napi polls
> Shirley> in the same time? mthca seems to use irq auto affinity,
> Shirley> but ehca uses round-robin interrupt.
>
> I don't see how two NAPI polls could run at once, and I would expect
> worse effects from them stepping on each other than just out-of-order
> packets. However, the fact that ehca does round-robin interrupt
> handling might lead to out-of-order packets just because different
> CPUs are all feeding packets into the network stack.
>
> - R.
Normally there should be only one NAPI poll running at a time. And NAPI
processes each packet all the way up to the TCP layer, one packet at a time
(via netif_receive_skb()), so it shouldn't lead to out-of-order packets even
with round-robin interrupt handling (a small sketch of what I mean follows).
I am still investigating this.
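To illustrate the point, within a single poll call each received packet is
pushed all the way up the stack before the next one is touched, something
like this (only a sketch against the current dev->poll interface, not the
real ipoib code; next_completed_rx_skb() is a made-up stand-in for
ib_poll_cq() plus building the skb):

        static int napi_poll_sketch(struct net_device *dev, int *budget)
        {
                int done = 0;
                int max = min(*budget, dev->quota);
                struct sk_buff *skb;

                /* next_completed_rx_skb() is a made-up helper standing in
                 * for ib_poll_cq() + turning the completion into an skb. */
                while (max && (skb = next_completed_rx_skb(dev)) != NULL) {
                        --max;
                        ++done;
                        netif_receive_skb(skb); /* goes up to TCP before the
                                                   next packet is handled */
                }

                dev->quota -= done;
                *budget -= done;

                if (max) {
                        /* ran out of packets, not out of budget */
                        netif_rx_complete(dev);
                        return 0;
                }
                return 1; /* budget exhausted, stay on the poll list */
        }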
Thanks
Shirley