[openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()
Pradeep Satyanarayana
pradeep at us.ibm.com
Thu Nov 16 07:15:23 PST 2006
openib-general-bounces at openib.org wrote on 11/14/2006 03:18:23 PM:
> Shirley> The rotting packet situation consistently happens for
> Shirley> ehca driver. The napi could poll forever with your
> Shirley> original patch. That's the reason I defer the rotting
> Shirley> packet process in next napi poll.
>
> Hmm, I don't see it. In my latest patch, the poll routine does:
>
> repoll:
>         done = 0;
>         empty = 0;
>
>         while (max) {
>                 t = min(IPOIB_NUM_WC, max);
>                 n = ib_poll_cq(priv->cq, t, priv->ibwc);
>
>                 for (i = 0; i < n; ++i) {
>                         if (priv->ibwc[i].wr_id & IPOIB_OP_RECV) {
>                                 ++done;
>                                 --max;
>                                 ipoib_ib_handle_rx_wc(dev, priv->ibwc + i);
>                         } else
>                                 ipoib_ib_handle_tx_wc(dev, priv->ibwc + i);
>                 }
>
>                 if (n != t) {
>                         empty = 1;
>                         break;
>                 }
>         }
>
>         dev->quota -= done;
>         *budget    -= done;
>
>         if (empty) {
>                 netif_rx_complete(dev);
>                 if (unlikely(ib_req_notify_cq(priv->cq,
>                                               IB_CQ_NEXT_COMP |
>                                               IB_CQ_REPORT_MISSED_EVENTS)) &&
>                     netif_rx_reschedule(dev, 0))
>                         goto repoll;
>
>                 return 0;
>         }
>
>         return 1;
>
> so every receive completion will count against the limit set by the
> variable max. The only way I could see the driver staying in the poll
> routine for a long time would be if it was only processing send
> completions, but even that doesn't actually seem bad: the driver is
> making progress handling completions.
>
Is it possible that when one gets into the "rotting packet" case, the quota
is at or close to 0 (on ehca)? If in that case it is 0 and the
netif_rx_reschedule() path wins (over netif_rx_schedule()), then the driver
keeps spinning, unable to process any packets, since the undo parameter
passed to netif_rx_reschedule() is 0. If netif_rx_reschedule() keeps winning
for a few iterations, the receive queues fill up and start dropping packets,
causing a loss in performance.

If this is indeed the case, then one option to try may be to change the undo
parameter of netif_rx_reschedule() to either IB_WC or even dev->weight.
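
To make that concrete, here is a minimal sketch of the change against the
poll routine quoted above. It assumes the 2.6-era
netif_rx_reschedule(dev, undo) API, which credits undo back to dev->quota
before rescheduling, and it uses IPOIB_NUM_WC on the assumption that this is
the constant meant by "IB_WC" above; dev->weight would be the other choice:

        if (empty) {
                netif_rx_complete(dev);
                /* Sketch only: re-arm the CQ and, if completions may have
                 * been missed, reschedule with a non-zero undo so the next
                 * poll has quota to work with instead of spinning at 0. */
                if (unlikely(ib_req_notify_cq(priv->cq,
                                              IB_CQ_NEXT_COMP |
                                              IB_CQ_REPORT_MISSED_EVENTS)) &&
                    netif_rx_reschedule(dev, IPOIB_NUM_WC)) /* was 0 */
                        goto repoll;

                return 0;
        }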
> Shirley> It does help the performance from 1XXMb/s to 7XXMb/s, but
> Shirley> not as expected 3XXXMb/s.
>
> Is that 3xxx Mb/sec the performance you see without the NAPI patch?
>
> Shirley> With the defer rotting packet process patch, I can see
> Shirley> packets out of order problem in TCP layer. Is it
> Shirley> possible there is a race somewhere causing two napi polls
> Shirley> in the same time? mthca seems to use irq auto affinity,
> Shirley> but ehca uses round-robin interrupt.
>
> I don't see how two NAPI polls could run at once, and I would expect
> worse effects from them stepping on each other than just out-of-order
> packets. However, the fact that ehca does round-robin interrupt
> handling might lead to out-of-order packets just because different
> CPUs are all feeding packets into the network stack.
>
> - R.
>