[openib-general] [PATCH] IB/ipoib: NAPI

Mon Sep 25 10:58:54 PDT 2006

Quoting r. Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [PATCH] IB/ipoib: NAPI
> 
>     Michael> But this has a disadvantage over the device-wide flag:
>     Michael> when flag is device-wide, we can just have 2 polling
>     Michael> routines - with and without peek - and select the correct
>     Michael> one at device open depending on the hardware
>     Michael> capabilities.  Thus we can avoid a conditional branch on
>     Michael> the fast path, which I think is nice.
> 
> Yeah, but I can't make up my mind whether two polling routines is a
> good thing or a bad thing.  We get a very specific optimization, but
> we have two copies of the same code then.

Well, with a flag the ULP can decide what it wants to do,
we are not forcing anything here.
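
To make the device-wide variant concrete, here is a rough userspace
sketch of picking the polling routine once at open time. All names
(demo_dev, demo_open, demo_poll_*) are made up for illustration and the
CQ is just a counter; this is not code from the patch or from the IB core.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the IB core. */
struct demo_dev {
	bool lost_event_possible;          /* device-wide capability flag */
	int pending;                       /* completions waiting in the mock CQ */
	int (*poll)(struct demo_dev *dev); /* routine chosen once at open */
};

/* Drain the mock CQ and return how many completions were handled. */
static int demo_drain(struct demo_dev *dev)
{
	int n = dev->pending;

	dev->pending = 0;
	return n;
}

/* Variant for hardware that cannot lose events: poll, no peek. */
static int demo_poll_no_peek(struct demo_dev *dev)
{
	return demo_drain(dev);
}

/* Variant for hardware that can lose events: poll, then peek once more
 * (a real driver would re-arm the CQ between the two steps). */
static int demo_poll_with_peek(struct demo_dev *dev)
{
	int n = demo_drain(dev);

	n += demo_drain(dev); /* peek: pick up anything that raced in */
	return n;
}

/* The capability test happens here, once, not on every poll. */
static void demo_open(struct demo_dev *dev)
{
	dev->poll = dev->lost_event_possible ? demo_poll_with_peek
					     : demo_poll_no_peek;
}
```

After demo_open(), the fast path is just dev->poll(dev), with no test of
the flag per call. The price is the one Roland mentions: two nearly
identical routines to maintain.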

>     Michael> On a separate note - ib_req_notify_cq is also testing the
>     Michael> lost_event_possible flag - so now we have 2 conditional
>     Michael> branches on fast path, and this hurts all ULPs. Ugh.
> 
> I suspect that the cost here is minimal -- lost_event_possible is
> going to be in a register, etc.

Hmm, since we are passing it by pointer to a function
called through a pointer, I don't see how gcc can
move it out of memory into a register. Am I wrong?
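
Roughly what I have in mind, as a userspace sketch. The names and the
exact signature here are assumptions for illustration, not the interface
from the patch:

```c
struct demo_cq;

/* Hypothetical driver hook: reports through an out-parameter whether
 * an event may have been lost while arming the CQ. */
typedef int (*demo_notify_fn)(struct demo_cq *cq, int *lost_event_possible);

struct demo_cq {
	demo_notify_fn req_notify; /* filled in by the low-level driver */
	int armed;
};

/* Mock driver for hardware that can lose events. */
static int demo_hw_notify_lossy(struct demo_cq *cq, int *lost_event_possible)
{
	cq->armed = 1;
	*lost_event_possible = 1; /* callee stores through the pointer */
	return 0;
}

/* Mock driver for hardware that cannot lose events. */
static int demo_hw_notify_safe(struct demo_cq *cq, int *lost_event_possible)
{
	cq->armed = 1;
	*lost_event_possible = 0;
	return 0;
}

/* Caller side: returns 1 if a peek is needed, 0 if not, negative on error. */
static int demo_rearm(struct demo_cq *cq)
{
	int lost = 0;
	int ret;

	/* Indirect call: the callee is not known at compile time, and
	 * &lost escapes into it, so 'lost' has to live in memory across
	 * the call and be read back afterwards. */
	ret = cq->req_notify(cq, &lost);
	if (ret)
		return ret;

	return lost ? 1 : 0; /* this is the extra conditional branch */
}
```

Whether the load after the call costs anything measurable is a separate
question; the sketch only shows why the value cannot simply stay in a
register across the call.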

>     Michael> If we extend the interface, I would rather make a new
>     Michael> call ib_req_notify_and_peek_cq(struct ib_cq *cq, enum
>     Michael> ib_cq_notify cq_notify) that returns 0 on empty CQ, 1 on
>     Michael> non-empty and negative on error.
> 
> And again, I don't want to make the interface too fat...

Well, lots of flags that you are required to implement
amount to the same thing from a low-level driver developer's
perspective, isn't that right?
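
For reference, the return convention I proposed for a combined call
would look something like this from the ULP side. This is a mock with
invented names (demo_cq, demo_req_notify_and_peek_cq, demo_poll_loop) and
a counter instead of a real CQ, just to show the 0 / 1 / negative
contract:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical mock CQ; not struct ib_cq. */
struct demo_cq {
	int armed;   /* notification requested */
	int pending; /* completions currently queued */
};

/* Arm the CQ and peek in one call:
 * 0 if the CQ is empty, 1 if it is non-empty, negative errno on error. */
static int demo_req_notify_and_peek_cq(struct demo_cq *cq)
{
	if (cq == NULL)
		return -EINVAL;

	cq->armed = 1;
	return cq->pending > 0 ? 1 : 0;
}

/* ULP-side loop: returns the number of completions handled,
 * or a negative errno if arming failed. */
static int demo_poll_loop(struct demo_cq *cq)
{
	int handled = 0;
	int ret;

	for (;;) {
		/* Drain whatever is queued right now. */
		handled += cq->pending;
		cq->pending = 0;

		ret = demo_req_notify_and_peek_cq(cq);
		if (ret < 0)
			return ret;     /* error from the driver */
		if (ret == 0)
			return handled; /* empty and armed: done */
		/* ret == 1: something arrived before arming, poll again */
	}
}
```

The caller does not need to look at any capability flag; hardware that
cannot lose events would simply always return 0 here.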

> There are a few tradeoffs here: microoptimization
> vs. maintainability, IPoIB & NAPI vs. all other ULPs...

I just find a flag + conditional peek a much simpler approach.
Since all our testing is done on mthca anyway, almost
all approaches amount to doing a NOP in various ways for us.

So I would suggest:
- get Eli's patch with a simple flag into shape & working on all hardware,
  push into git.
- people interested in specific hardware can then test performance and
  propose patches to improve it even further.

Does this sound good?

-- 
MST