[ofa-general] Re: [PATCH] IB/mthca: recv poll cq optimization
Michael S. Tsirkin
mst at mellanox.co.il
Sun Mar 4 12:10:10 PST 2007
> Quoting Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [PATCH] IB/mthca: recv poll cq optimization
>
> > All good recv work requests generate HW completions in FIFO order, so we can use
> > rq->tail rather than hardware data. In this way, we save a branch on data path
> > for recv completions (branch is still there for send completions).
> >
> > Roland, what do you think? This increases the overall code size but I think the
> > extra code is on the error CQE handling path. BTW, since most kernel QPs seem
> > not to use selective signaling, it might be worth it to optimize send
> > completions in a similiar way in case selective singaling is disabled on QP.
>
> Do you have any measurements that say this helps? Having a bigger
> I-cache footprint is really globally worse for the system, so I don't
> like this part:
>
> > + if (unlikely(is_error)) {
> > + if (!is_send && !(*cur_qp)->ibqp.srq) {
> > + s32 wqe = be32_to_cpu(cqe->wqe);
> > + wqe_index = wqe >> wq->wqe_shift;
> > + /*
> > + * WQE addr == base - 1 might be reported in receive completion
> > + * with error instead of (rq size - 1) by Sinai FW 1.0.800 and
> > + * Arbel FW 5.1.400. This bug should be fixed in later FW revs.
> > + */
> > + if (unlikely(wqe_index < 0))
> > + wqe_index = wq->max - 1;
> > + }
> >
> > - if (is_error) {
>
> is there any way to move that into handle_error_cqe() so that it's
> definitely out of the way for the normal path?
I'll look into that.
> In fact, why do we
> need this code at all with your change -- aren't RQ error completions
> reported in FIFO order too?
Good point.
> I'm not sure that it's worth testing whether a SQ has selective
> signaling or not -- after all, that's just changing one conditional
> branch for another. And in fact, looking at the code, I think we
> could rewrite
>
> if (wq->last_comp < wqe_index)
> wq->tail += wqe_index - wq->last_comp;
> else
> wq->tail += wqe_index + wq->max - wq->last_comp;
>
> as just
>
> wq->tail += (wqe_index + wq->max - wq->last_comp) & (wq->max - 1);
>
> and avoid the conditional in a simpler way. (I've not checked that
> match but it looks right to me)
This won't work for Tavor where wq.max is not a power of 2.
--
MST
More information about the general
mailing list