[ofa-general] Re: [RFC][PATCH] last wqe event handler patch

Tue Jun 24 22:53:31 PDT 2008

Roland Dreier <rdreier at cisco.com> wrote on 06/24/2008 08:11:16 PM:

> I don't understand the problem this patch is solving (insufficient
> detail in the changelog), but I don't see how it can be correct:
>
>  >       struct list_head  rx_flush_list; /* state: FLUSH, drain not started */
>  > -     struct list_head  rx_drain_list; /* state: FLUSH, drain started */
>  >       struct list_head  rx_reap_list;  /* state: FLUSH, drain done */
>
> It seems to me we need all three of these states to keep track of QPs
> properly: first one means "last WQE reached", second one means "send
> posted after last WQE reached" and last one means "completion seen for
> send posted after last WQE reached".
>
>  - R.

The issue is this: if rx_drain_list only ever holds one QP context, then
the list is unnecessary; if it holds multiple QP contexts, then moving
every element from it to the reap list when the drain WR CQE arrives for
one particular QP is wrong.

        if (unlikely(wr_id >= ipoib_recvq_size)) {
                if (wr_id == (IPOIB_CM_RX_DRAIN_WRID & ~(IPOIB_OP_CM | IPOIB_OP_RECV))) {
                        spin_lock_irqsave(&priv->lock, flags);
                        list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
                        ipoib_cm_start_rx_drain(priv);
                        queue_work(ipoib_workqueue, &priv->cm.rx_reap_task);
                        spin_unlock_irqrestore(&priv->lock, flags);
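
To make the concern concrete, here is a rough userspace sketch, not the
driver code. It assumes, as argued above, that a drain completion only
vouches for the QP it was posted on. struct node, struct rx_ctx,
reap_all() and reap_one() are hypothetical names invented for this
illustration; struct node only imitates the kernel's struct list_head.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct node { struct node *prev, *next; };

static void list_init(struct node *h) { h->prev = h->next = h; }

static int list_is_empty(const struct node *h) { return h->next == h; }

static void node_add_tail(struct node *n, struct node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void node_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static int list_count(const struct node *h)
{
    int n = 0;
    for (const struct node *p = h->next; p != h; p = p->next)
        n++;
    return n;
}

/* Hypothetical per-QP RX context. */
struct rx_ctx { struct node link; int qpn; };

/* Whole-list move: every context on the drain list goes to the reap
 * list, whichever QP's drain WR actually completed. */
static void reap_all(struct node *drain, struct node *reap)
{
    while (!list_is_empty(drain)) {
        struct node *n = drain->next;
        node_unlink(n);
        node_add_tail(n, reap);
    }
}

/* Per-QP move: only the context whose drain WR completed is moved. */
static void reap_one(struct rx_ctx *p, struct node *reap)
{
    assert(p != NULL);
    node_unlink(&p->link);
    node_add_tail(&p->link, reap);
}
```

With two contexts on the drain list, reap_all() puts both on the reap
list after a single completion, while reap_one() leaves the other
context on the drain list until its own completion is seen.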

There is also a possible race between the timeout reap and the last WQE
reached reap, so it is necessary to check the state of that QP context.
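
The kind of state check meant here could look roughly like the sketch
below. It is only an illustration: enum rx_state, struct rx_conn and
rx_begin_flush() are made-up names, and the caller is assumed to hold
whatever lock protects the state (priv->lock in the excerpt above).

```c
#include <stdbool.h>

/* Hypothetical states for one RX QP context. */
enum rx_state { RX_LIVE, RX_FLUSH };

struct rx_conn { enum rx_state state; };

/*
 * Start flushing a context exactly once. Both the timeout path and the
 * last WQE reached path would call this; whichever runs second sees a
 * state other than RX_LIVE and backs off.
 * Assumption: the caller holds the lock protecting c->state.
 */
static bool rx_begin_flush(struct rx_conn *c)
{
    if (c->state != RX_LIVE)
        return false;      /* the other path already claimed it */
    c->state = RX_FLUSH;
    return true;
}
```

Only the path that gets true back would go on to queue the context for
draining and reaping, so the same QP context is never queued twice.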

Hopefully it's clear.

Thanks
Shirley