<html><body>

<p><tt>Roland Dreier <rdreier@cisco.com> wrote on 06/24/2008 08:11:16 PM:<br>

<br>

> I don't understand the problem this patch is solving (insufficient<br>

> detail in the changelog), but I don't see how it can be correct:<br>

> <br>

>  >       struct list_head  rx_flush_list; /* state: FLUSH, drain notstarted */<br>

>  > -     struct list_head  rx_drain_list; /* state: FLUSH, drain started */<br>

>  >       struct list_head  rx_reap_list;  /* state: FLUSH, drain done */<br>

> <br>

> It seems to me we need all three of these states to keep track of QPs<br>

> properly: first one means "last WQE reached", second one means "send<br>

> posted after last WQE reached" and last on means "completion seen for<br>

> send posted after last WQE reached".<br>

> <br>

>  - R.<br>

</tt><br>

<tt>The issue is if rx_drain_list has only one QP context on that list, then it's not necessary to have a rx_drain_list; if rx_drain_list has mulitple QP contexts on that list, then move all elements from that list to reap list from a particular QP drain WR cqe is wrong.</tt><br>

<br>

<tt>if (unlikely(wr_id >= ipoib_recvq_size)) {</tt><br>

<tt>                if (wr_id == (IPOIB_CM_RX_DRAIN_WRID & ~(IPOIB_OP_CM | IPOIB_OP_RECV))) {</tt><br>

<tt>                        spin_lock_irqsave(&priv->lock, flags);</tt><br>

<tt>                        list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);</tt><br>

<tt>                        ipoib_cm_start_rx_drain(priv);</tt><br>

<tt>                        queue_work(ipoib_workqueue, &priv->cm.rx_reap_task);</tt><br>

<tt>                        spin_unlock_irqrestore(&priv->lock, flags);</tt><br>

<br>

<tt>Also there is a possible race between timeout reap and last wqe reached reap, so check the status of that QP context is necessary. </tt><br>

<br>

<tt>Hopefully it's clear.</tt><br>

<br>

<tt>Thanks</tt><br>

<tt>Shirley</tt><br>

</body></html>