<html><body>

<p><tt>Roland Dreier <rdreier@cisco.com> wrote on 06/25/2008 06:44:21 PM:<br>

<br>

>  > Corrects a race condition in ipoib_cm_post_receive_nonsrq()<br>

>  > which allows wqes from one QP context to be post_recv<br>

>  > to another QP context.  The ipoib_cm_post_receive_nonsrq()<br>

>  > saves the wr_id in the shared structure ipoib_cm_dev_priv <br>

>  > making it possible for the saved wr_id to be overwritten by<br>

>  > a subsequent event and posting to the incorrect qp context.<br>

>  > The patch switches to a local variable to save the wr_id.<br>

> <br>

> What subsequent event?  Is this due to connection requests coming in and<br>

> colliding with each other?  Or a connection request colliding with<br>

> posting receives from the completion processing context?<br>

> <br>

>  > Signed-off-by: Pradeep Satyanarayana <pradeep@us.ibm.com><br>

> <br>

> I still don't understand what Pradeep's sign-off is doing here... as<br>

> this email stands, basically what it is saying is that David wrote the<br>

> patch, then Pradeep sent it to David, and David is sending it to me.<br>

> Which is nonsensical.  Do you just mean that Pradeep was also involved<br>

> in writing the patch?<br>

</tt><br>

<tt>Actually, the debugging was done by me: wr_id was getting corrupted for each QP in non-SRQ IPoIB-CM. Nam pointed out the code problem in OFED-1.3, Pradeep wrote the patch for OFED, and Dave wrote the patch for the mainline kernel. So I suggested that they be listed as co-authors. Is that OK?</tt><br>

<tt><br>

>  > -   struct ib_recv_wr *bad_wr;<br>

>  > +   struct ib_recv_wr *bad_wr, rx_wr;<br>

>  > +   struct ib_sge      rx_sge[IPOIB_CM_RX_SG];<br>

> <br>

> I worry about this putting an extra 300 bytes or so on the stack... I<br>

> think it would be nicer if there were a way to just make sure receive<br>

> posting was single-threaded, but I still don't know which contexts are<br>

> racing against each other...<br>

</tt><br>

<tt>The race can happen under this condition. Suppose we have 2 nodes and each node has one port, so we have 1 active IPoIB-CM non-SRQ connection on each node. Each connection has its own receive queue and posts its own receive buffers. Then another node joins the same IPoIB subnet. It builds another IPoIB-CM non-SRQ connection and does post_recv for its own receive queue, so ipoib_cm_post_receive_nonsrq() can be called from the new connection's ipoib_cm_nonsrq_init_rx() and from ipoib_cm_handle_rx_wc() at the same time. That's the race. Because both callers fill in the same shared work request, one QP's post_recv can go to another queue pair, which then causes random kernel panics because the skb buffers get mixed up. It also happens if any active connection becomes stale and a new connection is requested. In general, it's a race between new connection setup and handle_rx_wc() called from poll_cq().</tt><br>
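
<tt>To make the interleaving concrete, here is a minimal user-space sketch of the race. It is not the driver code: recv_wr, qp, dev_priv and the helper functions are hypothetical stand-ins for the real structures, and the bad ordering is forced by hand instead of relying on real concurrency. It only shows why a wr_id kept in shared per-device state can be overwritten before it is posted, and why a work request on the caller's stack cannot.</tt><br>

```c
#include <assert.h>

#define RX_RING_SIZE 4

/* Hypothetical stand-ins for the verbs structures, not the kernel definitions. */
struct recv_wr { unsigned long long wr_id; };
struct qp { unsigned long long posted[RX_RING_SIZE]; int count; };

/* Racy layout: one receive WR shared by every connection on the device. */
struct dev_priv { struct recv_wr rx_wr; };

/* Records which wr_id ended up on which QP. */
static void post_recv(struct qp *qp, const struct recv_wr *wr)
{
    assert(qp->count < RX_RING_SIZE);
    qp->posted[qp->count++] = wr->wr_id;
}

/* Racy path, split into two steps so the interleaving can be forced. */
static void shared_prepare(struct dev_priv *priv, unsigned long long id)
{
    priv->rx_wr.wr_id = id;          /* saved in shared state */
}

static void shared_post(struct dev_priv *priv, struct qp *qp)
{
    post_recv(qp, &priv->rx_wr);     /* reads whatever was saved last */
}

/* Fixed path: the WR lives on the caller's stack, so nobody else can touch it. */
static void local_post(struct qp *qp, unsigned long long id)
{
    struct recv_wr rx_wr = { id };
    post_recv(qp, &rx_wr);
}

/* Returns the wr_id that lands on connection A's QP with the shared WR. */
unsigned long long race_with_shared_wr(void)
{
    struct dev_priv priv = { { 0 } };
    struct qp qp_a = { { 0 }, 0 }, qp_b = { { 0 }, 0 };

    shared_prepare(&priv, 5);        /* A: completion path saves wr_id 5 */
    shared_prepare(&priv, 9);        /* B: new connection overwrites it  */
    shared_post(&priv, &qp_b);       /* B posts 9 to its own QP          */
    shared_post(&priv, &qp_a);       /* A posts 9 as well, not 5         */
    return qp_a.posted[0];
}

/* Same two callers, but each one uses its own local WR. */
unsigned long long race_with_local_wr(void)
{
    struct qp qp_a = { { 0 }, 0 }, qp_b = { { 0 }, 0 };

    local_post(&qp_b, 9);            /* B cannot disturb A's WR */
    local_post(&qp_a, 5);
    return qp_a.posted[0];
}
```

<tt>With the shared work request, connection A ends up posting B's wr_id. With the local one, it posts its own.</tt><br>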

<br>

<tt>Thanks</tt><br>

<tt>Shirley</tt></body></html>