[ofa-general] Re: [PATCH] - [resend] Corrects a race in ipoib_cm_post_receive_nonsrq()

Roland Dreier rdreier at cisco.com
Wed Jun 25 18:44:21 PDT 2008


 > Corrects a race condition in ipoib_cm_post_receive_nonsrq()
 > that allows WQEs from one QP context to be posted (post_recv)
 > to another QP context.  ipoib_cm_post_receive_nonsrq()
 > saves the wr_id in the shared structure ipoib_cm_dev_priv,
 > making it possible for the saved wr_id to be overwritten by
 > a subsequent event and posted to the incorrect QP context.
 > The patch switches to a local variable to save the wr_id.

What subsequent event?  Is this due to connection requests coming in and
colliding with each other?  Or a connection request colliding with
posting receives from the completion processing context?
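
Whichever two contexts are involved, the failure mode the description points at would look roughly like the userspace sketch below. It is only an illustration under stated assumptions: the types recv_wr, qp and dev_priv and the helpers fill_wr() and post_wr() are hypothetical stand-ins, not the kernel's ib_recv_wr or ipoib_cm_dev_priv, and the "race" is replayed as a fixed interleaving rather than with real concurrency.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the verbs types; not the kernel definitions. */
struct recv_wr { uint64_t wr_id; };
struct qp      { uint64_t last_posted; };

/* Shared scratch WR, analogous to keeping the WR in the per-device struct. */
struct dev_priv { struct recv_wr rx_wr; };

/* Step 1 of a post: fill in the work request. */
static void fill_wr(struct recv_wr *wr, uint64_t id)
{
    wr->wr_id = id;
}

/* Step 2 of a post: hand the work request to a QP. */
static void post_wr(struct qp *qp, const struct recv_wr *wr)
{
    assert(qp && wr);
    qp->last_posted = wr->wr_id;
}

/* Two contexts interleave their steps on the shared scratch WR.
 * Returns the wr_id that ends up posted to QP A. */
static uint64_t interleaved_shared(struct dev_priv *priv,
                                   struct qp *a, struct qp *b)
{
    fill_wr(&priv->rx_wr, 1);   /* context A prepares wr_id 1 */
    fill_wr(&priv->rx_wr, 2);   /* context B overwrites it with wr_id 2 */
    post_wr(a, &priv->rx_wr);   /* context A posts and picks up B's wr_id */
    post_wr(b, &priv->rx_wr);
    return a->last_posted;
}

/* Same interleaving, but each context owns a stack-local WR. */
static uint64_t interleaved_local(struct qp *a, struct qp *b)
{
    struct recv_wr wr_a, wr_b;

    fill_wr(&wr_a, 1);
    fill_wr(&wr_b, 2);
    post_wr(a, &wr_a);          /* QP A sees its own wr_id */
    post_wr(b, &wr_b);
    return a->last_posted;
}
```

With the shared scratch WR, QP A ends up with the wr_id that was meant for QP B; with per-call locals the same interleaving is harmless.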

 > Signed-off-by: Pradeep Satyanarayana <pradeep at us.ibm.com>

I still don't understand what Pradeep's sign-off is doing here... as
this email stands, basically what it is saying is that David wrote the
patch, then Pradeep sent it to David, and David is sending it to me.
Which is nonsensical.  Do you just mean that Pradeep was also involved
in writing the patch?

 > -	struct ib_recv_wr *bad_wr;
 > +	struct ib_recv_wr *bad_wr, rx_wr;
 > +	struct ib_sge	   rx_sge[IPOIB_CM_RX_SG];

I worry about this putting an extra 300 bytes or so on the stack... I
think it would be nicer if there were a way to just make sure receive
posting was single-threaded, but I still don't know which contexts are
racing against each other...
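
For comparison, a minimal sketch of the "single-threaded posting" alternative: keep the scratch WR in the shared structure and serialize every fill-and-post sequence with a lock, so nothing extra goes on the stack. Everything here is an assumption for illustration: rx_lock, post_receive_locked() and the simplified types are hypothetical, and a C11 atomic_flag spin loop stands in for whatever kernel locking primitive would actually be appropriate for the contexts involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the verbs types; not the kernel definitions. */
struct recv_wr { uint64_t wr_id; };
struct qp      { uint64_t last_posted; };

struct dev_priv {
    atomic_flag    rx_lock;  /* hypothetical lock guarding rx_wr */
    struct recv_wr rx_wr;    /* shared scratch WR, kept off the stack */
};

/* Fill and post under the lock so no other context can overwrite
 * rx_wr between the two steps. */
static void post_receive_locked(struct dev_priv *priv, struct qp *qp,
                                uint64_t id)
{
    assert(priv && qp);

    while (atomic_flag_test_and_set_explicit(&priv->rx_lock,
                                             memory_order_acquire))
        ;   /* spin until the lock is free */

    priv->rx_wr.wr_id = id;                /* fill the shared WR */
    qp->last_posted = priv->rx_wr.wr_id;   /* "post" it to this QP */

    atomic_flag_clear_explicit(&priv->rx_lock, memory_order_release);
}
```

The trade-off is the usual one: the local-variable approach costs stack space on every call, while the locked approach costs a lock acquisition and only works if every posting path takes the same lock.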


