[openib-general] [PATCH/RFC] IB/mad: Fix race between cancel and receive
Roland Dreier
rdreier at cisco.com
Mon Nov 6 20:33:25 PST 2006
I've been working on fixing http://openib.org/bugzilla/show_bug.cgi
which is an IPoIB crash. I figured out that the problem was a
use-after-free caused by two path record query completions being
generated, one with a successful status and one with a status of
-EINTR. In turn, this was caused by the sa_query module getting both
a successful receive completion and a send completion with status
IB_WC_WR_FLUSH_ERR.
I think I understand why this happens: in mad.c, ib_cancel_mad() calls
ib_modify_mad(), which ends up setting the timeout to 0 and
mad_send_wr->status to IB_WC_WR_FLUSH_ERR, and then calling
ib_reset_mad_timeout() to actually flush the MAD send. However, this
doesn't really flush the MAD immediately -- ib_reset_mad_timeout()
defers it to process context via wait_for_response(), which schedules
timed_work to run.
However, this leaves a window where a receive completion could be
polled before timeout_sends() gets a chance to run. This gives
ib_mad_complete_recv() a chance to grab the send request and generate
receive and send callbacks for it. The function does:
mad_send_wc.status = IB_WC_SUCCESS;
mad_send_wc.vendor_err = 0;
mad_send_wc.send_buf = &mad_send_wr->send_buf;
ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc);
so it tries to set mad_send_wc.status to IB_WC_SUCCESS, but
ib_mad_complete_send_wr() has:
if (mad_send_wr->status != IB_WC_SUCCESS )
mad_send_wc->status = mad_send_wr->status;
so the status field will be set to IB_WC_WR_FLUSH_ERR.
I don't believe we should generate receive callbacks for canceled
sends, so I came up with the patch below (much simpler than the
explanation that led up to it). I am no longer able to reproduce the
IPoIB crash with this applied so I feel pretty good about this.
Sean/Hal, does this make sense? If this looks good, I will merge this
for 2.6.19.
Thanks...
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 493f4c6..363db08 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1804,7 +1804,7 @@ static void ib_mad_complete_recv(struct
if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
- if (!mad_send_wr) {
+ if (!mad_send_wr || mad_send_wr->status != IB_WC_SUCCESS) {
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
ib_free_recv_mad(mad_recv_wc);
deref_mad_agent(mad_agent_priv);
More information about the general
mailing list