[openib-general] [PATCH/RFC] IB/mad: Fix race between cancel and receive

Jack Morgenstein jackm at dev.mellanox.co.il
Tue Nov 7 06:03:16 PST 2006


On Tuesday 07 November 2006 06:33, Roland Dreier wrote:
> I don't believe we should generate receive callbacks for canceled
> sends, so I came up with the patch below (much simpler than the
> explanation that led up to it).  I am no longer able to reproduce the
> IPoIB crash with this applied so I feel pretty good about this.
>
... 
> 
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 493f4c6..363db08 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -1804,7 +1804,7 @@ static void ib_mad_complete_recv(struct
>  	if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
>  		spin_lock_irqsave(&mad_agent_priv->lock, flags);
>  		mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
> -		if (!mad_send_wr) {
> +		if (!mad_send_wr || mad_send_wr->status != IB_WC_SUCCESS) {
>  			spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
>  			ib_free_recv_mad(mad_recv_wc);
>  			deref_mad_agent(mad_agent_priv);
> 

I think you're correct architecturally regarding generating receive callbacks for
cancelled sends.  You need to check, though, that the above change does not result
in memory leaks or broken logic.

For example, in ipoib_main.c:ipoib_flush_paths(), if there is an outstanding query,
ib_sa_cancel_query() gets called.  The code then goes on to wait_for_completion() anyway
(assuming that even a cancelled query will result in a callback).
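
To make the concern concrete, here is a rough userspace sketch of that
cancel-then-wait sequence.  It is not the IPoIB code: every model_* name is
hypothetical, a plain flag stands in for struct completion, and the
callback_on_cancel switch only selects which behaviour is being modelled.  It
says nothing about which behaviour the MAD layer actually has after the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct completion. */
struct model_completion {
	bool done;
};

/* Hypothetical stand-in for a path with a possibly outstanding query. */
struct model_path {
	bool query_outstanding;
	struct model_completion done;
};

/* In this model the query callback is the only place that signals 'done'. */
static void model_query_callback(struct model_path *path)
{
	assert(path != NULL);
	path->query_outstanding = false;
	path->done.done = true;
}

/*
 * Cancel the query.  callback_on_cancel selects the modelled behaviour:
 *   true  - a cancelled query still results in a callback
 *   false - the cancelled query is dropped without any callback
 */
static void model_cancel_query(struct model_path *path, bool callback_on_cancel)
{
	if (callback_on_cancel)
		model_query_callback(path);
}

/*
 * Models the flush sequence: cancel if a query is outstanding, then wait.
 * Returns true if the wait would return, false if it would block forever.
 */
static bool model_flush_path(struct model_path *path, bool callback_on_cancel)
{
	if (path->query_outstanding)
		model_cancel_query(path, callback_on_cancel);
	return path->done.done;
}
```

In this model the flush only finishes when the cancel path still delivers a
callback; if the cancelled query is dropped silently, the waiter is never
woken.  That is the kind of caller assumption that would need checking.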

- Jack 
