[ofa-general] Possible memory leak and null pointer dereference in local_completions()
Hal Rosenstock
hal.rosenstock at gmail.com
Wed Feb 4 04:29:36 PST 2009
On Tue, Feb 3, 2009 at 2:26 PM, Ralph Campbell
<ralph.campbell at qlogic.com> wrote:
> I was doing some tests with different MAD packets and
> then reading the infiniband/core/mad.c code.
>
> handle_outgoing_dr_smp() can queue a struct ib_mad_local_private *local
> on the mad_agent_priv->local_work work queue with
> local->mad_priv == NULL if device->process_mad() returns
> IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
> (!ib_response_mad(&mad_priv->mad.mad) ||
> !mad_agent_priv->agent.recv_handler).
>
> In this case, local_completions() will be called with
> local->mad_priv == NULL. The code does check for this
> case and skips calling recv_mad_agent->agent.recv_handler().
> This means recv == 0 so kmem_cache_free() is called with a
> NULL pointer.
That could be fixed by changing the check for !recv prior to the
kmem_cache_free there to a check for (!recv && local->mad_priv).
> Even if local->mad_priv != NULL, I don't see how local->mad_priv
> is freed when recv == 1. Thus, it appears to be a memory leak.
For those cases, it's either freed in local_completions (as recv is
set to 1 for local->mad_priv != NULL except when there is no mad recv
agent but that is another bug (see below)) or earlier in the else
clause of the IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY of
handle_outgoing_dr_smp(). That's another issue that this points out
where recv = 1 needs to be moved up in local_completions.
Would you try the untested patch below and see if it fixes the problem
you found ? Thanks.
-- Hal
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 5c54fc2..cca87e6 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2371,13 +2371,13 @@ static void local_completions(struct work_struct *work)
list_del(&local->completion_list);
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
if (local->mad_priv) {
+ recv = 1;
recv_mad_agent = local->recv_mad_agent;
if (!recv_mad_agent) {
printk(KERN_ERR PFX "No receive MAD agent for lo
goto local_send_completion;
}
- recv = 1;
/*
* Defined behavior is to complete response
* before request
@@ -2422,7 +2422,7 @@ local_send_completion:
spin_lock_irqsave(&mad_agent_priv->lock, flags);
atomic_dec(&mad_agent_priv->refcount);
- if (!recv)
+ if (!recv && local->mad_priv)
kmem_cache_free(ib_mad_cache, local->mad_priv);
kfree(local);
}
> So, I'm proposing the following patch:
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 5c54fc2..93d80e5 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -2356,7 +2356,6 @@ static void local_completions(struct work_struct *work)
> struct ib_mad_local_private *local;
> struct ib_mad_agent_private *recv_mad_agent;
> unsigned long flags;
> - int recv = 0;
> struct ib_wc wc;
> struct ib_mad_send_wc mad_send_wc;
>
> @@ -2377,7 +2376,6 @@ static void local_completions(struct work_struct *work)
> goto local_send_completion;
> }
>
> - recv = 1;
> /*
> * Defined behavior is to complete response
> * before request
> @@ -2422,7 +2420,7 @@ local_send_completion:
>
> spin_lock_irqsave(&mad_agent_priv->lock, flags);
> atomic_dec(&mad_agent_priv->refcount);
> - if (!recv)
> + if (local->mad_priv)
> kmem_cache_free(ib_mad_cache, local->mad_priv);
> kfree(local);
> }
>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>
More information about the general
mailing list