[ofa-general] Possible memory leak and null pointer dereference in local_completions()
Ralph Campbell
ralph.campbell at qlogic.com
Wed Feb 4 11:58:05 PST 2009
On Wed, 2009-02-04 at 04:29 -0800, Hal Rosenstock wrote:
> On Tue, Feb 3, 2009 at 2:26 PM, Ralph Campbell
> <ralph.campbell at qlogic.com> wrote:
> > I was doing some tests with different MAD packets and
> > then reading the infiniband/core/mad.c code.
> >
> > handle_outgoing_dr_smp() can queue a struct ib_mad_local_private *local
> > on the mad_agent_priv->local_work work queue with
> > local->mad_priv == NULL if device->process_mad() returns
> > IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
> > (!ib_response_mad(&mad_priv->mad.mad) ||
> > !mad_agent_priv->agent.recv_handler).
> >
> > In this case, local_completions() will be called with
> > local->mad_priv == NULL. The code does check for this
> > case and skips calling recv_mad_agent->agent.recv_handler().
> > This means recv == 0 so kmem_cache_free() is called with a
> > NULL pointer.
>
> That could be fixed by changing the check for !recv prior to the
> kmem_cache_free there to a check for (!recv && local->mad_priv).
This is what we did to continue making progress so I know
it works.
> > Even if local->mad_priv != NULL, I don't see how local->mad_priv
> > is freed when recv == 1. Thus, it appears to be a memory leak.
>
> For those cases, it's either freed in local_completions (as recv is
> set to 1 for local->mad_priv != NULL except when there is no mad recv
> agent but that is another bug (see below)) or earlier in the else
> clause of the IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY of
> handle_outgoing_dr_smp(). That's another issue that this points out
> where recv = 1 needs to be moved up in local_completions.
The other problem I noticed with setting recv = 1, is that recv = 0
is outside the while (!list_empty) loop so it is never reset back
to zero.
I'm not really following you about recv = 1 needs to be moved up in
local_completions.
What I was really looking for was a confirmation that the original
code had a memory leak. I don't see any reason to special case the
call to kmem_cache_free(). It seems to me that it is needed any time
local->mad_priv != NULL.
The NULL pointer bug is easily fixed in a number of different ways.
> Would you try the untested patch below and see if it fixes the problem
> you found ? Thanks.
We are in the middle of moving our office so I won't be able to
reproduce this until next week.
> -- Hal
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 5c54fc2..cca87e6 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -2371,13 +2371,13 @@ static void local_completions(struct work_struct *work)
> list_del(&local->completion_list);
> spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
> if (local->mad_priv) {
> + recv = 1;
> recv_mad_agent = local->recv_mad_agent;
> if (!recv_mad_agent) {
> printk(KERN_ERR PFX "No receive MAD agent for lo
> goto local_send_completion;
> }
>
> - recv = 1;
> /*
> * Defined behavior is to complete response
> * before request
> @@ -2422,7 +2422,7 @@ local_send_completion:
>
> spin_lock_irqsave(&mad_agent_priv->lock, flags);
> atomic_dec(&mad_agent_priv->refcount);
> - if (!recv)
> + if (!recv && local->mad_priv)
> kmem_cache_free(ib_mad_cache, local->mad_priv);
> kfree(local);
> }
>
> > So, I'm proposing the following patch:
> >
> > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> > index 5c54fc2..93d80e5 100644
> > --- a/drivers/infiniband/core/mad.c
> > +++ b/drivers/infiniband/core/mad.c
> > @@ -2356,7 +2356,6 @@ static void local_completions(struct work_struct *work)
> > struct ib_mad_local_private *local;
> > struct ib_mad_agent_private *recv_mad_agent;
> > unsigned long flags;
> > - int recv = 0;
> > struct ib_wc wc;
> > struct ib_mad_send_wc mad_send_wc;
> >
> > @@ -2377,7 +2376,6 @@ static void local_completions(struct work_struct *work)
> > goto local_send_completion;
> > }
> >
> > - recv = 1;
> > /*
> > * Defined behavior is to complete response
> > * before request
> > @@ -2422,7 +2420,7 @@ local_send_completion:
> >
> > spin_lock_irqsave(&mad_agent_priv->lock, flags);
> > atomic_dec(&mad_agent_priv->refcount);
> > - if (!recv)
> > + if (local->mad_priv)
> > kmem_cache_free(ib_mad_cache, local->mad_priv);
> > kfree(local);
> > }
> >
> >
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >
More information about the general
mailing list