[ofa-general] Re: lockdep question (was Re: IPoIB caused a kernel: BUG: softlockup detected on CPU#0!)
Michael S. Tsirkin
mst at mellanox.co.il
Mon Mar 12 07:20:13 PDT 2007
> Quoting Ingo Molnar <mingo at elte.hu>:
> Subject: Re: lockdep question (was Re: IPoIB caused a kernel: BUG: softlockup detected on CPU#0!)
>
>
> * Michael S. Tsirkin <mst at mellanox.co.il> wrote:
>
> > > could you turn on CONFIG_SLAB_DEBUG as well?
> > >
> > > that should catch certain types of use-after-free accesses, and
> > > lockdep will also warn if a still locked object is freed.
> >
> > Hmm, no, this does not look like use-after-free. I enabled
> > CONFIG_SLAB_DEBUG, and I still see the same message, so the memory was
> > not overwritten by slab debugger.
>
> that's still not conclusive - the memory might not have been allocated
> by slab again to detect it. Your magic-number check definitely shows
> some sort of corruption going on, right?
Not necessarily in such a direct way.
I currently think we are somehow getting neighbours where
neigh->dev points to a loopback device: that's type 772
(ARPHRD_LOOPBACK), so the type I saw makes sense.
I printed out the device name and, sure enough, it is "lo".
Is it true that sticking the following

static int ipoib_neigh_setup_dev(struct net_device *dev,
				 struct neigh_parms *parms)
{
	parms->neigh_destructor = ipoib_neigh_destructor;
	return 0;
}

in dev->neigh_setup, as IPoIB does, guarantees that neighbour->dev will point to
that same device for any neighbour passed to ipoib_neigh_destructor?
That's the assumption IPoIB makes, and it seems broken in this instance.
How could that be?
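
To make the assumption concrete, here is a minimal userspace sketch, not kernel code. The mock_* structs stand in for struct net_device and struct neighbour, and neigh_dev_is_ipoib() is a hypothetical helper, not an existing kernel or IPoIB function. The two ARPHRD_* values are the ones defined in <linux/if_arp.h>. It only shows how a destructor could detect a neighbour whose dev is not an IPoIB device; it does not explain why such a neighbour shows up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hardware type values as defined in <linux/if_arp.h>. */
#define ARPHRD_INFINIBAND 32
#define ARPHRD_LOOPBACK   772

/* Mock stand-ins for the kernel structs; only the fields used here. */
struct mock_net_device {
	const char *name;
	unsigned short type;
};

struct mock_neighbour {
	struct mock_net_device *dev;
};

/*
 * Hypothetical guard: true only if the neighbour hangs off an
 * InfiniBand-type device, i.e. the case the destructor expects.
 */
bool neigh_dev_is_ipoib(const struct mock_neighbour *n)
{
	return n != NULL && n->dev != NULL &&
	       n->dev->type == ARPHRD_INFINIBAND;
}

/* Small demonstration: an "ib0" neighbour passes, an "lo" one does not. */
void neigh_check_demo(void)
{
	struct mock_net_device lo  = { "lo",  ARPHRD_LOOPBACK };
	struct mock_net_device ib0 = { "ib0", ARPHRD_INFINIBAND };
	struct mock_neighbour on_lo  = { &lo };
	struct mock_neighbour on_ib0 = { &ib0 };

	assert(neigh_dev_is_ipoib(&on_ib0));
	assert(!neigh_dev_is_ipoib(&on_lo));
}
```

A check of this shape at the top of the destructor would turn the bad case into an early return plus a warning instead of a dereference of the wrong device's private data.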
--
MST