[openib-general] slab error in kmem_cache_destroy(): cache `ib_mad': Can't free all objects

Michael S. Tsirkin mst at mellanox.co.il
Mon May 2 01:40:13 PDT 2005


Hi!
I have this script to unload all modules:

killall opensm
sleep 3
killall -9 opensm
modprobe -r ib_ipoib
modprobe -r ib_umad
modprobe -r ib_mthca

So the modules may get unloaded while opensm is still exiting.
Every now and then I see the crash below, which seems to indicate
a race condition or a leak somewhere around ib_mad or ib_umad.
My guess is that a MAD is still outstanding when the cache is destroyed.
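
One way to check whether the race with opensm teardown is the trigger
would be a variant of the script that waits until opensm has really
exited before unloading anything. This is only a sketch: wait_for_exit
is a made-up helper name, and it assumes pgrep is installed.

```shell
# wait_for_exit NAME SECONDS (hypothetical helper, not part of any package)
# Polls once per second until no process named NAME is left.
# Returns 0 once the process is gone, 1 if it is still there after SECONDS.
wait_for_exit() {
    name=$1
    remaining=$2
    while pgrep -x "$name" >/dev/null 2>&1; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Intended use in the unload script (left commented out here):
#   killall opensm
#   wait_for_exit opensm 3 || killall -9 opensm
#   wait_for_exit opensm 5 || exit 1
#   modprobe -r ib_ipoib
#   modprobe -r ib_umad
#   modprobe -r ib_mthca
```

If the slab error never shows up with this variant, that would point at
the teardown race rather than at a plain leak.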

Ideas, anyone?

Also, looking at the MAD agent logic, it seems a bit odd that
./core/agent.c calls kmem_cache_free in agent_send_handler,
while all the allocations are in mad.c. What prevents an agent from
deregistering while a send is outstanding? What would free the mad_priv then?

Log dump below.

Thanks,

MST

This is with 2.6.11 + rev 2235 (latest bits as of now), x86_64 (Intel Nocona).


May  2 10:36:12 swlab156 kernel: slab error in kmem_cache_destroy(): cache `ib_mad': Can't free all objects
May  2 10:36:12 swlab156 kernel: 
May  2 10:36:12 swlab156 kernel: Call Trace:<ffffffff801592af>{kmem_cache_destroy+184} <ffffffff88010714>{:ib_mad:ib_mad_cleanup_module+28} 
May  2 10:36:12 swlab156 kernel:        <ffffffff8014c044>{sys_delete_module+487} <ffffffff8022991c>{__up_write+28} 
May  2 10:36:12 swlab156 kernel:        <ffffffff80162952>{sys_munmap+74} <ffffffff8010e0d2>{system_call+126} 
May  2 10:36:12 swlab156 kernel:        
May  2 10:36:12 swlab156 kernel: ib_mad: Failed to destroy ib_mad cache


Any attempt to load ib_mad after that fails:


May  2 10:36:25 swlab156 kernel: kmem_cache_create: duplicate cache ib_mad
May  2 10:36:25 swlab156 kernel: ----------- [cut here ] --------- [please bite here ] ---------
May  2 10:36:25 swlab156 kernel: Kernel BUG at slab:1472
May  2 10:36:25 swlab156 kernel: invalid operand: 0000 [1] SMP 
May  2 10:36:25 swlab156 kernel: CPU 1 
May  2 10:36:25 swlab156 kernel: Modules linked in: ib_mad ib_core
May  2 10:36:25 swlab156 kernel: Pid: 14102, comm: modprobe Not tainted 2.6.11-openib
May  2 10:36:25 swlab156 kernel: RIP: 0010:[kmem_cache_create+1384/1539] <ffffffff801598b6>{kmem_cache_create+1384}
May  2 10:36:25 swlab156 kernel: RIP: 0010:[<ffffffff801598b6>] <ffffffff801598b6>{kmem_cache_create+1384}
May  2 10:36:25 swlab156 kernel: RSP: 0018:ffff81015d8c7ee8  EFLAGS: 00010202
May  2 10:36:25 swlab156 kernel: RAX: 000000000000002a RBX: ffff81015fd69670 RCX: ffffffff804572a8
May  2 10:36:25 swlab156 kernel: RDX: ffffffff804572a8 RSI: 0000000000000296 RDI: ffffffff8055f0c0
May  2 10:36:25 swlab156 kernel: RBP: ffff81015fd69480 R08: ffff81015e0976c0 R09: 0000000000000000
May  2 10:36:25 swlab156 kernel: R10: 0000000000000000 R11: 0000000000000080 R12: ffffffff8055f0c0
May  2 10:36:25 swlab156 kernel: R13: 0000000000002000 R14: ffff810000000000 R15: 0000000000000080
May  2 10:36:25 swlab156 kernel: FS:  00002aaaaade26e0(0000) GS:ffffffff80583180(0000) knlGS:0000000000000000
May  2 10:36:25 swlab156 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  2 10:36:25 swlab156 kernel: CR2: 00002aaaaaacc000 CR3: 000000013f67a000 CR4: 00000000000006e0
May  2 10:36:25 swlab156 kernel: Process modprobe (pid: 14102, threadinfo ffff81015d8c6000, task ffff81015dcc57f0)
May  2 10:36:25 swlab156 kernel: Stack: ffffffffffffff80 0000000000000000 0000000000000000 ffffffff88010951 
May  2 10:36:25 swlab156 kernel:        0000000000000180 ffffffff8045a000 ffffffff88013000 ffffffff80459fc0 
May  2 10:36:25 swlab156 kernel:        ffffffff80459fc0 00007ffffffff408 
May  2 10:36:25 swlab156 kernel: Call Trace:<ffffffff88015033>{:ib_mad:ib_mad_init_module+51} <ffffffff8014ba19>{sys_init_module+298} 
May  2 10:36:25 swlab156 kernel:        <ffffffff8010e0d2>{system_call+126} 
May  2 10:36:25 swlab156 kernel: 
May  2 10:36:25 swlab156 kernel: Code: 0f 0b e5 be 3e 80 ff ff ff ff c0 05 48 8b 1b 48 8b 03 0f 18 
May  2 10:36:25 swlab156 kernel: RIP <ffffffff801598b6>{kmem_cache_create+1384} RSP <ffff81015d8c7ee8>
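
To see whether objects are still allocated from the ib_mad cache right
before the unload, the active object count can be read from
/proc/slabinfo. A rough sketch: slab_active_objs is a made-up helper
name, and it assumes the slabinfo 2.x layout, where the first column is
the cache name and the second is the number of active objects.

```shell
# slab_active_objs CACHE [FILE] (hypothetical helper)
# Prints the active object count for CACHE, read from FILE
# (default /proc/slabinfo). Prints nothing if the cache is not listed.
# Assumes column 1 is the cache name and column 2 is <active_objs>.
slab_active_objs() {
    cache=$1
    file=${2:-/proc/slabinfo}
    awk -v c="$cache" '$1 == c { print $2 }' "$file"
}

# Intended use, just before "modprobe -r ib_mthca" (commented out here):
#   echo "ib_mad active objects: $(slab_active_objs ib_mad)"
```

A non-zero count at that point would fit the guess that a MAD is still
outstanding.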

-- 
MST - Michael S. Tsirkin


