[ofa-general] CM sysfs-related oops on device driver reload

Roland Dreier rdreier at cisco.com
Thu Feb 7 22:32:11 PST 2008


It seems there's something wrong with the new CM sysfs stuff for
exporting statistics.  If I load ib_cm and ib_mthca, and then rmmod
ib_mthca and then reload it, I get the trace below.

I'm partly to blame here, since I rejiggered the CM code as part of
the 2.6.25 merge, but I think I did make things a little better ;) --
mostly I got the code to build, but I also tried to fix the object
lifetime a bit.  It is still suspicious that cm_counter_obj_type has
no .release method.
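
To illustrate why a missing .release method is a red flag, here is a
minimal userspace sketch of a reference-counted object whose type
supplies a release callback.  It is only a model: struct obj, struct
obj_type, obj_create(), obj_get(), obj_put(), counter_release() and
release_calls are hypothetical names made up for this example, and none
of this is the kernel's kobject code or the CM code.  The assumption
being illustrated is that the memory backing a counted object should be
freed from the release callback, when the last reference goes away, and
not at some earlier point chosen by the owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; NOT the kernel's struct kobject / kobj_type. */
struct obj;

struct obj_type {
	/* Called once, when the last reference is dropped. */
	void (*release)(struct obj *o);
};

struct obj {
	int refcount;
	const struct obj_type *type;
};

/* Bookkeeping so the behaviour of the sketch can be checked. */
static int release_calls;

static void counter_release(struct obj *o)
{
	release_calls++;
	free(o);
}

static const struct obj_type counter_type = {
	.release = counter_release,
};

/* Allocate an object holding one initial reference. */
static struct obj *obj_create(const struct obj_type *type)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->type = type;
	return o;
}

/* Take a reference; doing so on a dead object is a bug. */
static struct obj *obj_get(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount++;
	return o;
}

/* Drop a reference.  Returns 1 if this was the last one. */
static int obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount > 0)
		return 0;
	if (o->type && o->type->release)
		o->type->release(o);
	return 1;
}
```

In this model, an object created with obj_create(&counter_type) and
then pinned with obj_get() survives the first obj_put() and is freed by
counter_release() on the second.  With a type that has no release
callback, nothing in the refcounting path frees the object, so the
owner has to free it by hand and can easily do so while references are
still outstanding.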

Sean, I'm just letting you know about this in case you have a chance
to look at it -- I probably won't have time to work on it until next week.

 - R.

WARNING: at /scratch/Ksrc/linux-git/lib/kref.c:43 kref_get+0x1a/0x1f()
Modules linked in: ib_mthca(+) rdma_ucm rdma_cm iw_cm ib_addr ib_uverbs ib_ipoib ib_cm ib]
Pid: 2713, comm: modprobe Not tainted 2.6.24-dbg #8

Call Trace:
 [<ffffffff8022e071>] warn_on_slowpath+0x51/0x63
 [<ffffffff8030087c>] kvasprintf+0x44/0x6b
 [<ffffffff80280b2d>] poison_obj+0x26/0x2f
 [<ffffffff802fec4a>] vsnprintf+0x30f/0x571
 [<ffffffff80280c1b>] cache_alloc_debugcheck_after+0xe5/0x11e
 [<ffffffff80300895>] kvasprintf+0x5d/0x6b
 [<ffffffff802fb7e4>] kref_get+0x1a/0x1f
 [<ffffffff802faa5f>] kobject_get+0x12/0x17
 [<ffffffff802fab1a>] kobject_add_internal+0x4c/0x177
 [<ffffffff802fad1f>] kobject_add_varg+0x54/0x61
 [<ffffffff8024b031>] trace_hardirqs_on+0xef/0x113
 [<ffffffff802fa8a1>] kobject_init+0x42/0x82
 [<ffffffff802fad87>] kobject_init_and_add+0x5b/0x68
 [<ffffffff8024b031>] trace_hardirqs_on+0xef/0x113
 [<ffffffff880b78da>] :ib_mthca:mthca_query_device+0x24e/0x25e
 [<ffffffff881e4238>] :ib_cm:cm_add_one+0xd7/0x335
 [<ffffffff8024b031>] trace_hardirqs_on+0xef/0x113
 [<ffffffff88051340>] :ib_core:ib_register_device+0x3b8/0x3f1
 [<ffffffff80249508>] static_obj+0x5d/0x74
 [<ffffffff8024962d>] lockdep_init_map+0x81/0x3d2
 [<ffffffff880b6796>] :ib_mthca:mthca_register_device+0x3f9/0x44b
 [<ffffffff880a8f85>] :ib_mthca:__mthca_init_one+0x629/0x714
 [<ffffffff8042f1b6>] mutex_lock_nested+0x230/0x23f
 [<ffffffff880ba3fb>] :ib_mthca:mthca_init_one+0x7a/0x8e
 [<ffffffff80307ae2>] pci_device_probe+0xb3/0xfb
 [<ffffffff803698ba>] driver_probe_device+0xb5/0x132
 [<ffffffff80369a70>] __driver_attach+0x86/0xc3
 [<ffffffff803699ea>] __driver_attach+0x0/0xc3
 [<ffffffff803699ea>] __driver_attach+0x0/0xc3
 [<ffffffff80368c24>] bus_for_each_dev+0x47/0x72
 [<ffffffff803694db>] bus_add_driver+0xb1/0x1fa
 [<ffffffff80369ccd>] driver_register+0x59/0xce
 [<ffffffff80307d59>] __pci_register_driver+0x5a/0x8d
 [<ffffffff8800c1a3>] :ib_mthca:mthca_init+0x141/0x155
 [<ffffffff80252f14>] sys_init_module+0x18db/0x19e4
 [<ffffffff8027c27b>] alloc_pages_current+0x0/0x78
 [<ffffffff8024b031>] trace_hardirqs_on+0xef/0x113
 [<ffffffff8042fff7>] trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8020be5b>] system_call_after_swapgs+0x7b/0x80

---[ end trace 39d6f4ee281e2f49 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff802c699e>] sysfs_addrm_start+0x2f/0x9f
PGD 22f130067 PUD 227cc5067 PMD 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in: ib_mthca(+) rdma_ucm rdma_cm iw_cm ib_addr ib_uverbs ib_ipoib ib_cm ib]
Pid: 2713, comm: modprobe Not tainted 2.6.24-dbg #8
RIP: 0010:[<ffffffff802c699e>]  [<ffffffff802c699e>] sysfs_addrm_start+0x2f/0x9f
RSP: 0018:ffff81022b1a37b8  EFLAGS: 00010292
RAX: ffff81022b1a3748 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000022 RDI: ffff81022f03a248
RBP: ffff81022b1a37d8 R08: 0000000000000000 R09: ffff81022b1a3748
R10: 0000000000000000 R11: ffffffff802490c5 R12: 00000000fffffff4
R13: 0000000000000000 R14: ffff81022b1a3830 R15: ffff810227d83000
FS:  00007f511e1c16e0(0000) GS:ffff81022f07c300(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000229540000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2713, threadinfo ffff81022b1a2000, task ffff8102260ac080)
Stack:  0000000000000000 ffff810228dd20f0 ffff810229ce07b8 ffffffff802c6ee8
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 ffff810228dd20f0 ffff810228dd20f0 00000000fffffffe ffffffff881e9200
Call Trace:
 [<ffffffff802c6ee8>] ? create_dir+0x44/0x87
 [<ffffffff802c6f60>] ? sysfs_create_dir+0x35/0x4c
 [<ffffffff802faa5f>] ? kobject_get+0x12/0x17
 [<ffffffff802fab8a>] ? kobject_add_internal+0xbc/0x177
 [<ffffffff802fad1f>] ? kobject_add_varg+0x54/0x61
 [<ffffffff8024b031>] ? trace_hardirqs_on+0xef/0x113
 [<ffffffff802fa8a1>] ? kobject_init+0x42/0x82
 [<ffffffff802fad87>] ? kobject_init_and_add+0x5b/0x68
 [<ffffffff8024b031>] ? trace_hardirqs_on+0xef/0x113
 [<ffffffff880b78da>] ? :ib_mthca:mthca_query_device+0x24e/0x25e
 [<ffffffff881e4238>] ? :ib_cm:cm_add_one+0xd7/0x335
 [<ffffffff8024b031>] ? trace_hardirqs_on+0xef/0x113
 [<ffffffff88051340>] ? :ib_core:ib_register_device+0x3b8/0x3f1
 [<ffffffff80249508>] ? static_obj+0x5d/0x74
 [<ffffffff8024962d>] ? lockdep_init_map+0x81/0x3d2
 [<ffffffff880b6796>] ? :ib_mthca:mthca_register_device+0x3f9/0x44b
 [<ffffffff880a8f85>] ? :ib_mthca:__mthca_init_one+0x629/0x714
 [<ffffffff8042f1b6>] ? mutex_lock_nested+0x230/0x23f
 [<ffffffff880ba3fb>] ? :ib_mthca:mthca_init_one+0x7a/0x8e
 [<ffffffff80307ae2>] ? pci_device_probe+0xb3/0xfb
 [<ffffffff803698ba>] ? driver_probe_device+0xb5/0x132
 [<ffffffff80369a70>] ? __driver_attach+0x86/0xc3
 [<ffffffff803699ea>] ? __driver_attach+0x0/0xc3
 [<ffffffff803699ea>] ? __driver_attach+0x0/0xc3
 [<ffffffff80368c24>] ? bus_for_each_dev+0x47/0x72
 [<ffffffff803694db>] ? bus_add_driver+0xb1/0x1fa
 [<ffffffff80369ccd>] ? driver_register+0x59/0xce
 [<ffffffff80307d59>] ? __pci_register_driver+0x5a/0x8d
 [<ffffffff8800c1a3>] ? :ib_mthca:mthca_init+0x141/0x155
 [<ffffffff80252f14>] ? sys_init_module+0x18db/0x19e4
 [<ffffffff8027c27b>] ? alloc_pages_current+0x0/0x78
 [<ffffffff8024b031>] ? trace_hardirqs_on+0xef/0x113
 [<ffffffff8042fff7>] ? trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8020be5b>] ? system_call_after_swapgs+0x7b/0x80


Code: 08 00 00 00 fc 53 48 89 fd 48 89 f3 48 83 ec 08 f3 ab 48 89 75 00 48 c7 c7 90 8a 57
RIP  [<ffffffff802c699e>] sysfs_addrm_start+0x2f/0x9f
 RSP <ffff81022b1a37b8>
CR2: 0000000000000038


