[openib-general] kernel oops upon unloading ib_sa module

Jack Morgenstein jackm at mellanox.co.il
Sun Sep 18 08:41:43 PDT 2005


We loaded and unloaded the openib stack in a tight loop from a shell script.
After ib_ipoib had been unloaded, and while ib_sa was being unloaded, we got
the oops below. It seems that ipoib did not clean up completely at module
unload and left a callback (to be invoked by ib_sa) that was not properly
cancelled. Note that the failure did not occur immediately, but only after
about 15-20 iterations of the script.
(openib svn version 3450)
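
To make the suspected sequence concrete, here is a toy user-space model of it. This is only a sketch, not the ipoib or ib_sa code: the class and function names (QueryService, Client, run_cycle) are made up for illustration, and the "oops" is simulated with an exception. It shows a client that registers a callback with a service, goes away with or without cancelling it, and a service that flushes its pending queries at shutdown, much as the trace below shows the mcmember callback being reached through cancel_mads during MAD agent unregistration.

```python
class QueryService:
    """Hypothetical stand-in for a query service that holds client callbacks."""

    def __init__(self):
        self._pending = {}
        self._next_id = 0

    def submit(self, callback):
        """Register a pending query and return its id."""
        qid = self._next_id
        self._next_id += 1
        self._pending[qid] = callback
        return qid

    def cancel(self, qid):
        """Drop a pending query; return True if it was still pending."""
        return self._pending.pop(qid, None) is not None

    def shutdown(self):
        """Flush every pending query by invoking its callback."""
        pending, self._pending = self._pending, {}
        for callback in pending.values():
            callback(status="flushed")
        return len(pending)


class Client:
    """Hypothetical stand-in for a module that issues a query at load time."""

    def __init__(self, service):
        self.service = service
        self.loaded = True
        self.query_id = service.submit(self.on_reply)

    def on_reply(self, status):
        # In the kernel this would be a jump into unmapped module text;
        # here the failure is simulated with an exception.
        if not self.loaded:
            raise RuntimeError("callback invoked after client unload")

    def unload(self, cancel_pending):
        if cancel_pending:
            self.service.cancel(self.query_id)
        self.loaded = False


def run_cycle(cancel_pending):
    """One load/unload cycle: client goes away first, then the service."""
    service = QueryService()
    client = Client(service)
    client.unload(cancel_pending)
    try:
        service.shutdown()
    except RuntimeError as exc:
        return f"oops: {exc}"
    return "clean"


if __name__ == "__main__":
    print(run_cycle(cancel_pending=True))
    print(run_cycle(cancel_pending=False))
```

In this model the cycle only fails when the client unloads without cancelling its query. The real failure needed 15-20 iterations, which suggests a query has to be outstanding at exactly the wrong moment; the model does not try to reproduce that timing.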

Jack
============================================================================
Sep 19 01:57:10 swlab32 kernel: ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 20 (level, low) -> IRQ 21
Sep 19 01:57:15 swlab32 kernel: Unable to handle kernel paging request at virtual address f898f350
Sep 19 01:57:15 swlab32 kernel:  printing eip:
Sep 19 01:57:15 swlab32 kernel: f898f350
Sep 19 01:57:15 swlab32 kernel: *pde = 0183b067
Sep 19 01:57:15 swlab32 kernel: *pte = 00000000
Sep 19 01:57:15 swlab32 kernel: Oops: 0000 [#1]
Sep 19 01:57:15 swlab32 kernel: PREEMPT SMP
Sep 19 01:57:15 swlab32 kernel: Modules linked in: ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core
Sep 19 01:57:15 swlab32 kernel: CPU:    0
Sep 19 01:57:15 swlab32 kernel: EIP:    0060:[<f898f350>]    Not tainted VLI
Sep 19 01:57:15 swlab32 kernel: EFLAGS: 00010246   (2.6.13)
Sep 19 01:57:15 swlab32 kernel: EIP is at 0xf898f350
Sep 19 01:57:15 swlab32 kernel: eax: 00000000   ebx: 00000286   ecx: f897a9d0   edx: 00000001
Sep 19 01:57:15 swlab32 kernel: esi: e1b222a0   edi: e1b222a8   ebp: fffffffc   esp: d8d3ddfc
Sep 19 01:57:15 swlab32 kernel: ds: 007b   es: 007b   ss: 0068
Sep 19 01:57:15 swlab32 ifdown: Interface not available and no configuration found.
Sep 19 01:57:17 swlab32 ifdown: Interface not available and no configuration found.
Sep 19 01:57:23 swlab32 kernel: Process modprobe (pid: 19620, threadinfo=d8d3c000 task=f62f0a20)
Sep 19 01:57:23 swlab32 kernel: Stack: f897aa53 fffffffc 00000000 da38f180 f8999bb3 d97da000 00000000 00000000
Sep 19 01:57:23 swlab32 kernel:        d8d3de38 00000000 00000026 00000001 0000000f 00003a98 d8d3de93 00000000
Sep 19 01:57:23 swlab32 kernel:        00000000 d97da000 f416c8c0 f4554c00 d948c980 00000286 e1b222a8 d8d3de98
Sep 19 01:57:23 swlab32 ifdown: Interface not available and no configuration found.
Sep 19 01:57:23 swlab32 kernel: Call Trace:
Sep 19 01:57:24 swlab32 kernel:  [<f897aa53>] ib_sa_mcmember_rec_callback+0x83/0xa0 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<f8999bb3>] mthca_cmd_box+0x83/0xf0 [ib_mthca]
Sep 19 01:57:24 swlab32 kernel:  [<f897ad04>] send_handler+0xd4/0x110 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<f896dfe0>] cancel_mads+0x130/0x180 [ib_mad]
Sep 19 01:57:24 swlab32 kernel:  [<f896b823>] unregister_mad_agent+0x13/0x150 [ib_mad]
Sep 19 01:57:24 swlab32 kernel:  [<c042148f>] _spin_unlock_irqrestore+0xf/0x30
Sep 19 01:57:24 swlab32 kernel:  [<f897a000>] free_sm_ah+0x0/0x30 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<f89a7a4e>] mthca_ah_destroy+0x1e/0x30 [ib_mthca]
Sep 19 01:57:24 swlab32 kernel:  [<f8960b86>] ib_destroy_ah+0x16/0x30 [ib_core]
Sep 19 01:57:24 swlab32 kernel:  [<f897a000>] free_sm_ah+0x0/0x30 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<c021d925>] kref_put+0x45/0xc0
Sep 19 01:57:24 swlab32 kernel:  [<f896ba49>] ib_unregister_mad_agent+0x19/0x30 [ib_mad]
Sep 19 01:57:24 swlab32 kernel:  [<f897b02c>] ib_sa_remove_one+0x6c/0xa0 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<f897a000>] free_sm_ah+0x0/0x30 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<f8962b67>] ib_unregister_client+0xc7/0xf0 [ib_core]
Sep 19 01:57:24 swlab32 kernel:  [<c013b398>] try_stop_module+0x38/0x40
Sep 19 01:57:24 swlab32 kernel:  [<c013b310>] __try_stop_module+0x0/0x50
Sep 19 01:57:24 swlab32 kernel:  [<f897b06f>] ib_sa_cleanup+0xf/0x11 [ib_sa]
Sep 19 01:57:24 swlab32 kernel:  [<c013b5ca>] sys_delete_module+0x19a/0x1b0
Sep 19 01:57:24 swlab32 kernel:  [<c0150061>] get_init_ra_size+0x41/0x90
Sep 19 01:57:24 swlab32 kernel:  [<c015c991>] sys_munmap+0x51/0x80
Sep 19 01:57:24 swlab32 kernel:  [<c0103245>] syscall_call+0x7/0xb
Sep 19 01:57:24 swlab32 kernel: Code:  Bad EIP value.