[openib-general] [OOPS]: unloaded ib_mthca out from under opensm causes oops

Hal Rosenstock halr at voltaire.com
Fri Jun 3 13:26:48 PDT 2005


On Fri, 2005-06-03 at 15:39, Tom Duffy wrote: 
> I had opensm running on this node and, before stopping opensm, unloaded
> ib_mthca, which caused an oops in the kernel.
> 
> [root@flopteron2 ~]# rmmod ib_mthca
> [root@flopteron2 ~]# general protection fault: 0000 [1] SMP
> CPU 1
> Modules linked in: ib_ipoib ib_sa ib_umad ib_mad ib_core nfs lockd md5 ipv6 parport_pc lp parport autofs4 sunrpc pcmcia yenta_socket rsrc_nonstatic pcmcia_core ext3 jbd video container button battery ac uhci_hcd ehci_hcd hw_random i2c_i801 i2c_core e1000 floppy dm_snapshot dm_zero dm_mirror xfs exportfs dm_mod mptscsih mptbase sd_mod scsi_mod
> Pid: 8544, comm: opensm Not tainted 2.6.12-rc5openib
> RIP: 0010:[<ffffffff881199f0>] <ffffffff881199f0>{:ib_core:ib_create_ah+0}
> RSP: 0018:ffff8100319e5e70  EFLAGS: 00010246
> RAX: ffff81002fe97080 RBX: ffffffffffffffea RCX: ffff81001deb6088
> RDX: ffff8100021e6d20 RSI: ffff8100319e5ea8 RDI: 6b6b6b6b6b6b6b6b
> RBP: 0000000000000130 R08: 0000000000000000 R09: 0100000001018101
> R10: 7f12000000000000 R11: 0000000000000000 R12: ffff81001deb6040
> R13: 0000000000591780 R14: ffff81002e36d4b0 R15: 0000000000000100
> FS:  0000000043806960(0063) GS:ffffffff804e8100(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fffff82b106 CR3: 000000002d408000 CR4: 00000000000006e0
> Process opensm (pid: 8544, threadinfo ffff8100319e4000, task ffff81003b81e1b0)
> Stack: ffffffff88131f3f 0000000000000000 000000000028195d 0000000000000360
>        ffff8100ffffffff ffff8100021e6d20 ffff81001deb6078 0000000000000000
>        0000000000000000 0000000000000000
> Call Trace:<ffffffff88131f3f>{:ib_umad:ib_umad_write+367} <ffffffff80183dda>{vfs_write+218}
>        <ffffffff801843b3>{sys_write+83} <ffffffff8010ea02>{system_call+126}
> 
> 
> Code: 48 8b 07 53 48 89 fb ff 90 20 01 00 00 48 3d 18 fc ff ff 48
> RIP <ffffffff881199f0>{:ib_core:ib_create_ah+0} RSP <ffff8100319e5e70>

user_mad (as well as ucm) needs to return an error from any read/write
file operation on a device whose driver has been removed. Driver removal
with open fd(s) may need more handling than just this.
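
As a rough illustration of the idea (a user-space sketch, not the actual
ib_umad code): the RDI value 6b6b6b6b6b6b6b6b in the trace matches the
kernel's slab poison byte for freed memory (0x6b), which is consistent
with ib_umad_write handing ib_create_ah a pointer read from an object
that was freed when the driver went away. One way to avoid that is to
keep a per-port object alive until the last close, clear its device
pointer on removal, and have write fail with -ENODEV afterwards. All of
the sketch_* names below are hypothetical, and the locking that a real
driver would need between the check and the removal is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins for illustration; not the real ib_umad types. */
struct sketch_device {
    int sends;                  /* counts MADs "sent" through the device */
};

struct sketch_port {
    struct sketch_device *dev;  /* NULL once the driver has been removed */
};

struct sketch_file {
    struct sketch_port *port;   /* assumed to outlive the device itself */
};

static void sketch_port_init(struct sketch_port *port,
                             struct sketch_device *dev)
{
    port->dev = dev;
}

/*
 * Driver removal path: detach the device so later I/O fails cleanly
 * instead of touching freed memory. A real implementation would also
 * have to serialize this against in-flight reads and writes.
 */
static void sketch_remove_device(struct sketch_port *port)
{
    port->dev = NULL;
}

/* write() handler: refuse to touch a device that is already gone. */
static ssize_t sketch_write(struct sketch_file *file,
                            const void *buf, size_t len)
{
    assert(file != NULL && file->port != NULL);
    (void)buf;                  /* payload is not inspected in this sketch */

    if (file->port->dev == NULL)
        return -ENODEV;         /* driver removed while the fd was open */

    file->port->dev->sends++;   /* stands in for ib_create_ah + post send */
    return (ssize_t)len;
}
```

With this shape, a write issued before sketch_remove_device() returns the
byte count, and the same write issued afterwards returns -ENODEV rather
than dereferencing a stale pointer.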

-- Hal



