[ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5

David Dillow dillowda at ornl.gov
Thu Dec 27 09:53:38 PST 2007


On Thu, 2007-12-27 at 11:58 +0900, FUJITA Tomonori wrote:
> On Wed, 26 Dec 2007 12:14:11 -0500
> David Dillow <dillowda at ornl.gov> wrote:
> 
> > 
> > On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> > > transport_container_unregister(&i->rport_attr_cont) should not fail here.
> > > 
> > > It fails because there is still a srp rport.
> > > 
> > > I think that as Pete pointed out, srp_remove_one needs to call
> > > srp_remove_host.
> > > 
> > > Can you try this?
> > 
> > That patch oopsed in scsi_remove_host(), but reversing the order has
> > survived over 500 insert/probe/remove cycles.
> 
> Thanks,
> 
> Can you post the oops message? The srp class might have bugs related
> to it.

This is the oops generated by doing srp_remove_host() prior to
scsi_remove_host() in 2.6.24-rc5:

Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP: 
 [<ffffffff811d058d>] klist_del+0xa/0x46
PGD 8450d8067 PUD 843cbd067 PMD 0 
Oops: 0000 [1] SMP 
CPU 3 
Modules linked in: sg sd_mod ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_mod ib_cm ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core ehci_hcd ohci_hcd nfs lockd nfs_acl sunrpc unionfs forcedeth
Pid: 2450, comm: rmmod Not tainted 2.6.24-rc5 #2
RIP: 0010:[<ffffffff811d058d>]  [<ffffffff811d058d>] klist_del+0xa/0x46
RSP: 0018:ffff81084192bd28  EFLAGS: 00010282
RAX: ffff81084600b000 RBX: 0000000000000000 RCX: ffffe2001ce562c8
RDX: 0000000000000000 RSI: ffff810447c1d000 RDI: ffff81084657f050
RBP: ffff81084657f028 R08: ffff810447c1d000 R09: ffff8108455a1800
R10: ffff8108455a1800 R11: ffff810846730808 R12: ffff81084657f050
R13: ffff810844c4a170 R14: ffff81084657f028 R15: 0000000000000880
FS:  00002afbf1b0b6e0(0000) GS:ffff810846531840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000843c56000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2450, threadinfo ffff81084192a000, task ffff810844d47620)
Stack:  ffff810844c4a000 ffff81084657f028 ffff81084657f000 ffffffff8114cbd6
 ffff810846730808 ffff810844c4a000 ffff81084657f028 ffff81084657f000
 0000000000000246 ffffffff88118322 ffff8108455a1800 ffff81084657f000
Call Trace:
 [<ffffffff8114cbd6>] device_del+0x20/0x2f0
 [<ffffffff88118322>] :scsi_mod:scsi_target_reap_usercontext+0x53/0xbd
 [<ffffffff810455ce>] execute_in_process_context+0x20/0x47
 [<ffffffff8811a4da>] :scsi_mod:scsi_device_dev_release_usercontext+0xd3/0x105
 [<ffffffff810455ce>] execute_in_process_context+0x20/0x47
 [<ffffffff810ed9b8>] kobject_cleanup+0x2f/0x51
 [<ffffffff810ed9da>] kobject_release+0x0/0x9
 [<ffffffff810ee692>] kref_put+0x74/0x82
 [<ffffffff88119f02>] :scsi_mod:scsi_forget_host+0x53/0x55
 [<ffffffff88112018>] :scsi_mod:scsi_remove_host+0x76/0xf7
 [<ffffffff8813d161>] :ib_srp:srp_remove_one+0x102/0x19d
 [<ffffffff880ac2bc>] :ib_core:ib_unregister_client+0x40/0xb3
 [<ffffffff8813d20a>] :ib_srp:srp_cleanup_module+0xe/0x34
 [<ffffffff810551f1>] sys_delete_module+0x18d/0x1bc
 [<ffffffff811d3879>] error_exit+0x0/0x51
 [<ffffffff8100be6e>] system_call+0x7e/0x83


Code: 48 8b 6b 20 48 89 df e8 b7 2f 00 00 4c 89 e7 e8 d2 ff ff ff 
RIP  [<ffffffff811d058d>] klist_del+0xa/0x46
 RSP <ffff81084192bd28>
CR2: 0000000000000020
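
For anyone reading the register dump: CR2 is 0x20 and RBX is 0, and the first Code: bytes (48 8b 6b 20) decode to mov 0x20(%rbx),%rbp, assuming the Code: line starts at the faulting instruction. That pattern is what a load of a structure member at offset 0x20 through a NULL base pointer looks like. The sketch below only illustrates that arithmetic. The struct demo_klist layout and the fault_address() helper are hypothetical stand-ins, not the kernel's struct klist, and which pointer was actually NULL in klist_del() is not established here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, chosen only so that one member lands at
 * offset 0x20. It is NOT copied from the kernel headers.
 */
struct demo_klist {
	uint32_t lock;       /* offset 0x00 */
	uint32_t pad;        /* offset 0x04, explicit padding */
	uint64_t list_next;  /* offset 0x08 */
	uint64_t list_prev;  /* offset 0x10 */
	uint64_t get;        /* offset 0x18 */
	uint64_t put;        /* offset 0x20 */
};

/*
 * Address the CPU would try to read when loading the member at
 * offset 0x20 from a structure located at `base`. With base == 0
 * (a NULL pointer) the result is 0x20, matching CR2 in the oops.
 */
uint64_t fault_address(uint64_t base)
{
	return base + offsetof(struct demo_klist, put);
}
```

Calling fault_address(0) gives 0x20, which is how a faulting address this small usually points at a member access through a NULL pointer rather than at a wild pointer.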