[ofa-general] madeye kernel oops

Hal Rosenstock halr at voltaire.com
Wed Mar 28 12:15:57 PDT 2007


On Wed, 2007-03-28 at 02:15, Ami Perlmutter wrote:
> On Tue, 2007-03-27 at 12:03 -0700, Sean Hefty wrote:
> > How easily can you reproduce this?  I'm assuming that this is with OFED 1.2 on
> > 2.6.20, correct?
> yes
> > Can you describe what you were doing when this crash occurred?
> opensm was running on the other computer
> running SDP programs

So the node which oops'd was running only madeye and some SDP data
transfer?

Can you be more specific about the failure scenario? What was happening
on the node which failed? It looks like you were removing madeye. Was
this the first time? Was anything else going on?

Thanks.

-- Hal

> > Thanks,
> > Sean
> > 
> > >Unable to handle kernel NULL pointer dereference at 0000000000000038
> > >RIP:
> > > [<ffffffff8801021f>] :ib_mad:ib_unregister_mad_agent+0x11/0x480
> > >PGD 73387067 PUD 72844067 PMD 0
> > >Oops: 0000 [1] SMP
> > >CPU 0
> > >Modules linked in: ib_madeye i2c_dev i2c_core ib_sdp rdma_cm iw_cm
> > >ib_addr ib_local_sa ib_uverbs ib_umad ib_mthca ib_ipoib ib_cm ib_sa
> > >ib_mad ib_core
> > >Pid: 8917, comm: rmmod Not tainted 2.6.20 #1
> > >RIP: 0010:[<ffffffff8801021f>]
> > >[<ffffffff8801021f>] :ib_mad:ib_unregister_mad_agent+0x11/0x480
> > >RSP: 0000:ffff810071ee1e08  EFLAGS: 00010292
> > >RAX: 0000000000000000 RBX: 0000000000000020 RCX: 000000000000003f
> > >RDX: ffff810077ebd6c0 RSI: 0000000000000202 RDI: 0000000000000000
> > >RBP: 0000000000000000 R08: ffff810077ebd728 R09: 0000000000000003
> > >R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100766c33c0
> > >R13: 0000000000000002 R14: 0000000000000880 R15: 0000000000503010
> > >FS:  00002b3d6689fb00(0000) GS:ffffffff80702000(0000)
> > >knlGS:0000000000000000
> > >CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > >CR2: 0000000000000038 CR3: 0000000071086000 CR4: 00000000000006e0
> > >Process rmmod (pid: 8917, threadinfo ffff810071ee0000, task
> > >ffff8100781aeee0)
> > >Stack:  ffff810071ee1e18 ffffffff8022b92f ffff810071ee1e28
> > >ffffffff80538b43
> > > ffff810071ee1ea8 ffffffff80538ea2 ffffffff80690880 ffff810071ee1e78
> > > 000000000000000f 0000000000000020 0000000000000002 ffff8100766c33c0
> > >Call Trace:
> > > [<ffffffff8022b92f>] __cond_resched+0x1c/0x44
> > > [<ffffffff80538b43>] cond_resched+0x2e/0x39
> > > [<ffffffff80538ea2>] wait_for_completion+0x1a/0xd0
> > > [<ffffffff88093cf2>] :ib_madeye:madeye_remove_one+0x56/0x88
> > > [<ffffffff880041aa>] :ib_core:ib_unregister_client+0x40/0xe2
> > > [<ffffffff8024ae86>] sys_delete_module+0x1b4/0x1e5
> > > [<ffffffff80340065>] add_uevent_var+0x40/0xe3
> > > [<ffffffff8026613f>] sys_munmap+0x4b/0x58
> > > [<ffffffff8020959e>] system_call+0x7e/0x83
> > >
> > >
> > >Code: 83 7f 38 00 0f 84 fd 03 00 00 48 8d 44 24 20 4c 8d 67 f0 48
> > >RIP  [<ffffffff8801021f>] :ib_mad:ib_unregister_mad_agent+0x11/0x480
> > > RSP <ffff810071ee1e08>
> > >CR2: 0000000000000038
> 
