[ewg] Kernel Panic on OFED-1.5.1-RC2 on Intel iWarp card

Tung, Chien Tin chien.tin.tung at intel.com
Mon Mar 1 11:20:44 PST 2010


>I was running OFED-1.5.1-rc2 on my Intel iWarp cards using
>Intel MPI running IMB in a loop and I saw the following panic.
>
>
>Mar  1 10:53:03 det-17-eth0 kernel: page:ffff8101038f0370 flags:0x0100100000000074
>mapping:0000000000000000 mapcount:1 count:0 (Tainted: G     )
>Mar  1 10:53:03 det-17-eth0 kernel: Trying to fix it up, but a reboot is needed
>Mar  1 10:53:03 det-17-eth0 kernel: Backtrace:
>Mar  1 10:53:03 det-17-eth0 kernel:
>Mar  1 10:53:03 det-17-eth0 kernel: Call Trace:
>Mar  1 10:53:03 det-17-eth0 kernel:  <IRQ>  [<ffffffff800c4570>] bad_page+0x69/0x91
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8000b519>] free_hot_cold_page+0x80/0x11b
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8822166a>] :iw_nes:nes_cqp_rem_ref_callback+0xc9/0x11f
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff88225d22>] :iw_nes:nes_cqp_ce_handler+0xcd/0x237
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8822171c>] :iw_nes:nes_process_ceq+0x5c/0x74
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8009aaf2>] queue_work+0x4e/0x57
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff88224933>] :iw_nes:nes_dpc+0x117/0x1439
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8006c962>] do_IRQ+0xec/0xf5
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8005d615>] ret_from_intr+0x0/0xa
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff800923c3>] tasklet_action+0x89/0xfd
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff80011fbc>] __do_softirq+0x89/0x133
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8006cada>] do_softirq+0x2c/0x85
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8006c962>] do_IRQ+0xec/0xf5
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8005d615>] ret_from_intr+0x0/0xa
>Mar  1 10:53:03 det-17-eth0 kernel:  <EOI>  [<ffffffff8000b5cc>] fget_light+0x18/0x7c
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff80016e47>] sys_write+0x21/0x6e
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8006152f>] sysenter_tracesys+0x48/0x9d
>Mar  1 10:53:03 det-17-eth0 kernel:  [<ffffffff8006149b>] sysenter_do_call+0x1b/0x67
>Mar  1 10:53:03 det-17-eth0 kernel:
>Mar  1 10:54:55 det-17-eth0 syslogd 1.4.1: restart.
>Mar

Thanks, Woody.  How many nodes and processes does this test use?  Also, what
is the firmware version on your cards (grep "iw_nes: Firmware" from syslog)?
Faisal will take a look at the problem.
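
If grep is not handy, the same syslog check can be sketched in Python. This is only a rough sketch: the log path /var/log/messages is an assumption (it varies by distro), and the helper names find_firmware_lines and scan_syslog are made up for illustration. The text that follows the "iw_nes: Firmware" marker is not shown in this thread, so matching lines are returned verbatim rather than parsed.

```python
# Marker string taken from the request above; everything else is illustrative.
MARKER = "iw_nes: Firmware"


def find_firmware_lines(lines):
    """Return every line containing the marker, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


def scan_syslog(path="/var/log/messages"):
    """Scan a syslog file for firmware lines.

    Assumption: the default path is a common syslog location, not a
    guaranteed one; pass the right path for your system.
    """
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(path, errors="replace") as log:
        return find_firmware_lines(log)
```

Running `print("\n".join(scan_syslog()))` as a user who can read the log should list the matching lines, one per driver load.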

Chien




