[ofa-general] kernel panic (sporadically) OFED-1.2

Frank Mietke frank.mietke at informatik.tu-chemnitz.de
Fri Mar 14 09:08:42 PDT 2008


Hi,

Has anybody seen the kernel panic below? Any hints?

We're using a RHEL-4 clone with a special Lustre kernel
(2.6.9-55.0.9.EL_lustre.1.6.4.2smp) and OFED-1.2.

ib0: queue stopped 1, tx_head 102359453, tx_tail 102359399
ib0: transmit timeout: latency 1010 msecs
ib0: queue stopped 1, tx_head 102359503, tx_tail 102359453
ib0: transmit timeout: latency 1140 msecs
ib0: queue stopped 1, tx_head 102359542, tx_tail 102359497
ib0: transmit timeout: latency 1350 msecs
ib0: queue stopped 1, tx_head 102359624, tx_tail 102359561
general protection fault: 0000 [1] SMP
CPU 3
Modules linked in: osc(U) mgc(U) lustre(U) lov(U) lquota(U) mdc(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ib_mthca(U) tg3(U) sata_svw(U)
ib_sdp(U) ib_srp(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_local_sa(U) ib_addr(U)
ib_ipoib(U) ipv6(U) ib_umad(U) ib_ucm(U) ib_cm(U) ib_sa(U) ib_mad(U)
ib_uverbs(U) ib_core(U) sd_mod(U) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U)
ext3(U) jbd(U) nfs(U) lockd(U) nfs_acl(U) sunrpc(U) ohci_hcd(U) ehci_hcd(U)
Pid: 843, comm: ib_mad1 Tainted: GF     2.6.9-55.0.9.EL_lustre.1.6.4.2smp
RIP: 0010:[<ffffffffa01d1c08>]
<ffffffffa01d1c08>{:ib_mthca:mthca_ah_grh_present+4}
RSP: 0018:000001007cf5dcb0  EFLAGS: 00010046
RAX: 0000ffff00000001 RBX: 000001012d971b80 RCX: 000001006e0526c0
RDX: 000000000000002c RSI: 000001012d971a00 RDI: 000001007183a380
RBP: 000001012d971a00 R08: 000001007cf2c600 R09: 000001007cf2c610
R10: 0000000000000004 R11: 0000000000000246 R12: 000001007cf2c600
R13: 000001006e0526c0 R14: 000001007d4ef000 R15: 000001007cf2c610
FS:  0000002a95e266e0(0000) GS:ffffffff804a6880(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000002a9556c000 CR3: 0000000080ca0000 CR4: 00000000000006e0
Process ib_mad1 (pid: 843, threadinfo 000001007cf5c000, task 000001012cc77800)
Stack: ffffffffa01cf91a 000000000000002c 000001007cf2c610 000001007cf2c580
       000001012d8fc5e0 000001006e0526c0 000001012d971a00 000000000000002c
       ffffffffa01d0fd8 000001012d8fc5e0
Call Trace:<ffffffffa01cf91a>{:ib_mthca:build_mlx_header+45}
<ffffffffa01d0fd8>{:ib_mthca:mthca_arbel_post_send+1558}
       <ffffffffa00f99ce>{:ib_mad:ib_mad_send_done_handler+354}
       <ffffffffa00f9f5b>{:ib_mad:ib_mad_completion_handler+1370}
       <ffffffffa00f9a01>{:ib_mad:ib_mad_completion_handler+0}
       <ffffffff80146fca>{worker_thread+419}
<ffffffff80133566>{default_wake_function+0}
       <ffffffff801335b7>{__wake_up_common+67}
<ffffffff80133566>{default_wake_function+0}
       <ffffffff8014ad18>{keventd_create_kthread+0}
<ffffffff80146e27>{worker_thread+0}
       <ffffffff8014ad18>{keventd_create_kthread+0}
<ffffffff8014acef>{kthread+200}
       <ffffffff80110de3>{child_rip+8}
<ffffffff8014ad18>{keventd_create_kthread+0}
       <ffffffff8014ac27>{kthread+0} <ffffffff80110ddb>{child_rip+0}

Code: 0f be 40 05 c1 e8 1f c3 41 54 b8 ea ff ff ff 49 89 fc 55 48
RIP <ffffffffa01d1c08>{:ib_mthca:mthca_ah_grh_present+4} RSP
<000001007cf5dcb0>
 <6>NETDEV WATCHDOG: ib0: transmit timed out
ib0: transmit timeout: latency 1970 msecs
ib0: queue stopped 1, tx_head 102359700, tx_tail 102359664
Kernel panic - not syncing: Oops


Best Regards,
Frank


-- 
Dipl.-Inf. Frank Mietke     |     Fakultätsrechen- und Informationszentrum
Tel.: 0371 - 531 - 35538    |     Fak. für Informatik
Fax:  0371 - 531 8 35538    |     TU-Chemnitz
Key-ID: 60F59599            |     frank.mietke at informatik.tu-chemnitz.de


