[ewg] crash in mthca with ofed-1.4.1

Eli Cohen eli at dev.mellanox.co.il
Thu Mar 5 01:23:26 PST 2009


This bug was introduced by a patch that went into OFED; the patch was removed the next day.

On Wed, Mar 4, 2009 at 7:03 PM, Steve Wise <swise at opengridcomputing.com> wrote:
> I just put on the 20090303-0200 kernel build and hit this on boot up when
> loading mthca.  Is this a new bug?
>
>
> Distro:  RH5.2
> HW: x86_64
> OFA Drivers: ofa_1_4_kernel-20090303-0200
>
>
> ----
>
>
> divide error: 0000 [1] SMP
> last sysfs file: /block/sda/sda1/dev
> CPU 0
> Modules linked in: sg ib_mthca(U) cxgb3(U) i2c_i801 ide_cd ib_mad(U)
> ib_core(U) shpchp e1000e pcspkr i2c_core serio_raw cdrom dm_snapshot dm_zero
> dm_mirror dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd
> ehci_hcd
> Pid: 830, comm: modprobe Tainted: G      2.6.18-92.el5 #1
> RIP: 0010:[<ffffffff8821ba09>]  [<ffffffff8821ba09>]
> :ib_mthca:mthca_QUERY_DEV_LIM+0x140/0x817
> RSP: 0018:ffff81018d7a9998  EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff81018be0c000 RCX: 0000000000000004
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff81018f7bf5b0
> RBP: ffff81018d7a9b38 R08: 0000000000000000 R09: 0000000000000003
> R10: ffff81018d7a99ff R11: 000000000000ea60 R12: ffff81018d7a99ff
> R13: ffff81018d7a9d35 R14: 000000000000ea60 R15: ffff81018f7bf000
> FS:  00002afa157a1240(0000) GS:ffffffff8039e000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000001486e528 CR3: 000000018d777000 CR4: 00000000000006e0
> Process modprobe (pid: 830, threadinfo ffff81018d7a8000, task
> ffff81018d3c6080)
> Stack:  ffff81018f7bf000 ffff81018d80fb80 000000008d7a9d35 ffff81018f7bf000
> ffff81018d7a9b38 ffff81018d7a9b38 ffff81018d7a9d35 ffff81018fa55000
> 0000000000000002 ffffffff882182b7 0000000000000000 ffffffff8821a982
> Call Trace:
> [<ffffffff882182b7>] :ib_mthca:mthca_dev_lim+0x1e/0x36c
> [<ffffffff8821a982>] :ib_mthca:mthca_cmd+0x1b/0x20
> [<ffffffff8821891a>] :ib_mthca:mthca_init_hca+0x315/0xe38
> [<ffffffff80021f44>] __up_read+0x19/0x7f
> [<ffffffff80066852>] do_page_fault+0x4fe/0x830
> [<ffffffff80142f55>] __next_cpu+0x19/0x28
> [<ffffffff800790f1>] physflat_send_IPI_allbutself+0x41/0x46
> [<ffffffff800790f1>] physflat_send_IPI_allbutself+0x41/0x46
> [<ffffffff80075470>] __smp_call_function+0x62/0x8b
> [<ffffffff8007537b>] do_flush_tlb_all+0x0/0x6a
> [<ffffffff800755a6>] smp_call_function+0x32/0x47
> [<ffffffff801026e1>] sysfs_new_dirent+0x63/0x6f
> [<ffffffff80102761>] sysfs_make_dirent+0x1b/0x8f
> [<ffffffff8010208e>] sysfs_add_file+0x76/0x85
> [<ffffffff801ada1a>] device_create_file+0x31/0x39
> [<ffffffff801b22f7>] dma_pool_create+0x113/0x14b
> [<ffffffff88219e1f>] :ib_mthca:__mthca_init_one+0x50a/0x793
> [<ffffffff8821a120>] :ib_mthca:mthca_init_one+0x78/0x8d
> [<ffffffff8014f559>] pci_device_probe+0x100/0x180
> [<ffffffff801af29b>] driver_probe_device+0x52/0xaa
> [<ffffffff801af3ca>] __driver_attach+0x65/0xb6
> [<ffffffff801af365>] __driver_attach+0x0/0xb6
> [<ffffffff801aecdc>] bus_for_each_dev+0x43/0x6e
> [<ffffffff801ae922>] bus_add_driver+0x7e/0x130
> [<ffffffff8014f731>] __pci_register_driver+0x4b/0x6c
> [<ffffffff8818d1bc>] :ib_mthca:mthca_init+0x15b/0x16e
> [<ffffffff800a3dd0>] sys_init_module+0xaf/0x1e8
> [<ffffffff8005d116>] system_call+0x7e/0x83
>
>
> Code: 48 f7 f6 eb 07 c0 ea 04 88 d1 d3 e0 89 45 38 0f b6 4b 21 41
> RIP  [<ffffffff8821ba09>] :ib_mthca:mthca_QUERY_DEV_LIM+0x140/0x817
> RSP <ffff81018d7a9998>
> <0>Kernel panic - not syncing: Fatal exception
>


