[openib-general] [Bug 131] New: working with huge pages may crash the kernel on Suse10
bugzilla-daemon at openib.org
Sun Jun 11 05:55:54 PDT 2006
http://openib.org/bugzilla/show_bug.cgi?id=131
Summary: working with huge pages may crash the kernel on Suse10
Product: OpenFabrics Linux
Version: 1.0rc6
Platform: X86-64
OS/Version: Other
Status: NEW
Severity: normal
Priority: P2
Component: IB Core
AssignedTo: bugzilla at openib.org
ReportedBy: dotanb at mellanox.co.il
*************************************************************
Host Architecture : x86_64
Linux Distribution: SUSE LINUX 10.0 (X86-64) OSS VERSION = 10.0
Kernel Version : 2.6.13-15-smp
Memory size : 5099744 kB
Driver Version : OFED-1.0-rc6-post1
HCA ID(s) : mthca0
HCA model(s) : 25218
FW version(s) : 5.1.915
Board(s) : MT_0200000001
*************************************************************
Working with huge pages may cause a kernel crash on SUSE 10 (kernel
2.6.13-15-smp).
Everything was fine when we used kernels 2.6.9 and 2.6.16.
Here is the backtrace from /var/log/messages:
Jun 9 15:15:03 sw030 kernel: general protection fault: 0000 [1] SMP
Jun 9 15:15:03 sw030 kernel: CPU 1
Jun 9 15:15:03 sw030 kernel: Modules linked in: rdma_ucm ib_sdp rdma_cm
ib_addr ib_cm ib_local_sa findex ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca
ib_mad ib_core memtrack mst_pciconf mst_pci hfsplus vfat fat subfs freq_table
autofs4 edd ipv6 button battery ac af_packet floppy e1000 i2c_i801 i2c_core
generic ide_core ehci_hcd hw_random uhci_hcd usbcore shpchp pci_hotplug
parport_pc lp parport dm_mod ext3 jbd fan thermal processor aic79xx
scsi_transport_spi sg sr_mod cdrom ata_piix libata sd_mod scsi_mod
Jun 9 15:15:03 sw030 kernel: Pid: 1822, comm: mr_test Tainted: G U
2.6.13-15-smp
Jun 9 15:15:03 sw030 kernel: RIP: 0010:[<ffffffff8016cdf2>]
<ffffffff8016cdf2>{set_page_dirty+34}
Jun 9 15:15:03 sw030 kernel: RSP: 0018:ffff81007c5a9e20 EFLAGS: 00010286
Jun 9 15:15:03 sw030 kernel: RAX: 803d9290c7c7485b RBX: 0000000000000001 RCX:
ffff8100016cf000
Jun 9 15:15:03 sw030 kernel: RDX: ffffffff80183550 RSI: ffff8100016cf000 RDI:
ffff8100016cf038
Jun 9 15:15:03 sw030 kernel: RBP: ffff8100016cf038 R08: 0000000000001000 R09:
ffff810051568cd8
Jun 9 15:15:03 sw030 kernel: R10: 000000000000003f R11: ffffffff801dd920 R12:
0000000000000001
Jun 9 15:15:03 sw030 kernel: R13: ffff810064415ca8 R14: ffff81000dc86000 R15:
0000000000000001
Jun 9 15:15:03 sw030 kernel: FS: 00002aaaab21c0a0(0000)
GS:ffffffff8050e880(0000) knlGS:0000000000000000
Jun 9 15:15:03 sw030 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 9 15:15:03 sw030 kernel: CR2: 0000000000603018 CR3: 000000003d2df000 CR4:
00000000000006e0
Jun 9 15:15:03 sw030 kernel: Process mr_test (pid: 1822, threadinfo
ffff81007c5a8000, task ffff81007bc743f0)
Jun 9 15:15:03 sw030 kernel: Stack: ffffffff8016ce49 ffff810072047a00
0000000000000001 ffff81005259f000
Jun 9 15:15:03 sw030 kernel: ffffffff882e5c3a ffff810072047a00
ffff8100585fe000 ffff810064415cd0
Jun 9 15:15:03 sw030 kernel: ffff810064415ca8 ffff810064415c80
Jun 9 15:15:03 sw030 kernel: Call
Trace:<ffffffff8016ce49>{set_page_dirty_lock+41}
<ffffffff882e5c3a>{:ib_uverbs:__ib_umem_release+122}
Jun 9 15:15:03 sw030 kernel:
<ffffffff882e61de>{:ib_uverbs:ib_umem_release+14}
<ffffffff882e1f05>{:ib_uverbs:ib_uverbs_dereg_mr+245}
Jun 9 15:15:03 sw030 kernel: <ffffffff80284bc2>{tty_write+578}
<ffffffff882e026e>{:ib_uverbs:ib_uverbs_write+158}
Jun 9 15:15:03 sw030 kernel: <ffffffff8018c76a>{vfs_write+234}
<ffffffff8018cda3>{sys_write+83}
Jun 9 15:15:03 sw030 kernel: <ffffffff8010ed7e>{system_call+126}
Jun 9 15:15:03 sw030 kernel:
Jun 9 15:15:03 sw030 kernel: Code: 48 8b 40 20 48 85 c0 74 06 49 89 c3 41 ff
e3 e9 4a 17 02 00
Jun 9 15:15:03 sw030 kernel: RIP <ffffffff8016cdf2>{set_page_dirty+34} RSP
<ffff81007c5a9e20>
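
A note on reading the register dump above. On x86-64 a general protection
fault (rather than a page fault) is what the CPU raises when an instruction
dereferences a non-canonical address. The first Code bytes, 48 8b 40 20,
decode to a load through RAX (mov 0x20(%rax),%rax), so, assuming the Code
bytes start at the faulting instruction, the RAX value is the likely
culprit. The sketch below is only an illustration and not part of the
original report: it checks a few register values copied from the oops,
assuming 48-bit virtual addressing. The helper name is_canonical is
hypothetical.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if a 64-bit address is canonical for va_bits-bit
    virtual addressing, i.e. bits 63..(va_bits-1) are all 0 or all 1."""
    top = addr >> (va_bits - 1)                # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


# Register values copied verbatim from the oops above.
registers = {
    "RAX": 0x803D9290C7C7485B,
    "RDI": 0xFFFF8100016CF038,
    "RSP": 0xFFFF81007C5A9E20,
}

for name, value in registers.items():
    print(f"{name} = {value:#018x} canonical={is_canonical(value)}")
```

Under these assumptions RAX is reported as non-canonical while RDI and RSP
are ordinary kernel addresses. That would be consistent with set_page_dirty
following a garbage pointer read from the page it was handed, but it does
not by itself show where the bad value came from.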