[openib-general] [Bug 131] New: working with huge pages may crash the kernel on Suse10

bugzilla-daemon at openib.org
Sun Jun 11 05:55:54 PDT 2006


http://openib.org/bugzilla/show_bug.cgi?id=131

           Summary: working with huge pages may crash the kernel on Suse10
           Product: OpenFabrics Linux
           Version: 1.0rc6
          Platform: X86-64
        OS/Version: Other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: IB Core
        AssignedTo: bugzilla at openib.org
        ReportedBy: dotanb at mellanox.co.il


*************************************************************
Host Architecture : x86_64
Linux Distribution: SUSE LINUX 10.0 (X86-64) OSS VERSION = 10.0
Kernel Version    : 2.6.13-15-smp
Memory size       : 5099744 kB
Driver Version    : OFED-1.0-rc6-post1
HCA ID(s)         : mthca0
HCA model(s)      : 25218
FW version(s)     : 5.1.915
Board(s)          : MT_0200000001
************************************************************* 

Working with huge pages may cause a kernel crash on SUSE 10 (kernel
2.6.13-15-smp).

Everything was fine when we used kernels 2.6.9 and 2.6.16.
Here is the backtrace from /var/log/messages:
Jun  9 15:15:03 sw030 kernel: general protection fault: 0000 [1] SMP
Jun  9 15:15:03 sw030 kernel: CPU 1
Jun  9 15:15:03 sw030 kernel: Modules linked in: rdma_ucm ib_sdp rdma_cm
ib_addr ib_cm ib_local_sa findex ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca
ib_mad ib_core memtrack mst_pciconf mst_pci hfsplus vfat fat subfs freq_table
autofs4 edd ipv6 button battery ac af_packet floppy e1000 i2c_i801 i2c_core
generic ide_core ehci_hcd hw_random uhci_hcd usbcore shpchp pci_hotplug
parport_pc lp parport dm_mod ext3 jbd fan thermal processor aic79xx
scsi_transport_spi sg sr_mod cdrom ata_piix libata sd_mod scsi_mod
Jun  9 15:15:03 sw030 kernel: Pid: 1822, comm: mr_test Tainted: G     U
2.6.13-15-smp
Jun  9 15:15:03 sw030 kernel: RIP: 0010:[<ffffffff8016cdf2>]
<ffffffff8016cdf2>{set_page_dirty+34}
Jun  9 15:15:03 sw030 kernel: RSP: 0018:ffff81007c5a9e20  EFLAGS: 00010286
Jun  9 15:15:03 sw030 kernel: RAX: 803d9290c7c7485b RBX: 0000000000000001 RCX:
ffff8100016cf000
Jun  9 15:15:03 sw030 kernel: RDX: ffffffff80183550 RSI: ffff8100016cf000 RDI:
ffff8100016cf038
Jun  9 15:15:03 sw030 kernel: RBP: ffff8100016cf038 R08: 0000000000001000 R09:
ffff810051568cd8
Jun  9 15:15:03 sw030 kernel: R10: 000000000000003f R11: ffffffff801dd920 R12:
0000000000000001
Jun  9 15:15:03 sw030 kernel: R13: ffff810064415ca8 R14: ffff81000dc86000 R15:
0000000000000001
Jun  9 15:15:03 sw030 kernel: FS:  00002aaaab21c0a0(0000)
GS:ffffffff8050e880(0000) knlGS:0000000000000000
Jun  9 15:15:03 sw030 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun  9 15:15:03 sw030 kernel: CR2: 0000000000603018 CR3: 000000003d2df000 CR4:
00000000000006e0
Jun  9 15:15:03 sw030 kernel: Process mr_test (pid: 1822, threadinfo
ffff81007c5a8000, task ffff81007bc743f0)
Jun  9 15:15:03 sw030 kernel: Stack: ffffffff8016ce49 ffff810072047a00
0000000000000001 ffff81005259f000
Jun  9 15:15:03 sw030 kernel:        ffffffff882e5c3a ffff810072047a00
ffff8100585fe000 ffff810064415cd0
Jun  9 15:15:03 sw030 kernel:        ffff810064415ca8 ffff810064415c80
Jun  9 15:15:03 sw030 kernel: Call
Trace:<ffffffff8016ce49>{set_page_dirty_lock+41}
<ffffffff882e5c3a>{:ib_uverbs:__ib_umem_release+122}
Jun  9 15:15:03 sw030 kernel:       
<ffffffff882e61de>{:ib_uverbs:ib_umem_release+14}
<ffffffff882e1f05>{:ib_uverbs:ib_uverbs_dereg_mr+245}
Jun  9 15:15:03 sw030 kernel:        <ffffffff80284bc2>{tty_write+578}
<ffffffff882e026e>{:ib_uverbs:ib_uverbs_write+158}
Jun  9 15:15:03 sw030 kernel:        <ffffffff8018c76a>{vfs_write+234}
<ffffffff8018cda3>{sys_write+83}
Jun  9 15:15:03 sw030 kernel:        <ffffffff8010ed7e>{system_call+126}
Jun  9 15:15:03 sw030 kernel:
Jun  9 15:15:03 sw030 kernel: Code: 48 8b 40 20 48 85 c0 74 06 49 89 c3 41 ff
e3 e9 4a 17 02 00
Jun  9 15:15:03 sw030 kernel: RIP <ffffffff8016cdf2>{set_page_dirty+34} RSP
<ffff81007c5a9e20>
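
For anyone triaging the trace above, the frames can be pulled out of the wrapped syslog text mechanically. The following is only a minimal sketch, not part of the original report: the helper name `parse_frames` and the regular expression are illustrative assumptions, and offsets are assumed to be printed in decimal, as they appear to be in this 2.6.13 x86_64 oops format.

```python
import re

# Matches frames such as
#   <ffffffff8016ce49>{set_page_dirty_lock+41}
# and module-qualified frames such as
#   <ffffffff882e5c3a>{:ib_uverbs:__ib_umem_release+122}
# (pattern is an assumption based on the trace in this report)
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(text):
    """Return one dict per frame found in text (hypothetical helper)."""
    frames = []
    for addr, module, symbol, offset in FRAME_RE.findall(text):
        frames.append({
            "addr": int(addr, 16),
            # Frames without a ":module:" prefix belong to the core kernel.
            "module": module or "kernel",
            "symbol": symbol,
            # Assumption: the offset after "+" is decimal.
            "offset": int(offset),
        })
    return frames


# Three frames copied from the call trace in this report.
SAMPLE = """Trace:<ffffffff8016ce49>{set_page_dirty_lock+41}
<ffffffff882e5c3a>{:ib_uverbs:__ib_umem_release+122}
<ffffffff882e1f05>{:ib_uverbs:ib_uverbs_dereg_mr+245}"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print("%s:%s+%d" % (frame["module"], frame["symbol"], frame["offset"]))
```

Run on the full log, this lists the path from ib_uverbs_dereg_mr through ib_umem_release down to set_page_dirty, where the fault is reported.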




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


