[ofa-general] ipath crash

Robert Pearson rpearson at systemfabricworks.com
Mon Nov 26 23:17:26 PST 2007


Good to know. The system was RHEL 5 based.

-----Original Message-----
From: Ralph Campbell [mailto:ralph.campbell at qlogic.com] 
Sent: Monday, November 26, 2007 1:35 PM
To: Robert Pearson
Cc: openib-general at openib.org; 'Arthur Jones'
Subject: Re: [ofa-general] ipath crash

2.6.18 has a bug in the vmalloc_user() code which causes this.
The thing to do is use a newer version of the kernel (2.6.20 or later, I think).
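
A quick way to see whether a given machine falls in the suspect range is to
compare its kernel release string against the version mentioned above. This is
only a rough sketch: the 2.6.20 threshold is the guess from this message, not
a confirmed fix point, and the helper names (parse_kernel_release,
is_probably_affected) are made up for illustration. Distribution kernels such
as RHEL's often carry backported fixes, so the version number alone is not
conclusive.

```python
import platform
import re

# Assumption: threshold taken from the "2.6.20+ I think" guess in this thread.
ASSUMED_FIXED_VERSION = (2, 6, 20)


def parse_kernel_release(release):
    """Return the leading numeric part of a release string as a 3-tuple.

    "2.6.18-8.1.15.el5" becomes (2, 6, 18); a missing third field counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_probably_affected(release):
    """True if the release is older than the assumed fixed version."""
    return parse_kernel_release(release) < ASSUMED_FIXED_VERSION


if __name__ == "__main__":
    running = platform.release()
    if is_probably_affected(running):
        print("%s is older than the assumed fix point; consider upgrading" % running)
    else:
        print("%s is at or past the assumed fix point" % running)
```

Run on the machine in question, it reads the running kernel from
platform.release(); the function can also be called directly with the release
string from an oops report such as the one quoted below.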

On Mon, 2007-11-26 at 11:37 -0600, Robert Pearson wrote:
> Here is the right crash
>
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at mm/slab.c:2649
> invalid opcode: 0000 [1] SMP
> last sysfs file: /class/infiniband/ipath0/node_type
> CPU 7
> Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc
> rdma_ucm(U) ib_srp(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U)
> ib_uverbs(U) ib_umad(U) ib_mthca(U) ib_ipoib(U) ib_cm(U) ib_sa(U)
> ib_mad(U) ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack
> nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp
> ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_mod video sbs
> i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac
> parport_pc lp parport sg ib_ipath(U) ide_cd ib_core(U) serio_raw
> cdrom bnx2 shpchp pcspkr mptsas mptscsih mptbase scsi_transport_sas
> sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
> Pid: 8101, comm: fragment Not tainted 2.6.18-8.1.15.el5 #1
> RIP: 0010:[<ffffffff80016ebb>]  [<ffffffff80016ebb>] cache_grow+0x1e/0x395
> RSP: 0018:ffff810010c3dcb8  EFLAGS: 00010006
> RAX: 0000000000000000 RBX: 00000000000080d0 RCX: 00000000ffffffff
> RDX: 0000000000000000 RSI: 00000000000080d0 RDI: ffff810037ff43c0
> RBP: ffff81003ffa06e0 R08: ffff8100020bc280 R09: ffff810037e64400
> R10: ffff810010c3de68 R11: 000000000000555c R12: ffff810037ff43c0
> R13: ffff81003ffa06c0 R14: 0000000000000000 R15: ffff810037ff43c0
> FS:  00002aaaaaad7440(0000) GS:ffff8100020bf340(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00002aaaaaaac000 CR3: 0000000011a7f000 CR4: 00000000000006e0
> Process fragment (pid: 8101, threadinfo ffff810010c3c000, task ffff81002cdd3820)
> Stack:  0000000000000000 0000000000000001 0000000000000296 0000000000000001
>  ffff810010c3dd18 00000000ffffffff ffff81003ffa06e0 ffff8100020bc280
>  ffff81003ffa06c0 000000000000000c ffff810037ff43c0 ffffffff8005a5ce
> Call Trace:
>  [<ffffffff8005a5ce>] cache_alloc_refill+0x136/0x186
>  [<ffffffff800cc5dc>] kmem_cache_alloc_node+0x98/0xb2
>  [<ffffffff800c2ae8>] __vmalloc_area_node+0x62/0x153
>  [<ffffffff800c2e36>] vmalloc_user+0x15/0x50
>  [<ffffffff88180579>] :ib_ipath:ipath_create_cq+0x67/0x1d6
>  [<ffffffff80062126>] __down_write_nested+0x12/0x92
>  [<ffffffff884266cd>] :ib_uverbs:ib_uverbs_create_cq+0x143/0x259
>  [<ffffffff884231ce>] :ib_uverbs:ib_uverbs_write+0x93/0xa9
>  [<ffffffff8011a55d>] selinux_file_permission+0x9f/0xb6
>  [<ffffffff80016122>] vfs_write+0xce/0x174
>  [<ffffffff800169b3>] sys_write+0x45/0x6e
>  [<ffffffff8005b349>] tracesys+0xd1/0xdc
>
> The last one was from an older crash that I picked up by mistake.
>
> Bob
> 


