[ewg] client crash mounting - OFED-3.12 RHEL6.5 kernel
Steve Wise
swise at opengridcomputing.com
Mon Mar 31 12:15:47 PDT 2014
The backport code has this for xprt_rdma_reserve_xprt():
static int
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,1,0))
xprt_rdma_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task)
#else
xprt_rdma_reserve_xprt(struct rpc_task *task)
#endif
{
#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,1,0))
struct rpc_xprt *xprt = task->tk_xprt;
#endif
Yet RHEL6.5, kernel version 2.6.32-431, has the newer signature:
static int
xprt_rdma_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task)
{
struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
So the backport is not correct for the RHEL6.5 kernel, since it assumes that any kernel < 3.1.0
uses the older signature.
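One way the check could be widened is sketched below. This is a user-space model of the decision, not a tested patch against the backport. It assumes the build can see RHEL_RELEASE_CODE and RHEL_RELEASE_VERSION(), which Red Hat kernels define in their version header. The 6.5 threshold comes only from the kernel in this report; earlier RHEL6 releases may also carry the newer signature, so the cutoff would need checking. The function name reserve_xprt_takes_xprt() is made up for illustration.

```c
/* Same encodings the kernel headers use for these version codes. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define RHEL_RELEASE_VERSION(a, b) (((a) << 8) + (b))

/*
 * Hypothetical helper: returns 1 if the transport's reserve_xprt hook
 * should use the newer (xprt, task) signature, 0 for the older (task) one.
 *
 * kernel_code: value of LINUX_VERSION_CODE for the target kernel.
 * rhel_code:   value of RHEL_RELEASE_CODE, or 0 on a non-RHEL kernel.
 *
 * Assumption: RHEL 6.5 is used as the cutoff only because that is the
 * kernel observed here with the newer signature.
 */
static int reserve_xprt_takes_xprt(unsigned int kernel_code,
                                   unsigned int rhel_code)
{
	if (kernel_code >= KERNEL_VERSION(3, 1, 0))
		return 1;	/* upstream kernels with the newer signature */
	if (rhel_code >= RHEL_RELEASE_VERSION(6, 5))
		return 1;	/* RHEL kernel that backported the change */
	return 0;		/* older signature: xprt comes from task->tk_xprt */
}
```

In the backport itself the same condition would be written in the preprocessor, roughly as LINUX_VERSION_CODE >= KERNEL_VERSION(3,1,0) || (defined(RHEL_RELEASE_CODE) && RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(6,5)), and applied to both #if blocks shown above so that the prototype and the task->tk_xprt lookup stay in step.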
> -----Original Message-----
> From: Steve Wise [mailto:swise at opengridcomputing.com]
> Sent: Monday, March 31, 2014 1:41 PM
> To: 'jeffrey.c.becker at nasa.gov'
> Cc: Shirley Ma (shirley.ma at oracle.com); 'Tom Tucker'; Chuck Lever
> (chuck.lever at oracle.com); 'ewg at lists.openfabrics.org'
> Subject: client crash mounting - OFED-3.12 RHEL6.5 kernel
>
> Hey Jeff,
>
> Have either of you ever seen this one? I'm testing the OFED-3.12 backport for RHEL6.5
> and see this crash immediately when mounting over iwarp:
>
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/module/sunrpc/initstate
> CPU 1
> Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl xprtrdma(U) sunrpc tg3
> ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM
> iptable_mangle iptable_filter ip_tables bridge autofs4 8021q garp stp llc rdma_ucm(U)
> rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_uverbs(U) ib_umad(U) ocrdma(U)
> be2net(U) iw_nes(U) libcrc32c iw_cxgb4(U) iw_cxgb3(U) mlx4_en(U) mlx4_ib(U) ib_sa(U)
> mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) vhost_net macvtap macvlan tun kvm uinput
> iTCO_wdt iTCO_vendor_support dcdbas microcode serio_raw sg lpc_ich mfd_core ptp
> pps_core i5100_edac edac_core cxgb4(U) ipv6 cxgb3(U) compat(U) mdio ext4 jbd2 mbcache
> sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix radeon ttm
> drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
> [last unloaded: tg3]
>
> Pid: 3299, comm: mount.nfs Not tainted 2.6.32-431.el6.x86_64 #1 Dell Inc. PowerEdge
> R300/0TY179
> RIP: 0010:[<ffffffffa027f990>] [<ffffffffa027f990>] xprt_rdma_reserve_xprt+0x20/0xa0
> [xprtrdma]
> RSP: 0018:ffff8802142738d8 EFLAGS: 00010286
> RAX: c8aaa8c0514e0002 RBX: ffff880214128080 RCX: ffff88020b80c600
> RDX: 0000000000000000 RSI: ffff880214128080 RDI: ffff880214130000
> RBP: ffff8802142738f8 R08: e008000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880214130000
> R13: ffff8802141304b0 R14: 0000000000000000 R15: ffffffffa071f4a0
> FS: 00007fcd6b836700(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fcd6bbca008 CR3: 0000000214200000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process mount.nfs (pid: 3299, threadinfo ffff880214272000, task ffff88022b6a3540)
> Stack:
> ffff8802142738f8 ffff880214128080 ffff880214130000 ffff8802141304b0
> <d> ffff880214273938 ffffffffa0722b38 0000000000011250 ffff880214128080
> <d> ffff8802141280f0 0000000000000000 0000000000000000 ffffffffa071f4a0
> Call Trace:
> [<ffffffffa0722b38>] xprt_reserve+0x88/0x300 [sunrpc]
> [<ffffffffa071f4a0>] ? call_reserve+0x0/0x60 [sunrpc]
> [<ffffffffa071f4d4>] call_reserve+0x34/0x60 [sunrpc]
> [<ffffffffa072a617>] __rpc_execute+0x77/0x350 [sunrpc]
> [<ffffffff8109b127>] ? bit_waitqueue+0x17/0xd0
> [<ffffffffa072a951>] rpc_execute+0x61/0xa0 [sunrpc]
> [<ffffffffa07213a5>] rpc_run_task+0x75/0x90 [sunrpc]
> [<ffffffffa07214c2>] rpc_call_sync+0x42/0x70 [sunrpc]
> [<ffffffffa0721542>] rpc_ping+0x52/0x70 [sunrpc]
> [<ffffffffa0721eb8>] rpc_create+0x458/0x5b0 [sunrpc]
> [<ffffffff8116ec9b>] ? cache_alloc_refill+0x15b/0x240
> [<ffffffffa07b4cbb>] nfs_create_rpc_client+0xcb/0x110 [nfs]
> [<ffffffffa07b509c>] nfs_init_client+0x4c/0xb0 [nfs]
> [<ffffffffa07b566a>] nfs_get_client+0x4ba/0x590 [nfs]
> [<ffffffff81297c36>] ? __percpu_counter_init+0x56/0x70
> [<ffffffffa072b1d0>] ? __rpc_init_priority_wait_queue+0x80/0xb0 [sunrpc]
> [<ffffffffa07b6a8f>] nfs_create_server+0xcf/0x590 [nfs]
> [<ffffffffa07c3cac>] nfs_get_sb+0x2dc/0x880 [nfs]
> [<ffffffff8118b7fb>] vfs_kern_mount+0x7b/0x1b0
> [<ffffffff8118b9a2>] do_kern_mount+0x52/0x130
> [<ffffffff811ac94b>] do_mount+0x2fb/0x930
> [<ffffffff811ad010>] sys_mount+0x90/0xe0
> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
> Code: c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0
> 4c 89 6d f8 0f 1f 44 00 00 48 8b 47 18 49 89 fc <48> 8b 58 30 48 8b 93 38 07 00 00 44
> 8b ab 34 07 00 00 48 85 d2
> RIP [<ffffffffa027f990>] xprt_rdma_reserve_xprt+0x20/0xa0 [xprtrdma]
> RSP <ffff8802142738d8>
> crash>