[openib-general] nfsrdma server stop responding,

James Lentini jlentini at netapp.com
Tue Dec 12 10:51:23 PST 2006


It appears that one or more of the receive work requests is completing 
in error. The crash occurs when the server attempts to clean up the 
buffer associated with the work request.

I'd like to know why receives are failing. What is the error? Do your 
logs contain the printk on net/sunrpc/svc_rdma_recvfrom.c:522 
"svcrdma: bad WR completion..."? If they do not, you can turn on 
SVCRDMA_DEBUG (echo 4096 > /proc/sys/sunrpc/rpc_debug).
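
Once those messages are in the log, a short script can tally them by 
status. The following is only a sketch under assumptions: it assumes the 
"status=" field is the verbs work completion status (enum ib_wc_status 
in include/rdma/ib_verbs.h), that each message sits on a single line in 
the log, and the function name summarize_bad_completions is made up for 
illustration.

```python
import re

# First entries of enum ib_wc_status from the Linux verbs header
# (include/rdma/ib_verbs.h). ASSUMPTION: the "status=" field printed
# by svcrdma is this work completion status.
IB_WC_STATUS = {
    0: "IB_WC_SUCCESS",
    1: "IB_WC_LOC_LEN_ERR",
    2: "IB_WC_LOC_QP_OP_ERR",
    3: "IB_WC_LOC_EEC_OP_ERR",
    4: "IB_WC_LOC_PROT_ERR",
    5: "IB_WC_WR_FLUSH_ERR",
    6: "IB_WC_MW_BIND_ERR",
}

# Matches the detail line that follows "svcrdma: bad WR completion".
STATUS_RE = re.compile(r"ctxt=[0-9a-f]+.*\bstatus=(\d+)")


def summarize_bad_completions(lines):
    """Count failed completions per status name in syslog lines."""
    counts = {}
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # not a completion detail line
        status = int(match.group(1))
        name = IB_WC_STATUS.get(status, "unknown(%d)" % status)
        counts[name] = counts.get(name, 0) + 1
    return counts


# Two detail lines in the format svcrdma prints, each on one line.
sample = [
    "Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion",
    "Dec 12 01:09:30 ibd202 kernel: \tctxt=ffff81012596b800, count=1 on "
    "xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5",
    "Dec 12 01:09:30 ibd202 kernel: \tctxt=ffff81012596bc00, count=1 on "
    "xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5",
]
print(summarize_bad_completions(sample))  # {'IB_WC_WR_FLUSH_ERR': 2}
```

If the assumption holds, a run of flush errors usually means the QP was 
already in the error state when those receives completed, so the 
interesting entry would be the first completion with a different status, 
if there is one.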

james

On Tue, 12 Dec 2006, Vu Pham wrote:

> James,
>   Another variation of the put_page problem. I have not done any I/O or 
> accessed the mounted directory since last night. This morning I 
> just tried to *ls* the mounted directory and got this error.
> 
> -vu
> 
> > James,
> >   I hit another variation of the put_page problem. I just ran iozone with a
> > 9 GB file size (both client and server machines have 8 GB of memory, dual
> > Woodcrest Xeon CPUs, 2.6.18.5 kernel, nfsrdma release 7).
> > 
> > After this happened, other nfsrdma clients could still do I/O to the server.
> > 
> > -vu
> > 
> > > Hit *send* too soon - here is the objdump of swap.o
> > > 
> > > -vu
> > > 
> > > 
> > > > James Lentini wrote:
> > > > > A couple of questions Vu:
> > > > > 
> > > > > What NFS-RDMA release are you using? This looks like release 7.
> > > > > 
> > > > 
> > > > Yes. I'm using release 7
> > > > 
> > > > > Is this reproducible?
> > > > 
> > > > I ran into it twice - I think it may correlate with an OpenSM restart
> > > > incident. I'll double-check and confirm.
> > > > 
> > > > 
> > > > > What kernel version are you using?
> > > > 
> > > > 2.6.18.5
> > > > 
> > > > > What hardware is this on? It looks like x86-64 to me, which is fine. I
> > > > > just want to be sure I know what I'm looking at. As many specifics as
> > > > > possible is good (number of CPUs, hyperthreading, etc.)
> > > > > 
> > > > 
> > > > Dual woodcrest xeon based CPUs
> > > > 
> > > > > Could you send the output of
> > > > > objdump -Slr /path/to/kernel/mm/swap.o
> > > > > 
> > > > 
> > > > I attached the objdump output here
> > > > 
> > > > > Actually, just the put_page disassembly is all I want to see.
> > > > > 
> > > > > Is there any more text available? Usually there is an explanation
> > > > > given for an oops message (e.g. "Unable to handle kernel paging
> > > > > request..").
> > > > > 
> > > > 
> > > > I did not see any oops text message. The system was still responsive
> > > > to ipoib ping and login.
> > > > 
> > > > 
> > > > > I opened a bug at the NFS-RDMA SourceForge project to track this:
> > > > > 
> > > > > http://sourceforge.net/tracker/index.php?func=detail&aid=1613201&group_id=97628&atid=618583 
> > > > 
> > > > thanks for your help,
> > > > 
> > > > -vu
> > > > 
> > > > > Thanks for reporting this.
> > > > > james
> > > > > 
> > > > > On Fri, 8 Dec 2006, Vu Pham wrote:
> > > > > 
> > > > > > Hi James,
> > > > > > >   I got these errors in the server's /var/log/messages and then the
> > > > > > > server stopped responding to login, I/O...; however, the server is
> > > > > > > still up and ipoib is still working.
> > > > > > 
> > > > > > 
> > > > > > Dec  8 06:38:21 ibd201 kernel: RIP: 0010:[<ffffffff8025dff7>]
> > > > > > [<ffffffff8025dff7>] put_page+0x17/0x40
> > > > > > Dec  8 06:38:21 ibd201 kernel: RSP: 0018:ffff810219ddfb08  EFLAGS:
> > > > > > 00010246
> > > > > > Dec  8 06:38:21 ibd201 kernel: RAX: 0000000000000000 RBX:
> > > > > > 0000000000000001
> > > > > > RCX: 000000000003ffff
> > > > > > Dec  8 06:38:21 ibd201 kernel: RDX: 0000000000000000 RSI:
> > > > > > 0000000000000001
> > > > > > RDI: ffff8102274e92f8
> > > > > > Dec  8 06:38:21 ibd201 kernel: RBP: ffff8101ab785000 R08:
> > > > > > 0000000000000034
> > > > > > R09: 0000000000000000
> > > > > > Dec  8 06:38:21 ibd201 kernel: R10: 0000000000000000 R11:
> > > > > > 0000000000000000
> > > > > > R12: ffff81020ef96800
> > > > > > Dec  8 06:38:21 ibd201 kernel: R13: ffff8101ab785000 R14:
> > > > > > 0000000000000000
> > > > > > R15: ffff8102053ee890
> > > > > > Dec  8 06:38:21 ibd201 kernel: FS:  00002ad76b8acb00(0000)
> > > > > > GS:ffff81022066eb40(0000) knlGS:0000000000000000
> > > > > > Dec  8 06:38:21 ibd201 kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > 000000008005003b
> > > > > > Dec  8 06:38:21 ibd201 kernel: CR2: 00002aaaaabf1000 CR3:
> > > > > > 000000021c22b000
> > > > > > CR4: 00000000000006e0
> > > > > > Dec  8 06:38:21 ibd201 kernel: Process nfsd (pid: 15038, threadinfo
> > > > > > ffff810219dde000, task ffff81020d87f0c0)
> > > > > > Dec  8 06:38:21 ibd201 kernel: Stack:  ffffffff8835e547
> > > > > > ffff81020ef96968
> > > > > > ffff81020ef96800 ffff81020ef96958
> > > > > > Dec  8 06:38:21 ibd201 kernel:  ffffffff88360c72 000000010395dc90
> > > > > > ffffffff80424e05 0000000000000000
> > > > > > Dec  8 06:38:21 ibd201 kernel:  0000000000200200 000000010395dc90
> > > > > > ffffffff80239b90 ffff81020d87f0c0
> > > > > > Dec  8 06:38:21 ibd201 kernel: Call Trace:
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8835e547>]
> > > > > > :sunrpc:svc_rdma_put_context+0x37/0xd0
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff88360c72>]
> > > > > > :sunrpc:svc_rdma_recvfrom+0x5a2/0x11e0
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>]
> > > > > > schedule_timeout+0x95/0xb0
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80239b90>]
> > > > > > process_timeout+0x0/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80423c2d>]
> > > > > > wait_for_completion_timeout+0xcd/0x150
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>]
> > > > > > default_wake_function+0x0/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff881c1402>]
> > > > > > :ib_mthca:mthca_cmd_post+0x232/0x260
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>]
> > > > > > default_wake_function+0x0/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff802fac39>]
> > > > > > __next_cpu+0x19/0x30
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80227dae>]
> > > > > > find_busiest_group+0x24e/0x6d0
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424772>]
> > > > > > thread_return+0x0/0xde
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff804263f8>]
> > > > > > _spin_unlock_irqrestore+0x8/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a331>]
> > > > > > try_to_del_timer_sync+0x51/0x60
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a34c>]
> > > > > > del_timer_sync+0xc/0x20
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>]
> > > > > > schedule_timeout+0x95/0xb0
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883559e6>]
> > > > > > :sunrpc:svc_recv+0x416/0x510
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>]
> > > > > > default_wake_function+0x0/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>]
> > > > > > default_wake_function+0x0/0x10
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>]
> > > > > > :nfsd:nfsd+0x0/0x380
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9651>]
> > > > > > :nfsd:nfsd+0x111/0x380
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab9c>]
> > > > > > child_rip+0xa/0x12
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>]
> > > > > > :nfsd:nfsd+0x0/0x380
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>]
> > > > > > :nfsd:nfsd+0x0/0x380
> > > > > > Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab92>]
> > > > > > child_rip+0x0/0x12
> > > > > > Dec  8 06:38:21 ibd201 kernel:
> > > > > > Dec  8 06:38:21 ibd201 kernel:
> > > > > > Dec  8 06:38:21 ibd201 kernel: Code: 0f 0b 68 8c 41 45 80 c2 2c 01
> > > > > > f0 ff 4f 08
> > > > > > 0f 94 c0 84 c0 74
> > > > > > Dec  8 06:38:21 ibd201 kernel: RIP  [<ffffffff8025dff7>]
> > > > > > put_page+0x17/0x40
> > > > > > Dec  8 06:38:21 ibd201 kernel:  RSP <ffff810219ddfb08>
> > > > > > 
> > > > > > -vu
> > > > > > 
> > > > 
> > > > 
> > 
> > ------------------------------------------------------------------------
> > 
> > <snip>
> > 
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81012596b800, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81012596bc00, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17000, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17400, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17800, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17c00, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de000, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de400, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de800, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7dec00, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39000, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39400, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39800, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39c00, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf000, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf400, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> > Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf400, count=1 on
> > xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> > Dec 12 01:09:30 ibd202 kernel: ----------- [cut here ] --------- [please
> > bite here ] ---------
> > Dec 12 01:09:30 ibd202 kernel: Kernel BUG at include/linux/mm.h:300
> > Dec 12 01:09:30 ibd202 kernel: invalid opcode: 0000 [1] SMP
> > Dec 12 01:09:30 ibd202 kernel: CPU 1
> > Dec 12 01:09:30 ibd202 kernel: Modules linked in: nfsd exportfs lockd
> > nfs_acl ipv6 autofs4 sunrpc rdma_cm ib_addr dm_mirror dm_mod button battery
> > asus_acpi ac uhci_hcd ehci_hcd i2c_i801 i2c_core ib_mthca shpchp ib_ipoib
> > ib_umad ib_ucm ib_uverbs ib_cm ib_sa ib_mad ib_core e1000 floppy ext3 jbd
> > megaraid_sas sd_mod scsi_mod
> > Dec 12 01:09:30 ibd202 kernel: Pid: 4343, comm: nfsd Not tainted 2.6.18.5 #1
> > Dec 12 01:09:30 ibd202 kernel: RIP: 0010:[<ffffffff8025892b>]
> > [<ffffffff8025892b>] put_page+0x13/0x2e
> > Dec 12 01:09:30 ibd202 kernel: RSP: 0018:ffff81023fd11b08  EFLAGS: 00010246
> > Dec 12 01:09:30 ibd202 kernel: RAX: 0000000000000000 RBX: 0000000000000001
> > RCX: 0000000000006a53
> > Dec 12 01:09:30 ibd202 kernel: RDX: 00000000ffffff01 RSI: 0000000000000001
> > RDI: ffff81024fc3dec0
> > Dec 12 01:09:30 ibd202 kernel: RBP: ffff81023e4cf400 R08: 0000000000000001
> > R09: 0000000000000000
> > Dec 12 01:09:30 ibd202 kernel: R10: 0000000000000000 R11: ffffffff88185ac8
> > R12: ffff810240fb3800
> > Dec 12 01:09:30 ibd202 kernel: R13: ffff810240fb3800 R14: ffff81023d045400
> > R15: 00000000000dbba0
> > Dec 12 01:09:30 ibd202 kernel: FS:  00002ad030296b00(0000)
> > GS:ffff81024688eac0(0000) knlGS:0000000000000000
> > Dec 12 01:09:30 ibd202 kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
> > 000000008005003b
> > Dec 12 01:09:30 ibd202 kernel: CR2: 00002b70add7aad8 CR3: 000000023ebd3000
> > CR4: 00000000000006e0
> > Dec 12 01:09:30 ibd202 kernel: Process nfsd (pid: 4343, threadinfo
> > ffff81023fd10000, task ffff810246562840)
> > Dec 12 01:09:30 ibd202 kernel: Stack:  ffffffff8817b2fb ffff810240fb39b8
> > 0000000000000000 ffff81024172c5b0
> > Dec 12 01:09:30 ibd202 kernel:  ffffffff8817ec67 ffff81023cda7000
> > ffffffff8817b2a8 0000000000000000
> > Dec 12 01:09:30 ibd202 kernel:  ffff81023fd11ca0 ffff81023fd11b80
> > 0000000000000001 ffff81023cda7000
> > Dec 12 01:09:30 ibd202 kernel: Call Trace:
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817b2fb>]
> > :sunrpc:svc_rdma_put_context+0x37/0xb5
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817ec67>]
> > :sunrpc:svc_rdma_recvfrom+0x58f/0x1150
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817b2a8>]
> > :sunrpc:svc_rdma_get_context+0x10c/0x128
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817d5b8>]
> > :sunrpc:send_write+0x200/0x22c
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80254954>]
> > generic_file_readv+0x8e/0xa7
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8025ba92>]
> > zone_statistics+0x40/0x70
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80224401>]
> > find_busiest_group+0x21f/0x66f
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8042a2e9>]
> > _spin_unlock_irq+0x6/0xa
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff804285a3>] thread_return+0x64/0xec
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8042a259>]
> > _spin_lock_irqsave+0x9/0xe
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233574>]
> > lock_timer_base+0x1b/0x3c
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233776>]
> > try_to_del_timer_sync+0x4a/0x51
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233789>] del_timer_sync+0xc/0x16
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80428f6a>]
> > schedule_timeout+0x92/0xad
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff88174070>]
> > :sunrpc:svc_recv+0x3c5/0x4be
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80225264>]
> > default_wake_function+0x0/0xe
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80225264>]
> > default_wake_function+0x0/0xe
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff88204407>] :nfsd:nfsd+0x10d/0x359
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8020a4ac>] child_rip+0xa/0x12
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> > Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8020a4a2>] child_rip+0x0/0x12
> > Dec 12 01:09:30 ibd202 kernel:
> > Dec 12 01:09:30 ibd202 kernel:
> > Dec 12 01:09:30 ibd202 kernel: Code: 0f 0b 68 16 4d 45 80 c2 2c 01 f0 ff 4f
> > 08 0f 94 c0 84 c0 74
> > Dec 12 01:09:30 ibd202 kernel: RIP  [<ffffffff8025892b>] put_page+0x13/0x2e
> > Dec 12 01:09:30 ibd202 kernel:  RSP <ffff81023fd11b08>
> > Dec 12 01:09:30 ibd202 kernel:  <4>nfsd: terminating on error 22
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> 
> 



