[openib-general] nfsrdma server stop responding,

Vu Pham vuhuong at mellanox.com
Tue Dec 12 09:46:47 PST 2006


James,
   Here is another variation of the put_page problem. I had not done 
any I/O on, or otherwise accessed, the mounted directory since last 
night. This morning I just ran *ls* on the mounted directory and got 
this error.

-vu
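
The repeated "svcrdma: bad WR completion" entries in the quoted log all carry status=5. A minimal sketch for tallying them per transport follows. It assumes the printed status is a raw enum ib_wc_status value from the kernel's rdma/ib_verbs.h, in which 5 is IB_WC_WR_FLUSH_ERR (work requests flushed because the queue pair went into the error state). The regular expression and the helper names summarize and status_name are illustrative and are not part of svcrdma.

```python
import re
from collections import Counter

# First values of enum ib_wc_status (rdma/ib_verbs.h).
# Assumption: svcrdma logs the raw completion status in "status=".
IB_WC_STATUS = {
    0: "IB_WC_SUCCESS",
    1: "IB_WC_LOC_LEN_ERR",
    2: "IB_WC_LOC_QP_OP_ERR",
    3: "IB_WC_LOC_EEC_OP_ERR",
    4: "IB_WC_LOC_PROT_ERR",
    5: "IB_WC_WR_FLUSH_ERR",
}

# Matches the detail line that follows each "bad WR completion" message.
LINE_RE = re.compile(
    r"ctxt=([0-9a-f]+), count=(\d+) on xprt=([0-9a-f]+), "
    r"rqstp=([0-9a-f]+), status=(\d+)"
)


def status_name(status):
    """Return the symbolic name for a status code, if it is in the table."""
    return IB_WC_STATUS.get(status, "unknown(%d)" % status)


def summarize(lines):
    """Count failed completions per (xprt, status) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            counts[(m.group(3), int(m.group(5)))] += 1
    return counts


# Two detail lines copied from the quoted log.
sample = [
    "Dec 12 01:09:30 ibd202 kernel: \tctxt=ffff81012596b800, count=1 on "
    "xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5",
    "Dec 12 01:09:30 ibd202 kernel: \tctxt=ffff81012596bc00, count=1 on "
    "xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5",
]

for (xprt, status), n in summarize(sample).items():
    print("xprt=%s %s x%d" % (xprt, status_name(status), n))
```

If the assumption holds, every failed completion in the log belongs to one transport (xprt=ffff810240fb3800) and was flushed, which would fit a connection that had already gone down.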

> James,
>   I hit another variation of the put_page problem. I just ran iozone with a 
> 9 GB file size (both client and server machines have 8 GB of memory, dual 
> Woodcrest Xeon CPUs, 2.6.18.5 kernel, nfsrdma release 7).
> 
> After this happened, other nfsrdma clients could still do I/O to the server.
> 
> -vu
> 
>> Hit *send* too soon - here is the objdump of swap.o
>>
>> -vu
>>
>>
>>> James Lentini wrote:
>>>> A couple of questions Vu:
>>>>
>>>> What NFS-RDMA release are you using? This looks like release 7.
>>>>
>>>
>>> Yes. I'm using release 7
>>>
>>>> Is this reproducible?
>>>
>>> I ran into it twice. I think it may correlate with an OpenSM restart 
>>> incident. I'll double-check and confirm.
>>>
>>>
>>>> What kernel version are you using?
>>>
>>> 2.6.18.5
>>>
>>>> What hardware is this on? It looks like x86-64 to me, which is fine. 
>>>> I just want to be sure I know what I'm looking at. As many specifics 
>>>> as possible is good (number of CPUs, hyperthreading, etc.)
>>>>
>>>
>>> Dual woodcrest xeon based CPUs
>>>
>>>> Could you send the output of
>>>> objdump -Slr /path/to/kernel/mm/swap.o
>>>>
>>>
>>> I attached the objdump output here
>>>
>>>> Actually, just the put_page disassembly is all I want to see.
>>>>
>>>> Is there any more text available? Usually there is an explanation 
>>>> given for an oops message (e.g. "Unable to handle kernel paging 
>>>> request..").
>>>>
>>>
>>> I did not see any oops message text. The system was still responsive 
>>> to ipoib ping and login.
>>>
>>>
>>>> I opened a bug at the NFS-RDMA SourceForge project to track this:
>>>>
>>>> http://sourceforge.net/tracker/index.php?func=detail&aid=1613201&group_id=97628&atid=618583 
>>>>
>>>
>>> thanks for your help,
>>>
>>> -vu
>>>
>>>> Thanks for reporting this.
>>>> james
>>>>
>>>> On Fri, 8 Dec 2006, Vu Pham wrote:
>>>>
>>>>> Hi James,
>>>>>   I got these errors in the server's /var/log/messages, and then the 
>>>>> server stopped responding to login, I/O, etc.; however, the server is 
>>>>> still up and ipoib is still working.
>>>>>
>>>>>
>>>>> Dec  8 06:38:21 ibd201 kernel: RIP: 0010:[<ffffffff8025dff7>] [<ffffffff8025dff7>] put_page+0x17/0x40
>>>>> Dec  8 06:38:21 ibd201 kernel: RSP: 0018:ffff810219ddfb08  EFLAGS: 00010246
>>>>> Dec  8 06:38:21 ibd201 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000003ffff
>>>>> Dec  8 06:38:21 ibd201 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8102274e92f8
>>>>> Dec  8 06:38:21 ibd201 kernel: RBP: ffff8101ab785000 R08: 0000000000000034 R09: 0000000000000000
>>>>> Dec  8 06:38:21 ibd201 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff81020ef96800
>>>>> Dec  8 06:38:21 ibd201 kernel: R13: ffff8101ab785000 R14: 0000000000000000 R15: ffff8102053ee890
>>>>> Dec  8 06:38:21 ibd201 kernel: FS:  00002ad76b8acb00(0000) GS:ffff81022066eb40(0000) knlGS:0000000000000000
>>>>> Dec  8 06:38:21 ibd201 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>> Dec  8 06:38:21 ibd201 kernel: CR2: 00002aaaaabf1000 CR3: 000000021c22b000 CR4: 00000000000006e0
>>>>> Dec  8 06:38:21 ibd201 kernel: Process nfsd (pid: 15038, threadinfo ffff810219dde000, task ffff81020d87f0c0)
>>>>> Dec  8 06:38:21 ibd201 kernel: Stack:  ffffffff8835e547 ffff81020ef96968 ffff81020ef96800 ffff81020ef96958
>>>>> Dec  8 06:38:21 ibd201 kernel:  ffffffff88360c72 000000010395dc90 ffffffff80424e05 0000000000000000
>>>>> Dec  8 06:38:21 ibd201 kernel:  0000000000200200 000000010395dc90 ffffffff80239b90 ffff81020d87f0c0
>>>>> Dec  8 06:38:21 ibd201 kernel: Call Trace:
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8835e547>] :sunrpc:svc_rdma_put_context+0x37/0xd0
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff88360c72>] :sunrpc:svc_rdma_recvfrom+0x5a2/0x11e0
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>] schedule_timeout+0x95/0xb0
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80239b90>] process_timeout+0x0/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80423c2d>] wait_for_completion_timeout+0xcd/0x150
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff881c1402>] :ib_mthca:mthca_cmd_post+0x232/0x260
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff802fac39>] __next_cpu+0x19/0x30
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80227dae>] find_busiest_group+0x24e/0x6d0
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424772>] thread_return+0x0/0xde
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff804263f8>] _spin_unlock_irqrestore+0x8/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a331>] try_to_del_timer_sync+0x51/0x60
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a34c>] del_timer_sync+0xc/0x20
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>] schedule_timeout+0x95/0xb0
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883559e6>] :sunrpc:svc_recv+0x416/0x510
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9651>] :nfsd:nfsd+0x111/0x380
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab9c>] child_rip+0xa/0x12
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
>>>>> Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab92>] child_rip+0x0/0x12
>>>>> Dec  8 06:38:21 ibd201 kernel:
>>>>> Dec  8 06:38:21 ibd201 kernel:
>>>>> Dec  8 06:38:21 ibd201 kernel: Code: 0f 0b 68 8c 41 45 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74
>>>>> Dec  8 06:38:21 ibd201 kernel: RIP  [<ffffffff8025dff7>] put_page+0x17/0x40
>>>>> Dec  8 06:38:21 ibd201 kernel:  RSP <ffff810219ddfb08>
>>>>>
>>>>> -vu
>>>>>
>>>
>>>
> 
> ------------------------------------------------------------------------
> 
> <snip>
> 
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81012596b800, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81012596bc00, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17000, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17400, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17800, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff810144c17c00, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de000, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de400, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7de800, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e7dec00, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39000, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39400, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39800, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023dd39c00, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf000, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf400, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: svcrdma: bad WR completion
> Dec 12 01:09:30 ibd202 kernel: 	ctxt=ffff81023e4cf400, count=1 on xprt=ffff810240fb3800, rqstp=ffff81023d045400, status=5
> Dec 12 01:09:30 ibd202 kernel: ----------- [cut here ] --------- [please bite here ] ---------
> Dec 12 01:09:30 ibd202 kernel: Kernel BUG at include/linux/mm.h:300
> Dec 12 01:09:30 ibd202 kernel: invalid opcode: 0000 [1] SMP 
> Dec 12 01:09:30 ibd202 kernel: CPU 1 
> Dec 12 01:09:30 ibd202 kernel: Modules linked in: nfsd exportfs lockd nfs_acl ipv6 autofs4 sunrpc rdma_cm ib_addr dm_mirror dm_mod button battery asus_acpi ac uhci_hcd ehci_hcd i2c_i801 i2c_core ib_mthca shpchp ib_ipoib ib_umad ib_ucm ib_uverbs ib_cm ib_sa ib_mad ib_core e1000 floppy ext3 jbd megaraid_sas sd_mod scsi_mod
> Dec 12 01:09:30 ibd202 kernel: Pid: 4343, comm: nfsd Not tainted 2.6.18.5 #1
> Dec 12 01:09:30 ibd202 kernel: RIP: 0010:[<ffffffff8025892b>]  [<ffffffff8025892b>] put_page+0x13/0x2e
> Dec 12 01:09:30 ibd202 kernel: RSP: 0018:ffff81023fd11b08  EFLAGS: 00010246
> Dec 12 01:09:30 ibd202 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000006a53
> Dec 12 01:09:30 ibd202 kernel: RDX: 00000000ffffff01 RSI: 0000000000000001 RDI: ffff81024fc3dec0
> Dec 12 01:09:30 ibd202 kernel: RBP: ffff81023e4cf400 R08: 0000000000000001 R09: 0000000000000000
> Dec 12 01:09:30 ibd202 kernel: R10: 0000000000000000 R11: ffffffff88185ac8 R12: ffff810240fb3800
> Dec 12 01:09:30 ibd202 kernel: R13: ffff810240fb3800 R14: ffff81023d045400 R15: 00000000000dbba0
> Dec 12 01:09:30 ibd202 kernel: FS:  00002ad030296b00(0000) GS:ffff81024688eac0(0000) knlGS:0000000000000000
> Dec 12 01:09:30 ibd202 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Dec 12 01:09:30 ibd202 kernel: CR2: 00002b70add7aad8 CR3: 000000023ebd3000 CR4: 00000000000006e0
> Dec 12 01:09:30 ibd202 kernel: Process nfsd (pid: 4343, threadinfo ffff81023fd10000, task ffff810246562840)
> Dec 12 01:09:30 ibd202 kernel: Stack:  ffffffff8817b2fb ffff810240fb39b8 0000000000000000 ffff81024172c5b0
> Dec 12 01:09:30 ibd202 kernel:  ffffffff8817ec67 ffff81023cda7000 ffffffff8817b2a8 0000000000000000
> Dec 12 01:09:30 ibd202 kernel:  ffff81023fd11ca0 ffff81023fd11b80 0000000000000001 ffff81023cda7000
> Dec 12 01:09:30 ibd202 kernel: Call Trace:
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817b2fb>] :sunrpc:svc_rdma_put_context+0x37/0xb5
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817ec67>] :sunrpc:svc_rdma_recvfrom+0x58f/0x1150
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817b2a8>] :sunrpc:svc_rdma_get_context+0x10c/0x128
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8817d5b8>] :sunrpc:send_write+0x200/0x22c
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80254954>] generic_file_readv+0x8e/0xa7
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8025ba92>] zone_statistics+0x40/0x70
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80224401>] find_busiest_group+0x21f/0x66f
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8042a2e9>] _spin_unlock_irq+0x6/0xa
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff804285a3>] thread_return+0x64/0xec
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8042a259>] _spin_lock_irqsave+0x9/0xe
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233574>] lock_timer_base+0x1b/0x3c
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233776>] try_to_del_timer_sync+0x4a/0x51
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80233789>] del_timer_sync+0xc/0x16
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80428f6a>] schedule_timeout+0x92/0xad
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff88174070>] :sunrpc:svc_recv+0x3c5/0x4be
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80225264>] default_wake_function+0x0/0xe
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff80225264>] default_wake_function+0x0/0xe
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff88204407>] :nfsd:nfsd+0x10d/0x359
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8020a4ac>] child_rip+0xa/0x12
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff882042fa>] :nfsd:nfsd+0x0/0x359
> Dec 12 01:09:30 ibd202 kernel:  [<ffffffff8020a4a2>] child_rip+0x0/0x12
> Dec 12 01:09:30 ibd202 kernel: 
> Dec 12 01:09:30 ibd202 kernel: 
> Dec 12 01:09:30 ibd202 kernel: Code: 0f 0b 68 16 4d 45 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74 
> Dec 12 01:09:30 ibd202 kernel: RIP  [<ffffffff8025892b>] put_page+0x13/0x2e
> Dec 12 01:09:30 ibd202 kernel:  RSP <ffff81023fd11b08>
> Dec 12 01:09:30 ibd202 kernel:  <4>nfsd: terminating on error 22
> 
> 
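
The "Code:" bytes in the oops above can be cross-checked against the reported "Kernel BUG at include/linux/mm.h:300". A minimal sketch follows. It assumes the 2.6.18-era x86-64 BUG() encoding, where the trapping ud2 instruction (0f 0b) is followed by a never-executed "pushq $file; ret $line" pair that only carries the file-name pointer and line number. The function name decode_bug_trap is illustrative.

```python
import struct


def decode_bug_trap(code_line):
    """Decode 'ud2; pushq $imm32; ret $imm16' from an oops "Code:" line.

    Returns (file_string_pointer, line_number), or None when the bytes
    do not match the assumed BUG() layout.
    """
    hex_part = code_line.split("Code:", 1)[1]
    # Some kernels mark the faulting byte as <xx>; strip the brackets.
    data = bytes(int(tok.strip("<>"), 16) for tok in hex_part.split())
    if len(data) < 10:
        return None
    if data[0:2] != b"\x0f\x0b":      # ud2
        return None
    if data[2] != 0x68 or data[7] != 0xC2:  # pushq imm32, ret imm16
        return None
    (imm32,) = struct.unpack_from("<i", data, 3)  # signed 32-bit immediate
    (line,) = struct.unpack_from("<H", data, 8)   # unsigned 16-bit immediate
    file_ptr = imm32 & 0xFFFFFFFFFFFFFFFF         # sign-extend to 64 bits
    return file_ptr, line


# "Code:" line copied from the Dec 12 oops on ibd202.
code = "Code: 0f 0b 68 16 4d 45 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74"
file_ptr, line = decode_bug_trap(code)
print("file string at %#x, line %d" % (file_ptr, line))
```

Under that assumption the ret immediate 2c 01 is 0x012c, that is line 300, which agrees with the BUG message. The bytes that follow (f0 ff 4f 08) look like a locked decrement of the 32-bit field at offset 8 from RDI, which would be consistent with a page reference count being dropped in put_page.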

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: messages.202.1
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20061212/a4e6e983/attachment.ksh>
