[openib-general] nfsrdma server stops responding

Vu Pham vu at mellanox.com
Fri Dec 8 12:10:43 PST 2006


Hi James,
   I got these errors in the server's /var/log/messages, and then the
server stopped responding to logins and I/O. However, the server is
still up and IPoIB is still working.


Dec  8 06:38:21 ibd201 kernel: RIP: 0010:[<ffffffff8025dff7>] [<ffffffff8025dff7>] put_page+0x17/0x40
Dec  8 06:38:21 ibd201 kernel: RSP: 0018:ffff810219ddfb08  EFLAGS: 00010246
Dec  8 06:38:21 ibd201 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000003ffff
Dec  8 06:38:21 ibd201 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8102274e92f8
Dec  8 06:38:21 ibd201 kernel: RBP: ffff8101ab785000 R08: 0000000000000034 R09: 0000000000000000
Dec  8 06:38:21 ibd201 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff81020ef96800
Dec  8 06:38:21 ibd201 kernel: R13: ffff8101ab785000 R14: 0000000000000000 R15: ffff8102053ee890
Dec  8 06:38:21 ibd201 kernel: FS:  00002ad76b8acb00(0000) GS:ffff81022066eb40(0000) knlGS:0000000000000000
Dec  8 06:38:21 ibd201 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec  8 06:38:21 ibd201 kernel: CR2: 00002aaaaabf1000 CR3: 000000021c22b000 CR4: 00000000000006e0
Dec  8 06:38:21 ibd201 kernel: Process nfsd (pid: 15038, threadinfo ffff810219dde000, task ffff81020d87f0c0)
Dec  8 06:38:21 ibd201 kernel: Stack:  ffffffff8835e547 ffff81020ef96968 ffff81020ef96800 ffff81020ef96958
Dec  8 06:38:21 ibd201 kernel:  ffffffff88360c72 000000010395dc90 ffffffff80424e05 0000000000000000
Dec  8 06:38:21 ibd201 kernel:  0000000000200200 000000010395dc90 ffffffff80239b90 ffff81020d87f0c0
Dec  8 06:38:21 ibd201 kernel: Call Trace:
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8835e547>] :sunrpc:svc_rdma_put_context+0x37/0xd0
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff88360c72>] :sunrpc:svc_rdma_recvfrom+0x5a2/0x11e0
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>] schedule_timeout+0x95/0xb0
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80239b90>] process_timeout+0x0/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80423c2d>] wait_for_completion_timeout+0xcd/0x150
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff881c1402>] :ib_mthca:mthca_cmd_post+0x232/0x260
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff802fac39>] __next_cpu+0x19/0x30
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80227dae>] find_busiest_group+0x24e/0x6d0
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424772>] thread_return+0x0/0xde
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff804263f8>] _spin_unlock_irqrestore+0x8/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a331>] try_to_del_timer_sync+0x51/0x60
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8023a34c>] del_timer_sync+0xc/0x20
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80424e05>] schedule_timeout+0x95/0xb0
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883559e6>] :sunrpc:svc_recv+0x416/0x510
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff80228db0>] default_wake_function+0x0/0x10
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9651>] :nfsd:nfsd+0x111/0x380
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab9c>] child_rip+0xa/0x12
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff883a9540>] :nfsd:nfsd+0x0/0x380
Dec  8 06:38:21 ibd201 kernel:  [<ffffffff8020ab92>] child_rip+0x0/0x12
Dec  8 06:38:21 ibd201 kernel:
Dec  8 06:38:21 ibd201 kernel:
Dec  8 06:38:21 ibd201 kernel: Code: 0f 0b 68 8c 41 45 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74
Dec  8 06:38:21 ibd201 kernel: RIP  [<ffffffff8025dff7>] put_page+0x17/0x40
Dec  8 06:38:21 ibd201 kernel:  RSP <ffff810219ddfb08>

-vu




