[openib-general] NFS/RDMA for Linux: client and server update release 5

Tom Tucker tom at opengridcomputing.com
Fri May 26 06:46:32 PDT 2006


Helen:

Can you please do the following and send the output to me and Tom
Talpey?

objdump -Sl net/sunrpc/sched.o > objdump.out 
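
For a quick first pass at the raw Call Trace addresses, something along these lines could map them onto a System.map. This is only a sketch, not a replacement for ksymoops: the function names are made up for illustration, it assumes the usual "address type name" layout of System.map, and it treats anything at or beyond the last symbol in the map as unresolved.

```python
import bisect
import re


def load_symbols(lines):
    """Parse System.map-style lines of the form '<hex addr> <type> <name>'."""
    syms = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            syms.append((int(parts[0], 16), parts[2]))
    syms.sort()
    return syms


def trace_addresses(oops_text):
    """Return the bracketed addresses between 'Call Trace:' and 'Code:'."""
    _, _, tail = oops_text.partition("Call Trace:")
    tail = tail.split("Code:")[0]  # do not pick up the trailing RIP line
    return [int(a, 16) for a in re.findall(r"\[<([0-9a-f]{16})>\]", tail)]


def resolve(addr, syms):
    """Return 'name+0xoff' for the nearest symbol at or below addr.

    Returns None when addr is below the first symbol, or at or beyond the
    last one (assumed here to mark the end of the mapped image).
    """
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0 or i == len(syms) - 1:
        return None
    base, name = syms[i]
    return f"{name}+{addr - base:#x}"
```

Usage would be roughly `syms = load_symbols(open("System.map"))` followed by `resolve(a, syms)` for each `a` in `trace_addresses(text)`, where `text` is the saved oops. Addresses that System.map does not cover (loadable modules, for example) come back as None, so those still need ksymoops or the objdump output above.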

Thanks,

Tom
(Tom Tucker, not Talpey)


On Thu, 2006-05-25 at 10:54 -0700, helen chen wrote:
> Tom,
> 
> Please review the attached ksymoops output.
> 
> Helen
> 
> On Wed, 2006-05-24 at 04:25, Talpey, Thomas wrote:
> > [Cutting down the reply list to more relevant parties...]
> > 
> > It's hard to say what is crashing, but I suspect the CM code, due
> > to the process context being ib_cm. Is there some reason you're
> > not getting symbols in the stack trace? If you could feed this oops
> > text to ksymoops it will give us more information.
> > 
> > In any case, it appears the connection is succeeding at the server,
> > but the client RPC code isn't being signalled that it has done so.
> > Perhaps this is due to a lost reply, but the NFS code hasn't actually
> > started to do anything. So, I would look for IB-level issues. Is the
> > client running the current OpenFabrics svn top-of-tree?
> > 
> > Let's take this offline to diagnose, unless someone has an idea why
> > the CM would be failing. The ksymoops analysis would help.
> > 
> > Tom.
> > 
> > 
> > 
> > At 07:19 PM 5/23/2006, helen chen wrote:
> > >Hi Tom,
> > >
> > >I have downloaded release 5 of your NFS/RDMA code and am having trouble
> > >mounting over RDMA. The command
> > >"./nfsrdmamount -o rdma on16-ib:/mnt/rdma /mnt/rdma" never
> > >returned. The dmesg output from the client and server follows:
> > >
> > >------ dmesg from client ------
> > >RPCRDMA Module Init, register RPC RDMA transport
> > >Defaults:
> > >        MaxRequests 50
> > >        MaxInlineRead 1024
> > >        MaxInlineWrite 1024
> > >        Padding 0
> > >        Memreg 5
> > >RPC: Registered rdma transport module.
> > >RPC: Registered rdma transport module.
> > >RPC:       xprt_setup_rdma: 140.221.134.221:2049
> > >nfs: server on16-ib not responding, timed out
> > >Unable to handle kernel NULL pointer dereference at 0000000000000000
> > >RIP:
> > >[<0000000000000000>]
> > >PGD a9f2b067 PUD a8ca2067 PMD 0
> > >Oops: 0010 [1] PREEMPT SMP
> > >CPU 1
> > >Modules linked in: xprtrdma ib_srp iscsi_tcp scsi_transport_iscsi
> > >scsi_mod
> > >Pid: 346, comm: ib_cm/1 Not tainted 2.6.16.16 #4
> > >RIP: 0010:[<0000000000000000>] [<0000000000000000>]
> > >RSP: 0018:ffff8100af5a1c30  EFLAGS: 00010246
> > >RAX: ffff8100aeff2400 RBX: ffff8100aeff2400 RCX: ffff8100afc9e458
> > >RDX: 0000000000000000 RSI: ffff8100af5a1d48 RDI: ffff8100aeff2440
> > >RBP: ffff8100aeff2440 R08: 0000000000000000 R09: 0000000000000000
> > >R10: 0000000000000003 R11: 0000000000000000 R12: ffff8100aeff2500
> > >R13: 00000000ffffff99 R14: ffff8100af5a1d48 R15: ffffffff8036c72c
> > >FS:  0000000000505ae0(0000) GS:ffff810003ce25c0(0000)
> > >knlGS:0000000000000000
> > >CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > >CR2: 0000000000000000 CR3: 00000000ad587000 CR4: 00000000000006a0
> > >Process ib_cm/1 (pid: 346, threadinfo ffff8100af5a0000, task
> > >ffff8100afea8100)
> > >Stack: ffffffff8802a331 ffff8100aeff2500 0000000000000001
> > >ffff8100aeff2440
> > >       ffffffff804011fd 0000000000000000 ffffffff8802a343
> > >ffff8100afdd6100
> > >       ffffffff80364ee4 0000000000000100
> > >Call Trace: [<ffffffff8802a331>] [<ffffffff804011fd>]
> > >       [<ffffffff8802a343>] [<ffffffff80364ee4>] [<ffffffff80364341>]
> > >       [<ffffffff8036f85c>] [<ffffffff8036fcf2>] [<ffffffff8036baeb>]
> > >       [<ffffffff8036bdc1>] [<ffffffff8036d6fe>] [<ffffffff8036c72c>]
> > >       [<ffffffff801377b4>] [<ffffffff801377fb>] [<ffffffff8013a960>]
> > >       [<ffffffff80137900>] [<ffffffff8012309b>] [<ffffffff8013a960>]
> > >       [<ffffffff8012309b>] [<ffffffff8013a960>] [<ffffffff8013a937>]
> > >       [<ffffffff8010b8d6>] [<ffffffff8013a960>] [<ffffffff801160b9>]
> > >       [<ffffffff801160b9>] [<ffffffff801160b9>] [<ffffffff8013a86f>]
> > >       [<ffffffff8010b8ce>]
> > >
> > >Code:  Bad RIP value.
> > >RIP [<0000000000000000>] RSP <ffff8100af5a1c30>
> > >CR2: 0000000000000000
> > >
> > >------dmesg from server ------
> > >nfsd: request from insecure port 140.221.134.220, port=32768!
> > >svc_rdma_recvfrom: transport ffff81007e8f2800 is closing
> > >svc_rdma_put: Destroying transport ffff81007e8f2800,
> > >cm_id=ffff81007e945200, sk_flags=154, sk_inuse=0
> > >
> > >Did I forget to configure necessary components into my kernel?
> > >
> > >Thanks,
> > >Helen
> > >
> > >On Mon, 2006-05-22 at 13:25, Talpey, Thomas wrote:
> > >> Network Appliance is pleased to announce release 5 of the NFS/RDMA
> > >> client and server for Linux 2.6.16.16. This update to the April 19 release
> > >> adds improved server parallel performance and fixes various issues. This
> > >> code supports both Infiniband and iWARP transports.
> > >> 
> > >> <http://sourceforge.net/projects/nfs-rdma/>
> > >> 
> > >> <http://sourceforge.net/project/showfiles.php?group_id=97628&package_id=191427>
> > >> 
> > >> Comments and feedback welcome. We're especially interested in
> > >> successful test reports! Thanks.
> > >> 
> > >> Tom Talpey, for the various NFS/RDMA projects.
> > >> 
> > >> _______________________________________________
> > >> openib-general mailing list
> > >> openib-general at openib.org
> > >> http://openib.org/mailman/listinfo/openib-general
> > >> 
> > >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > >> 
> > 
> > 
> 
