[Fwd: Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition]

Vu Pham vuhuong at mellanox.com
Fri Apr 24 10:54:55 PDT 2009


Hi Celine,

What HCA do you have on your system? Is it ConnectX? If yes, what is its 
firmware version?

-vu

> Hey Celine,
>
> Thanks for gathering all this info!  So the rdma connections work fine 
> with everything _but_ nfsrdma.  And errno 103 indicates the connection 
> was aborted, maybe by the server (since no failures are logged by the 
> client).
>
>
> More below:
>
>
> Celine Bourde wrote:
>> Hi Steve,
>>
>> This email summarizes the situation:
>>
>> Standard mount -> OK
>> ---------------------
>>
>> [root@twind ~]# mount -o rw 192.168.0.215:/vol0 /mnt/
>> Command works fine.
>>
>> RDMA mount -> FAILS
>> --------------------
>>
>> [root@twind ~]# mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
>> The command hangs; I have to press Ctrl+C to kill the process.
>>
>> or
>>
>> [root@twind ofa_kernel-1.4.1]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
>> [..]
>> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
>> connect(3, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
>> fcntl(3, F_SETFL, O_RDWR)               = 0
>> sendto(3, "-3\245\357\0\0\0\0\0\0\0\2\0\1\206\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0"..., 40, 0, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 40
>> poll([{fd=3, events=POLLIN}], 1, 3000)  = 1 ([{fd=3, revents=POLLIN}])
>> recvfrom(3, "-3\245\357\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 8800, MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, [16]) = 24
>> close(3)                                = 0
>> mount("192.168.0.215:/vol0", "/mnt", "nfs", 0, "rdma,port=2050,addr=192.168.0.215"
>> ... same problem (the command hangs)
>>
>> [root@twind tmp]# dmesg
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>
>>
>
> Is there anything logged on the server side?
>
> Also, can you try this again, but on both systems do this before 
> attempting the mount:
>
> echo 32768 > /proc/sys/sunrpc/rpc_debug
>
> This will enable all the rpc trace points and add a bunch of logging 
> to /var/log/messages.
> Maybe that will show us something.  I think the server is aborting 
> the connection for some reason.
>
> Steve.
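
For anyone reading this thread later: the "closed (-103)" value in the dmesg output above is a negated errno, and on Linux errno 103 is ECONNABORTED, which matches Steve's reading that the connection was aborted. Below is a minimal sketch in Python of pulling those codes out of log text and mapping them to names. The function name decode_closed_lines and the regular expression are illustrative assumptions, not part of any NFS/RDMA tool, and the symbolic name and message depend on the platform the script runs on.

```python
import errno
import os
import re

# Matches lines such as:
#   rpcrdma: connection to 192.168.0.215:2050 closed (-103)
# The pattern is an assumption based on the dmesg output quoted in this thread.
CLOSED_RE = re.compile(r"rpcrdma: connection to (\S+) closed \((-?\d+)\)")


def decode_closed_lines(log_text):
    """Return (peer, errno number, symbolic name, message) per 'closed' line."""
    results = []
    for match in CLOSED_RE.finditer(log_text):
        peer = match.group(1)
        code = abs(int(match.group(2)))  # the log shows the negated errno
        name = errno.errorcode.get(code, "UNKNOWN")  # platform-dependent lookup
        results.append((peer, code, name, os.strerror(code)))
    return results


# Two lines copied from the dmesg output in this thread.
SAMPLE = """\
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
"""

if __name__ == "__main__":
    for peer, code, name, message in decode_closed_lines(SAMPLE):
        print(f"{peer}: errno {code} ({name}): {message}")
```

Feeding it the output of dmesg from the client (and the server, if it logs anything similar) gives a quick summary of which peers closed with which error.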



