[Fwd: Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition]]

Celine Bourde celine.bourde at ext.bull.net
Fri Apr 24 04:13:18 PDT 2009


Hi Steve,

This email summarizes the situation:

Standard mount -> OK
---------------------

[root at twind ~]# mount -o rw 192.168.0.215:/vol0 /mnt/
The command works fine.

rdma mount -> KO
-----------------

[root at twind ~]# mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
The command hangs! I have to press Ctrl+C to kill the process.

or

[root at twind ofa_kernel-1.4.1]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
[..]
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(3, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fcntl(3, F_SETFL, O_RDWR)               = 0
sendto(3, "-3\245\357\0\0\0\0\0\0\0\2\0\1\206\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0"..., 40, 0, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, 16) = 40
poll([{fd=3, events=POLLIN}], 1, 3000)  = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "-3\245\357\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 8800, MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(610), sin_addr=inet_addr("127.0.0.1")}, [16]) = 24
close(3)                                = 0
mount("192.168.0.215:/vol0", "/mnt", "nfs", 0, "rdma,port=2050,addr=192.168.0.215"
... same problem: the mount() call never returns.
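
As a side note on the strace output above, the 40-byte sendto() payload can be decoded as an ONC RPC call header. The sketch below is only an illustration: the bytes are transcribed by hand from strace's octal escapes, and only the first 24 bytes (the fixed header fields) are unpacked. It suggests that this exchange is a NULL-procedure ping to program 100024 (registered as "status", i.e. the local statd) on 127.0.0.1, which gets a reply, so the hang appears to start only at the mount() call itself.

```python
import struct

# First 24 bytes of the sendto() payload, transcribed from the strace line
# ("-3\245\357" is 0x2d 0x33 0xa5 0xef; "\206\270" is 0x86 0xb8).
payload = b"-3\xa5\xef" + bytes.fromhex(
    "00000000"  # message type: 0 = CALL
    "00000002"  # RPC protocol version
    "000186b8"  # program number
    "00000001"  # program version
    "00000000"  # procedure: 0 = NULL
)

# All six fields are big-endian unsigned 32-bit integers (XDR encoding).
xid, msg_type, rpc_vers, prog, vers, proc = struct.unpack(">6I", payload)

print(f"xid=0x{xid:08x} type={msg_type} rpcvers={rpc_vers} "
      f"prog={prog} vers={vers} proc={proc}")
```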

[root at twind tmp]# dmesg
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
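
For reference, the number in "closed (-103)" looks like a negated Linux errno value. The sketch below is a minimal illustration under that assumption: the helper name decode_rpcrdma_close and the small errno table are made up for this example (the table hardcodes a few Linux values so the result does not depend on the platform running the script).

```python
import re

# A few Linux errno values relevant to connection setup (Linux numbering).
LINUX_ERRNO = {
    103: "ECONNABORTED",
    104: "ECONNRESET",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
}

def decode_rpcrdma_close(line):
    """Return (peer, errno name) for an 'rpcrdma: ... closed (-N)' line, else None."""
    m = re.search(r"rpcrdma: connection to (\S+) closed \((-?\d+)\)", line)
    if m is None:
        return None
    code = abs(int(m.group(2)))
    return m.group(1), LINUX_ERRNO.get(code, f"errno {code}")

result = decode_rpcrdma_close(
    "rpcrdma: connection to 192.168.0.215:2050 closed (-103)"
)
print(result)
```

Under that reading, each connection attempt to port 2050 is being aborted (ECONNABORTED) shortly after it is set up, and the client keeps retrying.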


RDMA CM tests
-------------

* With the ib_rdma_bw tool:

[root at twing ~]# ib_rdma_bw -c
4960: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=1 |
4960: Local address:  LID 0000, QPN 000000, PSN 0x24cafe RKey 0x18002400 VAddr 0x007fd3a03da000
4960: Remote address: LID 0000, QPN 000000, PSN 0x5f7a53, RKey 0x20002700 VAddr 0x007fbac1525000

[root at twind ofa_kernel-1.4.1]# ib_rdma_bw -c 192.168.0.215
31739: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=1 |
31739: Local address:  LID 0000, QPN 000000, PSN 0x5f7a53 RKey 0x20002700 VAddr 0x007fbac1525000
31739: Remote address: LID 0000, QPN 000000, PSN 0x24cafe, RKey 0x18002400 VAddr 0x007fd3a03da000
Conflicting CPU frequency values detected: 2667.000000 != 2000.000000
31739: Bandwidth peak (#0 to #569): 0 MB/sec
31739: Bandwidth average: 0 MB/sec
31739: Service Demand peak (#0 to #569): 1949 cycles/KB
31739: Service Demand Avg  : 1949 cycles/KB

* With the rping tool:

[root at twing ~]# rping -s
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 9
cq completion failed status 5

[root at twind ofa_kernel-1.4.1]# rping -Vv -C14 -c -a 192.168.0.215 
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
ping data: rdma-ping-10: KLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
ping data: rdma-ping-11: LMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzAB
ping data: rdma-ping-12: MNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzABC
ping data: rdma-ping-13: NOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzABCD
cq completion failed status 5
client DISCONNECT EVENT...
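
A note on "cq completion failed status 5": assuming rping prints the raw work-completion status from libibverbs, the value can be looked up in enum ibv_wc_status. The table below lists only the first six enum values, and the helper name describe_wc_status is hypothetical.

```python
# First six values of libibverbs' enum ibv_wc_status, in declaration order.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
}

def describe_wc_status(line):
    """Map a 'cq completion failed status N' line to a status name, else None."""
    prefix = "cq completion failed status "
    if not line.startswith(prefix):
        return None
    code = int(line[len(prefix):])
    return IBV_WC_STATUS.get(code, f"status {code} (not in this partial table)")

status_name = describe_wc_status("cq completion failed status 5")
print(status_name)
```

If that assumption holds, status 5 is a flush error, which is commonly reported for outstanding work requests when the queue pair is torn down at disconnect. Since all 14 pings completed first, the message may not indicate a problem with the RDMA CM path.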


My configuration:
-----------------

OFED-1.4.1-rc3 modules (ib_ipoib, mlx4_ib, rdma_cm, etc.)

[root at twing ~]# cat /proc/fs/nfsd/portlist
rdma 2050
tcp 2049
udp 2049
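
The portlist above ties each transport to its own port, which is why port=2050 only makes sense together with the rdma option. A minimal sketch of that mapping follows; the helper names parse_portlist and mount_options are hypothetical, and the option strings only mirror the ones used in this thread (plus the standard proto= option of nfs(5)).

```python
def parse_portlist(text):
    """Parse '/proc/fs/nfsd/portlist'-style text into {transport: port}."""
    ports = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            ports[fields[0]] = int(fields[1])
    return ports

def mount_options(ports, transport):
    """Build a -o string; only the rdma transport needs an explicit port here."""
    if transport == "rdma":
        return f"rdma,port={ports['rdma']}"
    # For tcp/udp, leave the port out so the default NFS port is used.
    return f"rw,proto={transport}"

ports = parse_portlist("rdma 2050\ntcp 2049\nudp 2049\n")
print(mount_options(ports, "rdma"))
print(mount_options(ports, "tcp"))
```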

[root at twind tmp]# mount.nfs -V
mount.nfs (linux nfs-utils 1.1.6)

[root at twind tmp]# rpm -qf /usr/bin/rping
librdmacm-utils-1.0.8-1.ofed1.4.1.rc3

[root at twind tmp]# rpm -qf  /usr/bin/ib_rdma_bw
perftest-1.2-1.ofed1.4.1.rc3

[root at twind tmp]# uname -ar
Linux twind 2.6.27 #2 SMP Thu Apr 9 18:38:19 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux

[root at twind tmp]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.3 Beta (Tikanga)

Celine.


Steve Wise wrote:

> Celine Bourde wrote:
>> Hi,
>>
>> I've updated the nfs-utils package:
>>
>> [root at my_host ~]# mount.nfs -V
>> mount.nfs (linux nfs-utils 1.1.6)
>>
>>> [root at my_host ~]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
>>>
>>> Does it work without rdma?
>>
>> The problem is exactly the same without rdma:
>>
>> [root at my_host ~]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rw,port=2050
>>
>> [..]
>
> You cannot use port 2050 for tcp mounts.  So remove the 'port=2050' and
> it will attempt a tcp mount to port 2049.
>
> Steve.


