[Fwd: Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition]

Joe Landman landman at scalableinformatics.com
Wed May 6 15:26:40 PDT 2009


Vu Pham wrote:
> Hi Celine,
> 
> What HCA do you have on your system? Is it ConnectX? If yes, what is its 
> firmware version?

I am seeing this also on a server with ConnectX and a client with mthca.

My mount hangs:

  /sbin/mount.nfs 10.1.1.2:/data /data -o rdma,intr,port=2050
^C

Leaving this in the logs:

May  6 18:14:03 dv3 kernel: [ 9997.015209] rpcrdma: connection to 10.1.1.2:2050 on mthca0, memreg 6 slots 32 ird 4
May  6 18:14:03 dv3 kernel: [ 9997.015582] rpcrdma: connection to 10.1.1.2:2050 closed (-103)

RDMA itself seems to work:

root@dv3:~# ib_rdma_bw -b -i 2
6222: | port=18515 | ib_port=2 | size=65536 | tx_depth=100 | iters=1000 | duplex=1 | cma=0 |
6222: Local address:  ...
6222: Remote address: ...

6222: Bandwidth peak (#0 to #245): 1765.83 MB/sec
6222: Bandwidth average: 1724.45 MB/sec
6222: Service Demand peak (#0 to #245): 884 cycles/KB
6222: Service Demand Avg  : 906 cycles/KB

root@dv3:~# showmount -e 10.1.1.2
Export list for 10.1.1.2:
/data *

On the server side, I see

May  6 14:07:53 jr4 mountd[5673]: authenticated mount request from 10.1.1.1:940 for /data (/data)

On the server side, for rping:

[root@jr4 ~]# rping -s
cq completion failed status 4
wait for RDMA_READ_COMPLETE state 10

On the client side, for rping:

root@dv3:~# rping -S 100 -d -v -c -a 10.1.1.2
verbose
client
created cm_id 0x606690
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x606690 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x606690 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x608be0
created channel 0x6068c0
created cq 0x608c30
created qp 0x608d50
rping_setup_buffers called on cb 0x605010
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x606690 (parent)
ESTABLISHED
rmda_connect successful
RDMA addr 60a8d0 rkey 116003d len 100
send completion
cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x606690 (parent)
client DISCONNECT EVENT...
wait for RDMA_WRITE_ADV state 6
cq completion failed status 5
rping_free_buffers called on cb 0x605010
destroy cm_id 0x606690
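
The two "cq completion failed status" numbers in the rping output can be decoded by hand. A minimal sketch, assuming rping prints the raw numeric value of the libibverbs enum ibv_wc_status (the table follows the order of that enum in verbs.h; the helper name wc_status_name is made up for illustration):

```python
# Names of enum ibv_wc_status from libibverbs, in enum order (0..21).
# Assumption: rping's "status N" is this enum's numeric value.
IBV_WC_STATUS = [
    "IBV_WC_SUCCESS",
    "IBV_WC_LOC_LEN_ERR",
    "IBV_WC_LOC_QP_OP_ERR",
    "IBV_WC_LOC_EEC_OP_ERR",
    "IBV_WC_LOC_PROT_ERR",
    "IBV_WC_WR_FLUSH_ERR",
    "IBV_WC_MW_BIND_ERR",
    "IBV_WC_BAD_RESP_ERR",
    "IBV_WC_LOC_ACCESS_ERR",
    "IBV_WC_REM_INV_REQ_ERR",
    "IBV_WC_REM_ACCESS_ERR",
    "IBV_WC_REM_OP_ERR",
    "IBV_WC_RETRY_EXC_ERR",
    "IBV_WC_RNR_RETRY_EXC_ERR",
    "IBV_WC_LOC_RDD_VIOL_ERR",
    "IBV_WC_REM_INV_RD_REQ_ERR",
    "IBV_WC_REM_ABORT_ERR",
    "IBV_WC_INV_EECN_ERR",
    "IBV_WC_INV_EEC_STATE_ERR",
    "IBV_WC_FATAL_ERR",
    "IBV_WC_RESP_TIMEOUT_ERR",
    "IBV_WC_GENERAL_ERR",
]


def wc_status_name(status):
    """Map a numeric work-completion status to its enum name (hypothetical helper)."""
    if 0 <= status < len(IBV_WC_STATUS):
        return IBV_WC_STATUS[status]
    return "unknown status %d" % status


# The two values seen in the rping output: 4 on the server, 5 on the client.
for side, status in (("server", 4), ("client", 5)):
    print(side, status, wc_status_name(status))
```

Under that assumption, status 4 on the server would be a local protection error and status 5 on the client a work-request flush error. A flush error is usually a consequence of the QP already having gone into the error state, not a cause in its own right.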

Any hints on the 103 error?  I have 2.6.000 firmware on the ConnectX.
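
For reference, the number itself can be looked up locally. A minimal sketch, assuming the (-103) in the rpcrdma log line is a negated Linux errno value, as is usual for kernel messages (describe_errno is a made-up helper name, and errno numbering is platform-specific, so this is only meaningful when run on Linux):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value; the sign is ignored."""
    n = abs(code)
    name = errno.errorcode.get(n, "UNKNOWN")
    return name, os.strerror(n)


# Value taken from "connection to 10.1.1.2:2050 closed (-103)".
name, text = describe_errno(-103)
print(name, "-", text)
```

On Linux, errno 103 is ECONNABORTED, which only says that the connection was torn down, not why.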

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


