[ewg] nfsrdma fails to write big file

Tom Tucker tom at opengridcomputing.com
Mon Feb 22 10:49:24 PST 2010


Vu Pham wrote:
> Setup: 
> 1. Linux nfsrdma client/server with OFED-1.5.1-20100217-0600, ConnectX2
> QDR HCAs fw 2.7.8-6, RHEL 5.2.
> 2. Solaris nfsrdma server snv 130, ConnectX QDR HCA.
>
>
> Running vdbench on a 10g file, or *dd if=/dev/zero of=10g_file bs=1M
> count=10000*, the operation fails, the connection gets dropped, and the
> client cannot re-establish the connection to the server.
> After rebooting only the client, I can mount again.
>
> It happens with both Solaris and Linux nfsrdma servers.
>
> For the Linux client/server, I run memreg=5 (FRMR); I don't see the
> problem with memreg=6 (global dma key).
>
>   

Awesome. This is the key I think.

Thanks for the info Vu,
Tom
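
For readers unfamiliar with the memreg numbers quoted above, here is a
minimal sketch of how they map to registration strategies. It assumes the
numbering of enum rpcrdma_memreg in Linux kernels of this era; the MemReg
class and describe_memreg helper are hypothetical names used only for
illustration, not part of any kernel or OFED interface.

```python
from enum import IntEnum


class MemReg(IntEnum):
    """Client memory registration strategies (numbering assumed from
    the kernel's enum rpcrdma_memreg; verify against your tree)."""
    BOUNCEBUFFERS = 0
    REGISTER = 1
    MEMWINDOWS = 2
    MEMWINDOWS_ASYNC = 3
    MTHCAFMR = 4
    FRMR = 5         # fast registration; the mode that fails in this report
    ALLPHYSICAL = 6  # global DMA key; the mode that works in this report


def describe_memreg(value: int) -> str:
    """Return the strategy name for a memreg value, or a marker if unknown."""
    try:
        return MemReg(value).name
    except ValueError:
        return f"unknown({value})"


if __name__ == "__main__":
    for mode in (5, 6):
        print(mode, describe_memreg(mode))
```

The point of the comparison is that only the registration strategy differs
between the failing and working runs.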


> On the Solaris server snv 130, we see a problem decoding a write request
> of 32K. The client sends two read chunks (32K & 16-byte), and the server
> fails to do an RDMA read on the 16-byte chunk (cqe.status = 10, i.e.
> IB_WC_REM_ACCESS_ERR); therefore, the server terminates the connection. We
> don't see this problem on NFS version 3 on Solaris. The Solaris server
> runs in normal memory registration mode.
>
> On the Linux client, I see cqe.status = 12, i.e. IB_WC_RETRY_EXC_ERR.
>
> I added these notes in bug #1919 (bugs.openfabrics.org) to track the
> issue.
>
> thanks,
> -vu
>   
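
The two completion status values in the report (10 on the server, 12 on
the client) can be decoded with a small lookup. The sketch below mirrors
the first entries of the kernel's enum ib_wc_status; the numbering is an
assumption to check against include/rdma/ib_verbs.h, and WcStatus and
wc_status_name are hypothetical names used only for illustration.

```python
from enum import IntEnum


class WcStatus(IntEnum):
    """Leading entries of enum ib_wc_status (numbering assumed)."""
    SUCCESS = 0
    LOC_LEN_ERR = 1
    LOC_QP_OP_ERR = 2
    LOC_EEC_OP_ERR = 3
    LOC_PROT_ERR = 4
    WR_FLUSH_ERR = 5
    MW_BIND_ERR = 6
    BAD_RESP_ERR = 7
    LOC_ACCESS_ERR = 8
    REM_INV_REQ_ERR = 9
    REM_ACCESS_ERR = 10     # remote side rejected the access (seen on server)
    REM_OP_ERR = 11
    RETRY_EXC_ERR = 12      # transport retries exhausted (seen on client)
    RNR_RETRY_EXC_ERR = 13


def wc_status_name(status: int) -> str:
    """Map a numeric cqe.status to its IB_WC_* name, or a marker if unknown."""
    try:
        return "IB_WC_" + WcStatus(status).name
    except ValueError:
        return f"unknown({status})"


if __name__ == "__main__":
    for status in (10, 12):
        print(status, wc_status_name(status))
```

A remote access error on the server's RDMA read would be consistent with
the client then seeing its own retries exhausted once the connection is
torn down, though the report does not confirm that ordering.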
