[Openib-windows] [PATCH] MTHCA: always treat RKEY in network order
Leonid Keller
leonid at mellanox.co.il
Sun Apr 2 12:09:14 PDT 2006
Once more: I think we need this patch, but it must cover both the rkey
and the lkey, and it must be tested properly.
I haven't tried WSD yet; I'll wait for your results.
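
To be concrete, here is roughly what I mean (only a sketch, not the
final patch: cl_hton32() from complib is assumed, and the function,
structure and field names are illustrative rather than the actual ones
from the mthca tree):

    /* At MR registration: convert BOTH keys to network (BE) order once,
       so the client gets an opaque, wire-ready value.
       Illustrative names -- not the actual mthca code. */
    void sketch_reg_mr_done( const struct sketch_mr *p_mr,
                             uint32_t *p_lkey, uint32_t *p_rkey )
    {
        *p_lkey = cl_hton32( p_mr->lkey );  /* today these come back */
        *p_rkey = cl_hton32( p_mr->rkey );  /* in host (LE) order    */
    }

    /* At post time: the keys the client supplies are then already in
       network order, so copy them through unswapped instead of
       byteswapping them a second time. */
    p_wqe->rdma.rkey = p_wr->remote_ops.rkey;  /* no cl_hton32() here  */
    p_sge->lkey      = p_ds->lkey;             /* same for the lkey    */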
> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> On Behalf Of Fabian Tillier
> Sent: Sunday, April 02, 2006 7:09 PM
> To: Leonid Keller
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] [PATCH] MTHCA: always treat
> RKEY in network order
>
> Hi Leonid,
>
> On 4/2/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> > I didn't study your patch profoundly, but I have a feeling that it
> > doesn't fix anything.
> > I mean, it's true that the driver mistakenly returns the rkey and
> > lkey in LE upon MR creation, but it then mistakenly :) converts
> > them to BE on posting a send request. So it is to be fixed
> > carefully, yes, but it seems not to solve any problem.
>
> The problem comes from the fact that the RKEY is returned to the
> client in host order, and the client is responsible for sharing it
> with the remote peer. If a client sends it to a remote side whose
> driver does not swap the RKEY in an RDMA request (the MT23108
> driver, for example), the RKEY will be wrong on the wire. The
> current implementation requires that the client know that this
> opaque value must be byteswapped, so the MTHCA driver currently
> works only in a homogeneous environment.
>
> If you try your QP test with one host using the old MT23108
> driver and the other using the MTHCA driver, the test should
> fail due to the endianness mismatch in the exchanged RKEY.
>
> > I've tried the current code - without your patch - with our test.
> > All worked OK.
>
> Have you had a chance to try it with the patch to see if the
> behavior changes?
>
> > Here is one example:
> >
> >     qp_test.exe --daemon
> >     qp_test --ip=10.4.3.27 --qp=5 --oust=100 --thread=4 -i=10000 --grh
> >         CLIENT RR 5 10240 SERVER RW 5 10240 CLIENT RWI 5 10240
> >
> > The latter means: the client part creates 4 threads, 5 QPs in each
> > thread, and executes 3 operations, 10000 iterations each:
> >     RDMA read as client with 5 segments of 10K each;
> >     RDMA write as server with 5 segments of 10K each;
> >     RDMA write with immediate as client with 5 segments of 10K each.
> > There are 100 outstanding requests on every QP, and packets are
> > sent with a GRH header.
> > The test worked in polling mode; transport was RC (it also
> > worked with UC).
> >
> > It was tried on an x64 machine with Artavor (Arbel in Tavor mode).
> >
> > With what parameters do you use RDMA in WSD?
>
> For WSD, there will be at most one RDMA in flight per QP (I don't
> know if this is per queue or per queue pair). WSD will use RDMA
> read if possible, unless there is no RDMA read entrypoint into
> the provider, in which case it uses RDMA write.
>
> There are never more than 4 SGEs; the exact count depends on how
> many WSABUF structures the client passes into the send/recv calls.
> Usually it will be just a single SGE.
>
> Have you had a chance to test WSD over MTHCA? I'll back out
> my RKEY patch and see if I can get WSD to work in a
> homogeneous MTHCA environment. If that works, then it's
> clear my patch is broken. I'll let you know how it goes.
>
> - Fab