[ofw] Re: Completion with bad status: IBV_WC_EXC_RETRY_EXC_ERROR

Diego Guella diego.guella at sircomtech.com
Wed Nov 21 06:10:54 PST 2007


Hi Fab,
Thanks for your answer.
Please see my replies inline.


----- Original Message ----- 
From: "Fab Tillier" <ftillier at windows.microsoft.com>
>

>When you exchange the rkey, are you keeping track of endianness?  The 
>Windows drivers treat rkeys in network order.  I think the Linux stack 
>does this in host order, and this could cause your problems.  I would have 
>expected a different error than a retry exceeded error, though.

No, I didn't change the endianness of the rkey.
So I ran a test with the endianness of the rkey changed, but the error is 
still the same.
I too would have expected a different error, say an IB_WCS_REM_ACCESS_ERR, 
instead of this retry exceeded error.

>For the LIDs, you need to swap it on the Windows side, not the Linux side - 
>this could be the cause for the retry error.

You said (or perhaps Tzachi said) that Windows treats the LID in network 
order.
So in my "CM" protocol I am exchanging the LID in network order: Windows 
sends (and receives) the LID _as is_, while Linux sends it applying ntohs 
before the send (and applying htons after receive).
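
To make the byte-order handling above concrete, here is a minimal sketch in 
C of packing and unpacking such a record on the Linux side. The struct 
conn_info, its field layout, and the conn_info_pack/conn_info_unpack helpers 
are hypothetical names for illustration, not the code of the actual test 
program. It assumes the Linux side holds the LID, QP number and rkey in host 
order and that every field travels in network (big-endian) order.

```c
#include <arpa/inet.h>  /* htons, htonl, ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire record: 2-byte LID, 4-byte QPN, 4-byte rkey,
 * all in network (big-endian) byte order. */
#define CONN_INFO_WIRE_SIZE 10

/* Host-order view, as assumed for the Linux side. */
struct conn_info {
    uint16_t lid;
    uint32_t qpn;
    uint32_t rkey;
};

/* Convert host-order fields to network order and copy them to buf. */
static void conn_info_pack(const struct conn_info *ci, unsigned char *buf)
{
    assert(ci != NULL && buf != NULL);
    uint16_t lid  = htons(ci->lid);
    uint32_t qpn  = htonl(ci->qpn);
    uint32_t rkey = htonl(ci->rkey);
    memcpy(buf,     &lid,  sizeof lid);
    memcpy(buf + 2, &qpn,  sizeof qpn);
    memcpy(buf + 6, &rkey, sizeof rkey);
}

/* Read network-order fields from buf and convert them to host order. */
static void conn_info_unpack(const unsigned char *buf, struct conn_info *ci)
{
    assert(ci != NULL && buf != NULL);
    uint16_t lid;
    uint32_t qpn, rkey;
    memcpy(&lid,  buf,     sizeof lid);
    memcpy(&qpn,  buf + 2, sizeof qpn);
    memcpy(&rkey, buf + 6, sizeof rkey);
    ci->lid  = ntohs(lid);
    ci->qpn  = ntohl(qpn);
    ci->rkey = ntohl(rkey);
}
```

On a side whose stack already holds these values in network order (as 
described for the Windows drivers), the same bytes would be copied as is, 
without the htons/htonl and ntohs/ntohl calls.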

>Is there any reason you don't use the IB CM or RDMA CM for connection 
>establishment?  On the Windows side, you'll need to deal with the RDMA CM 
>private data format yourself, but at least it will take care of the QP 
>settings for you.

I have taken the example in WinIB 1.3, and slightly modified it (removed 
some parts and added support to RDMA READ/WRITE tests).
This program works well in a Windows/Windows test.
Then I ported this program to Linux, modifying it again to use verbs instead 
of ib_al. It works well in a Linux/Linux test.
The problem only arises when I try to use a Windows daemon and a Linux 
client, and vice versa.

I posted the source code of these programs in older emails; I can resend it 
to you if you wish.



Thanks,
Diego
