[ofw] RE: Completion with bad status: IBV_WC_RETRY_EXC_ERROR
Dotan Barak
dotanb at mellanox.co.il
Wed Nov 14 00:57:42 PST 2007
Hi.
I checked your code in Linux and you have a failure in RDMA Read...
But for your problem: in Windows the LID of the port is in network
byte order, while in Linux the LID of the port is in host byte order
("order" here means endianness).
Fixing this should solve the problem ...
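A minimal sketch of one way to handle this, assuming the two sides
exchange their LIDs over a socket and agree that the value on the wire
is always in network order. The helper names and wire_lid_t are made up
for illustration; on the Windows side the port LID is already in network
order, so it could be sent unchanged.

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htons(), ntohs() */

/* Hypothetical type: a LID as exchanged between the peers, always in
 * network (big-endian) byte order. */
typedef uint16_t wire_lid_t;

/* Linux side: ibv_port_attr.lid is in host order, so convert it
 * before sending it to the peer. */
static wire_lid_t lid_to_wire(uint16_t host_lid)
{
    return htons(host_lid);
}

/* Linux side: convert a received LID back to host order before
 * storing it in ibv_ah_attr.dlid. */
static uint16_t lid_from_wire(wire_lid_t wire_lid)
{
    return ntohs(wire_lid);
}
```

With this convention the Linux program would call lid_to_wire() on its
own LID before sending it and lid_from_wire() on the LID it receives.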
Dotan
________________________________
From: Diego Guella [mailto:diego.guella at sircomtech.com]
Sent: Wednesday, November 14, 2007 10:29 AM
To: Dotan Barak; Tzachi Dar; Fab Tillier
Cc: ofw at lists.openfabrics.org
Subject: Re: Completion with bad status: IBV_WC_RETRY_EXC_ERROR
Hi Dotan,
I apologize for that silly mistake: obviously WR_SEND is
different from IBV_WR_SEND, and the same goes for WR_RDMA_READ, etc.
So I removed <iba/ib_types.h> from the includes, to make sure I
don't use those definitions.
Now the Linux program works with send/recv and RDMA to itself
(daemon/client on the same machine), but I still get the same error when
I try communication between Windows and Linux.
The error applies to SEND, RDMA_WRITE, and RDMA_READ, whether the
daemon runs on Linux or on Windows.
Attached are the new sources (note that the Windows sources are
unchanged).
Thanks,
Diego
----- Original Message -----
From: Dotan Barak <mailto:dotanb at mellanox.co.il>
To: Diego Guella <mailto:diego.guella at sircomtech.com> ;
Tzachi Dar <mailto:tzachid at mellanox.co.il> ; Fab Tillier
<mailto:ftillier at windows.microsoft.com>
Cc: ofw at lists.openfabrics.org
Sent: Tuesday, November 13, 2007 3:13 PM
Subject: RE: Completion with bad status:
IBV_WC_RETRY_EXC_ERROR
OK, I got past that and the code compilation stage ...
Just a small note:
In modify_qp, qp_access_flags covers only the remote operations
the QP supports, so IBV_ACCESS_LOCAL_WRITE should be removed.
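As a rough illustration of that note, the sketch below strips
everything except the remote-operation bits from a flag set. To stay
compilable without libibverbs it mirrors the first four values of
enum ibv_access_flags under hypothetical EX_ names; real code would
include <infiniband/verbs.h> and use the IBV_ACCESS_* names directly.
The helper qp_remote_access() is also made up for this example.

```c
/* Mirrors of the first four values of enum ibv_access_flags in
 * <infiniband/verbs.h>. The EX_ names are hypothetical and exist only
 * so that this sketch compiles on its own. */
enum ex_access_flags {
    EX_ACCESS_LOCAL_WRITE   = 1 << 0,
    EX_ACCESS_REMOTE_WRITE  = 1 << 1,
    EX_ACCESS_REMOTE_READ   = 1 << 2,
    EX_ACCESS_REMOTE_ATOMIC = 1 << 3
};

/* Keep only the bits that describe remote operations; this is the
 * kind of value intended for ibv_qp_attr.qp_access_flags. */
static unsigned int qp_remote_access(unsigned int flags)
{
    const unsigned int remote_mask = EX_ACCESS_REMOTE_WRITE |
                                     EX_ACCESS_REMOTE_READ |
                                     EX_ACCESS_REMOTE_ATOMIC;
    return flags & remote_mask;
}
```

The result would be assigned to attr.qp_access_flags before calling
ibv_modify_qp() with IBV_QP_ACCESS_FLAGS set in the attribute mask.
Local write permission is still requested where it applies, namely
when registering the memory region with ibv_reg_mr().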
I understand what the root cause of the problem is: you
took code from Windows and moved only PART of it to Linux.
For example: WR_SEND is defined in iba/ib_types.h and has
the value 1 (which is RDMA_WRITE_WITH_IMM in Linux),
so you actually did an RDMA write with undefined values
for the rkey and the remote address.
(This is only one example of the corruption that happens
during the test execution because of this issue.)
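The mismatch can be shown with a small self-contained sketch. The
EX_ enum below mirrors the first values of enum ibv_wr_opcode from
<infiniband/verbs.h>; the EX_ names, EX_WINDOWS_WR_SEND and
linux_opcode_name() are hypothetical, and the Windows value 1 is simply
the one quoted above.

```c
/* Mirror of the first values of enum ibv_wr_opcode in
 * <infiniband/verbs.h> (hypothetical EX_ names, same numbering). */
enum ex_wr_opcode {
    EX_WR_RDMA_WRITE          = 0,
    EX_WR_RDMA_WRITE_WITH_IMM = 1,
    EX_WR_SEND                = 2,
    EX_WR_SEND_WITH_IMM       = 3,
    EX_WR_RDMA_READ           = 4
};

/* Value of the Windows WR_SEND constant as reported in this thread. */
#define EX_WINDOWS_WR_SEND 1

/* Return the name of the Linux opcode that a raw value selects. */
static const char *linux_opcode_name(int opcode)
{
    switch (opcode) {
    case EX_WR_RDMA_WRITE:          return "IBV_WR_RDMA_WRITE";
    case EX_WR_RDMA_WRITE_WITH_IMM: return "IBV_WR_RDMA_WRITE_WITH_IMM";
    case EX_WR_SEND:                return "IBV_WR_SEND";
    case EX_WR_SEND_WITH_IMM:       return "IBV_WR_SEND_WITH_IMM";
    case EX_WR_RDMA_READ:           return "IBV_WR_RDMA_READ";
    default:                        return "unknown";
    }
}
```

Passing EX_WINDOWS_WR_SEND to linux_opcode_name() selects the
RDMA-write-with-immediate opcode rather than a send, which is why the
work request ended up using an rkey and remote address that were never
filled in.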
The code passed compilation because you included
iba/ib_types.h (in types.h) in Linux too.
This file should not be included in Linux (unless you
really need it ...)
All of the structures, functions, and enumerations in Linux
verbs start with IBV_ or ibv_.
This should fix the test problems (I hope ...)
thanks
Dotan