[ofa-general] IBV_WC_RETRY_EXC_ERR causes

Krishnamoorthy, Sriram sriram at pnl.gov
Thu Jun 19 18:42:47 PDT 2008


Can someone please explain what can cause IBV_WC_RETRY_EXC_ERR? I am
using a combination of send-receive and RDMA. I have the reliable
connection queue pairs initialized as:

    qp_attr.timeout             = 18;
    qp_attr.retry_cnt           = 7;
    qp_attr.rnr_retry           = 7;

From the documentation, I assumed a value of 7 meant infinite retry. Can
a lack of receive buffers cause this error? My understanding is that
IBV_WC_RNR_RETRY_EXC_ERR is the error caused by a lack of receive
buffers. Could it be congestion in the network?
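
For reference, here is a rough sketch of what the settings above work out
to in wall-clock terms. It assumes the usual InfiniBand encoding, where
the local ACK timeout is 4.096 usec * 2^timeout and a timeout of 0 means
"wait forever". As far as I can tell, only rnr_retry treats 7 as
infinite, while retry_cnt = 7 means seven retries after the first
attempt. The function names are my own, not part of libibverbs, and real
HCAs may round the interval up, so treat the result as an estimate only.

```c
/* Hypothetical helpers (not part of libibverbs) for estimating how long
 * a reliable-connection QP waits before reporting IBV_WC_RETRY_EXC_ERR. */

/* Local ACK timeout in microseconds for the 5-bit qp_attr.timeout field.
 * Assumption: 0 disables the timer; otherwise 4.096 usec * 2^timeout. */
double ack_timeout_usec(unsigned timeout)
{
    if (timeout == 0)
        return 0.0;   /* no timeout: the QP waits indefinitely */
    return 4.096 * (double)(1ULL << (timeout & 0x1f));
}

/* Approximate time until the transport retry count is exhausted:
 * the initial attempt plus retry_cnt retries, each waiting one
 * ACK timeout. retry_cnt is a 3-bit field (0..7). */
double time_to_retry_exc_usec(unsigned timeout, unsigned retry_cnt)
{
    return ack_timeout_usec(timeout) * (double)((retry_cnt & 0x7) + 1);
}
```

With timeout = 18 and retry_cnt = 7, ack_timeout_usec(18) comes to about
1.07 seconds, and time_to_retry_exc_usec(18, 7) to about 8.6 seconds
without an ACK before the error would be raised.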

I could not find much in earlier queries about this error. It often
occurs in the middle of the computation at large (>=1024) processor
counts, when I have multiple outstanding send-receives between a pair of
processes (each pair of processes has an RC queue pair initialized). I
do not yet have a small test case that reproduces this error.

Thanks for any help,
Sriram.K

