[openib-general] RNR_RETRY_EXC_ERR and completion opcode in "send_lat"

Sayantan Sur surs at cse.ohio-state.edu
Sun Dec 3 11:57:41 PST 2006


Hi Dotan,

Thanks a lot for this information.

Sayantan.

Dotan Barak wrote:
> Hi Sayantan.
> Sayantan Sur wrote:
>
>> Hi,
>>
>> I have a question about the "status" field of a completion that results
>> from an RNR retry exceeded error. I trivially modified the `send_lat'
>> program (from the Gen2 perftest directory) to use an SRQ and to stop
>> posting receives after some specified time. Since the "rnr_retry"
>> attribute of the QP is not set to 7 (infinite retry), I expect the sender
>> to get an error completion with status IBV_WC_RNR_RETRY_EXC_ERR.
>>
>> So far so good ... however, the completion I pull out of the send_cq
>> lists the opcode as IBV_WC_RECV! Is this expected?
>>
>> I am using OFED 1.1 on dual Intel Xeon machines with Mellanox DDR HCAs
>> (two ports) and in MemFree mode. The distribution used is RH AS4 (Nahant
>> Update 3), with kernel version 2.6.17.7.
>>
>> If someone could explain this behavior, or suggest a workaround, it'd be
>> great.
>>
>> TIA,
>> Sayantan.
>>  
>>
> I took the following text from the man pages that I wrote for
> libibverbs:
> "Not all wc attributes are always valid. If the completion status is
> other than IBV_WC_SUCCESS, only the following attributes are valid:
> wr_id, status, qp_num, and vendor_err."
>
> In other words, the opcode field is not valid when the completion has
> an error status.
>
> Thanks
> Dotan
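
Since the opcode cannot be trusted on an error completion, one possible workaround is for the application to remember what it posted, keyed by wr_id, and to consult that record when status is not IBV_WC_SUCCESS. The following is only a minimal sketch of that idea. To compile on its own it uses stand-in types (sketch_wc, sketch_status, sketch_opcode) rather than the real struct ibv_wc from <infiniband/verbs.h>, and the helper names record_post and completed_op, the MAX_WR limit, and the enum values are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the libibverbs types so the sketch compiles by itself.
 * A real program would include <infiniband/verbs.h> and use struct ibv_wc.
 * The numeric values below are arbitrary, not the library's. */
enum sketch_status { SKETCH_WC_SUCCESS = 0, SKETCH_WC_RNR_RETRY_EXC_ERR = 1 };
enum sketch_opcode { SKETCH_OP_SEND = 0, SKETCH_OP_RECV = 1 };

struct sketch_wc {
    uint64_t wr_id;             /* valid on error */
    enum sketch_status status;  /* valid on error */
    enum sketch_opcode opcode;  /* NOT valid on error */
    uint32_t qp_num;            /* valid on error */
    uint32_t vendor_err;        /* valid on error */
};

/* Hypothetical bookkeeping: the application uses small integers as wr_id
 * values and records the operation type when it posts each work request. */
#define MAX_WR 64
static enum sketch_opcode posted_op[MAX_WR];

static void record_post(uint64_t wr_id, enum sketch_opcode op)
{
    assert(wr_id < MAX_WR);
    posted_op[wr_id] = op;
}

/* Return the operation a completion belongs to. The opcode field is used
 * only for successful completions; for errors, fall back to the record
 * made at post time, looked up through wr_id. */
static enum sketch_opcode completed_op(const struct sketch_wc *wc)
{
    if (wc->status == SKETCH_WC_SUCCESS)
        return wc->opcode;
    assert(wc->wr_id < MAX_WR);
    return posted_op[wc->wr_id];
}
```

With this bookkeeping, an RNR retry exceeded completion whose opcode field happens to hold a receive value is still attributed to the send that was recorded under the same wr_id.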

-- 
http://www.cse.ohio-state.edu/~surs




