[ofa-general] Bogus Receive Completions
Roman Kononov
kononov at dls.net
Wed Jan 23 16:36:23 PST 2008
On 2008-01-23 18:31, Roman Kononov wrote:
> On 2008-01-23 17:32, Roland Dreier wrote:
>> I'd be curious to run it. It can't hurt to have the test...
>
> This is similar to my previous program. The difference is that this one
> makes many (up to 10, in test_create()) sets of SQ+RQ+CQ in struct
> conn_t, which share a single Completion Channel in struct ctx_t. Every
> conn_t has a ring of receive buffers, a ring of send buffers, a send
> sequence number, and a receive sequence number. Every time a buffer is
> sent, just before the ibv_post_send() call, the send sequence number is
> placed into the buffer, imm_data, and wr_id. Upon Send Completion, wr_id
> and the sent buffer must contain the expected send sequence number.
> Every time a receive buffer is posted, just before the ibv_post_recv()
> call, the receive sequence number is placed into wr_id. Upon Receive
> Completion, wr_id, the received buffer, and imm_data must contain the
> expected receive sequence number. These two "musts" are sometimes
> violated. In my setup, the assertions fail at lines 303 (receiver) and
> 287 (sender).
>
> The program has 2 threads. The first one reads the completion channel,
> validates the Send and Receive Completions, issues ibv_post_recv() and
> ibv_post_send(). The second one can only issue ibv_post_send().
>
> The program creates 2 QPs. Increasing the number of QPs seemingly does
> not increase the probability of failure.
>
> The program prints <time: iteration_#_for_QP_#0 iteration_#_for_QP_#1>.
>
> In my setup, it seems that if I run two pairs of the program, the
> failure occurs sooner.
>
> Roman
>
Sorry, I forgot the program...
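
The receive-side check described above can be sketched roughly as follows.
This is not code from kink.c. It is a minimal model that uses no verbs calls
at all: struct completion stands in for the wr_id and imm_data fields of
struct ibv_wc, and struct conn, RING_SIZE, send_one() and check_recv() are
hypothetical names chosen for illustration. The transport is simulated as
strictly in-order, so the check always passes unless a completion is
corrupted by hand.

```c
#include <assert.h>   /* callers would typically assert() on check_recv() */
#include <stdint.h>

#define RING_SIZE 8   /* hypothetical ring depth, not taken from kink.c */

/* Hypothetical stand-in for the two struct ibv_wc fields the check reads. */
struct completion {
    uint64_t wr_id;
    uint32_t imm_data;
};

/* Hypothetical per-connection state, loosely modelled on the description. */
struct conn {
    uint64_t send_seq;               /* next sequence number to send */
    uint64_t recv_posted;            /* wr_id for the next posted receive */
    uint64_t recv_seq;               /* next sequence number expected */
    uint64_t send_ring[RING_SIZE];   /* one sequence number per send buffer */
    uint64_t recv_ring[RING_SIZE];   /* one sequence number per recv buffer */
};

/* Simulate one send over an in-order transport. The sender stamps the
 * sequence number into its buffer and imm_data; the receiver's oldest
 * posted receive, whose wr_id was set at post time, is consumed. */
static struct completion send_one(struct conn *tx, struct conn *rx)
{
    uint64_t seq = tx->send_seq++;
    tx->send_ring[seq % RING_SIZE] = seq;

    uint64_t recv_wr_id = rx->recv_posted++;
    rx->recv_ring[recv_wr_id % RING_SIZE] = tx->send_ring[seq % RING_SIZE];

    struct completion wc = { recv_wr_id, (uint32_t)seq };
    return wc;
}

/* Return 1 if wr_id, the received buffer and imm_data all carry the
 * expected receive sequence number, and advance it; return 0 otherwise. */
static int check_recv(struct conn *c, const struct completion *wc)
{
    uint64_t expect = c->recv_seq;
    uint64_t in_buf = c->recv_ring[wc->wr_id % RING_SIZE];

    int ok = wc->wr_id == expect &&
             in_buf == expect &&
             wc->imm_data == (uint32_t)expect;
    if (ok)
        c->recv_seq++;
    return ok;
}
```

A caller would zero-initialize two struct conn values, call send_one() in a
loop, and assert on check_recv() for each returned completion. A bogus
completion in this model is one where any of the three values disagrees with
the expected receive sequence number.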
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kink.c
Type: text/x-csrc
Size: 20747 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20080123/a6031c6b/attachment.c>