[ofa-general] Bogus Receive Completions
Roman Kononov
kononov at dls.net
Tue Dec 4 18:06:34 PST 2007
Hello all,
I am seeing weird behavior from libibverbs + libmthca, which makes me suspect
either libmthca or the HCA firmware.
The source of the test code is attached.
I have a pair of processes talking to each other. They are very similar to
ibv_rc_pingpong. The difference is that my processes issue
ibv_post_send(IBV_WR_RDMA_WRITE_WITH_IMM), and they try to keep several
(namely 4) outstanding Receive and Send Work Requests.
All Send Work Requests are sequentially numbered. The number is placed into
the wr_id and imm_data fields. When the process receives a Send Work
Completion, wr_id is checked for consistency with the sent numbers, and the
next Send Work Request is posted (RDMA Write with IMM, 32 bytes of
out-of-line data). So far so good.
All Receive Work Requests are sequentially numbered as well. The number is
placed into the wr_id field. When the process gets a Receive Work
Completion, it checks both the wr_id and imm_data for consistency with the
expected numbers, and posts the next Receive Work Request. The consistency
test eventually fails (after a few hundred thousand iterations - a few
seconds). The Completion status is "success", wr_id is out of order,
imm_data is in order. Despite the inconsistency, the process still tries to
post the next Receive Work Request, which fails as if the Receive Queue were
full (I modified libmthca's mthca_tavor_post_recv() to return distinct error
codes). All subsequent Receive Work Completions fail the consistency test,
and ibv_post_recv() keeps failing in the same manner. Then everything stalls,
waiting for Work Completions inside ibv_get_cq_event().
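The receive-side check described above can be sketched roughly as follows. This is not the code from flaw.c: the struct and function names are hypothetical stand-ins, the completion is modeled as a plain struct rather than a real struct ibv_wc, and imm_data is assumed to be already converted to host byte order and to carry only the 16 low bits of the sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two Receive Work Completion fields
 * that the test inspects. */
struct recv_completion {
    uint64_t wr_id;     /* number stored when the Receive WR was posted */
    uint32_t imm_data;  /* immediate data from the peer's RDMA Write
                           (assumed host byte order) */
};

/* Hypothetical checker state: the number the next completion should carry. */
struct recv_checker {
    uint64_t expected;
};

/* Returns true when both wr_id and imm_data match the expected number.
 * The expected number advances on every call, matched or not. */
static bool check_recv_completion(struct recv_checker *c,
                                  const struct recv_completion *wc)
{
    assert(c != NULL && wc != NULL);

    bool wr_id_ok = (wc->wr_id == c->expected);
    /* Assumption: imm_data holds the 16 least significant bits. */
    bool imm_ok = (wc->imm_data == (uint32_t)(c->expected & 0xffff));

    c->expected++;
    return wr_id_ok && imm_ok;
}
```

In the failure described here, a check of this kind would see imm_data pass and wr_id fail.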
I believe that, in this test, the wr_id values in Receive Work Completions
must arrive in order, but they do not.
I am sure that "queue overflow" failures of ibv_post_recv() are illegal
because I keep the queue no more than half-full.
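To illustrate the half-full argument, here is a small model of the Receive Queue accounting. It is only a sketch: the names are hypothetical, no verbs calls are made, and it assumes that exactly one Receive Work Request is reposted per completion, with a queue size of 8 and 4 requests kept outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed from the test description: queue size 8, 4 kept outstanding. */
enum { RQ_DEPTH = 8, RQ_TARGET = 4 };

/* Hypothetical model of a Receive Queue: counts only, no real hardware. */
struct rq_model {
    unsigned posted;     /* Receive WRs posted so far */
    unsigned completed;  /* Receive completions consumed so far */
};

static unsigned rq_outstanding(const struct rq_model *q)
{
    return q->posted - q->completed;
}

/* Refuses the post when the modeled queue is full. */
static bool rq_post(struct rq_model *q)
{
    if (rq_outstanding(q) >= RQ_DEPTH)
        return false;
    q->posted++;
    return true;
}

static void rq_complete(struct rq_model *q)
{
    assert(rq_outstanding(q) > 0);
    q->completed++;
}

/* Replays the posting pattern: prime RQ_TARGET requests, then repost one
 * per completion. Returns the peak occupancy, or RQ_DEPTH + 1 if any
 * post was refused. */
static unsigned rq_simulate(unsigned long iterations)
{
    struct rq_model q = { 0, 0 };

    for (unsigned i = 0; i < RQ_TARGET; i++)
        rq_post(&q);
    unsigned peak = rq_outstanding(&q);

    for (unsigned long i = 0; i < iterations; i++) {
        rq_complete(&q);
        if (!rq_post(&q))
            return RQ_DEPTH + 1;
        if (rq_outstanding(&q) > peak)
            peak = rq_outstanding(&q);
    }
    return peak;
}
```

Under these assumptions the modeled occupancy never rises above 4, so an overflow would have to come from outside this accounting.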
The test fails with libibverbs-1.0.5 (and older), libmthca-1.0.4 (and older).
~>uname -a
Linux node02 2.6.23.9 #1 SMP PREEMPT Fri Nov 30 21:23:11 CST 2007 x86_64
x86_64 x86_64 GNU/Linux
~>grep 'model name' /proc/cpuinfo
model name : Dual Core AMD Opteron(tm) Processor 285
model name : Dual Core AMD Opteron(tm) Processor 285
~>ibv_devinfo
hca_id: mthca0
fw_ver: 4.8.200
node_guid: 0002:c902:0024:42f4
sys_image_guid: 0002:c902:0024:42f7
vendor_id: 0x02c9
vendor_part_id: 25208
hw_ver: 0xA0
board_id: MT_0330140002
phys_port_cnt: 2
The attached source code was reduced from a huge application to a reasonable
size, so it looks weird. To compile it, run "gcc -O2 flaw.c -o flaw -lpthread
-libverbs". On one end run "sudo flaw"; on the other end run "sudo flaw
<hostname>", where hostname is the name of the first end. It fails much
sooner if you start another pair of processes.
This is a typical output of the program:
~>./flaw
device mthca0
completion thread runs
QP connected
55871
147789
241006
330033
421304
509184
595437
682410
779035
872444
964561
1051060
1138062
1224279
1311327
wr_id=00152d78, recv_wr_id_rsp=00152d76, imm.seqn=00002d76
wr_id=00152d79, recv_wr_id_rsp=00152d77, imm.seqn=00002d77
ibv_post_recv() failed, code=-2
wr_id=00152d79, recv_wr_id_rsp=00152d78, imm.seqn=00002d78
ibv_post_recv() failed, code=-2
wr_id=00152d79, recv_wr_id_rsp=00152d79, imm.seqn=00002d79
ibv_post_recv() failed, code=-2
wr_id=00152d7a, recv_wr_id_rsp=00152d7a, imm.seqn=00002d7a
ibv_post_recv() failed, code=-2
The rows of numbers show the iteration counter, printed once per second. In
this case the program made at least 1311327 successful iterations.
The iteration #1387894 (0x152d76) failed. In a Receive Work Completion,
wr_id was 00152d78, while 00152d76 was expected. The imm_data received was
good (it must be the 16 LSBs of recv_wr_id_rsp). The Completion caused the
next Receive Work Request to be successfully posted (since no error is printed).
The next iteration #1387895 (0x152d77) failed in a similar fashion. This
time the attempt to post the next Receive Work Request got a "work queue
overflow" error, which should be impossible because the queue size is 8 and
only 4 requests are kept outstanding.
All subsequent iterations failed similarly.
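For anyone who wants to check the arithmetic on the log lines, a small decoder along these lines may help. It is a sketch with hypothetical names; the only thing taken from the output above is the line format "wr_id=..., recv_wr_id_rsp=..., imm.seqn=..." with hexadecimal values.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record holding the three hex fields of one log line. */
struct log_record {
    uint32_t wr_id;     /* wr_id found in the Receive Work Completion */
    uint32_t expected;  /* recv_wr_id_rsp: the number that was expected */
    uint32_t imm_seqn;  /* sequence number carried in imm_data */
};

/* Parses one line of the program's error output.
 * Returns 0 on success, -1 if the line does not match the format. */
static int parse_log_line(const char *line, struct log_record *r)
{
    unsigned a, b, c;

    if (sscanf(line, "wr_id=%x, recv_wr_id_rsp=%x, imm.seqn=%x",
               &a, &b, &c) != 3)
        return -1;
    r->wr_id = a;
    r->expected = b;
    r->imm_seqn = c;
    return 0;
}

/* How far the completed wr_id is ahead of the expected one. */
static int64_t wr_id_lead(const struct log_record *r)
{
    return (int64_t)r->wr_id - (int64_t)r->expected;
}

/* Nonzero when imm.seqn equals the 16 LSBs of the expected number. */
static int imm_matches(const struct log_record *r)
{
    return r->imm_seqn == (r->expected & 0xffffu);
}
```

Fed the first error line, this gives an expected number of 1387894, a wr_id that is 2 ahead of it, and a matching imm.seqn.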
The peer process displays no errors.
I am new to libibverbs and it is possible that I am misusing it.
Thank you,
Roman Kononov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: flaw.c
Type: text/x-csrc
Size: 16687 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071204/cff19ca6/attachment.c>