[libfabric-users] multi-receive CQ entry with FI_MULTI_RECV, not the last one read by fi_cq_read()?
gregory.titus at hpe.com
Wed Jun 24 12:19:59 PDT 2020
Hi Sean --
I sent a follow-up this morning, which you may not have seen yet. In that I hypothesized that this was due to the CQ entry ordering not being the same as the message ordering in the buffer because I'd thrown FI_ORDER_SAS on both the sending and receiving endpoints, and the FI_MULTI_RECV CQ entry was effectively in buffer order rather than CQ order.
To answer your other questions, this was with the 1.10.1 release tarball and I'm only using one multi-receive buffer at a time.
From: Hefty, Sean <sean.hefty at intel.com>
Sent: Wednesday, June 24, 2020 1:04 PM
To: Titus, Greg <gregory.titus at hpe.com>; libfabric-users at lists.openfabrics.org <libfabric-users at lists.openfabrics.org>
Subject: RE: multi-receive CQ entry with FI_MULTI_RECV, not the last one read by fi_cq_read()?
> > I post a multi-receive buffer via fi_recvmsg(..., FI_MULTI_RECV) and then process it
> > reading CQ entries which refer to the messages landing there. When I see a CQ entry
> > with FI_MULTI_RECV in its flags I re-post that same multi-receive buffer with
> > fi_recvmsg() again. Based on reading the man pages, I've put an assertion in my CQ
> > entry processing that if an entry has FI_MULTI_RECV in its flags, that entry must be
> > the last one my fi_cq_read() read. Essentially, this assertion is to confirm my
> > understanding that nothing can be placed in the multi-receive buffer after the
> > provider releases it.
> This should be the case. What version of the library are you using?
> Are you posting more than 1 multi-recv buffer? There were issues in older code where
> providers were reposting multi-recv buffers to the end of the receive queue. So, if
> there were 2 buffers posted, received data would intermingle between them. (Data
> ordering was unaffected).
In addition, there was an issue in the multi-recv posting path not handling unexpected messages correctly. I don't remember which version of libfabric corrected the above issues, but they should be fixed in the upstream code.
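
For anyone following along, the check discussed above can be sketched roughly as follows. This is only an illustration, not code from the application or from libfabric: the struct, the SKETCH_MULTI_RECV constant, and find_release_entry() are hypothetical stand-ins so the snippet compiles on its own. A real program would use the CQ entry type and the FI_MULTI_RECV flag from <rdma/fabric.h>, with the batch filled in by fi_cq_read().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for FI_MULTI_RECV; the value is illustrative only.
 * Real code takes the flag from <rdma/fabric.h>. */
#define SKETCH_MULTI_RECV (1ULL << 16)

/* Hypothetical, trimmed-down stand-in for one CQ entry. */
struct sketch_cq_entry {
    uint64_t flags; /* completion flags reported for this entry */
    size_t   len;   /* number of bytes received for this message */
};

/*
 * Scan one batch of completions (as one CQ read might return them).
 *
 * Returns the index of the first entry whose flags carry the multi-recv
 * "buffer released" bit, or -1 if no entry in the batch carries it.
 * *is_last is set to 1 only when that entry is the final one in the batch,
 * which is the condition the assertion in this thread was checking.
 */
static int find_release_entry(const struct sketch_cq_entry *entries,
                              size_t count, int *is_last)
{
    assert(entries != NULL || count == 0);
    assert(is_last != NULL);

    *is_last = 0;
    for (size_t i = 0; i < count; i++) {
        if (entries[i].flags & SKETCH_MULTI_RECV) {
            *is_last = (i == count - 1);
            return (int)i;
        }
    }
    return -1;
}
```

With something like this, a caller could repost the buffer whenever the returned index is non-negative and treat *is_last == 0 as a diagnostic (entries reported after the release entry in the same batch) rather than a hard assertion, at least until the ordering question in this thread is settled.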