[ofiwg] A question on FI_DELIVERY_COMPLETE
sean.hefty at intel.com
Mon Oct 26 16:40:44 PDT 2015
> FI_DELIVERY_COMPLETE is intended only to apply to the initiator of an
> [PG] I suspected as much.
> The generation of a notification at the target is assumed to occur after
> the operation has completed -- i.e. any transferred data is available.
> This holds whether the completion is an entry placed into a CQ, or a
> completion counter has been incremented.
> [PG] Using one of today's well-known networks as an example, there is no
> way to guarantee that data is actually visible to the responder before
> posting a completion to the CQ. I assume that for backward compatibility
> reasons we would want to maintain that behavior. That implies that it
> would be desirable to define some other behavior in which the responder
> side provider, through some internal mechanism, guarantees that the data
> is visible to the consumer before signaling the completion to the consumer
> (whether it is via a completion event or a counter increment). It would
> be the moral equivalent of FI_DELIVERY_COMPLETE, but on the responder
Which CQ are you referring to? If a completion is written at the target, then the data associated with it had better be visible to the target process. Otherwise the completion is meaningless. Consider a CQ entry for a received message: the entry is only useful if the message data can be read once the entry is seen.
> The FI_REMOTE_CQ_DATA flag is somewhat independent of this. That flag
> just means that application data was written into a CQ entry.
> [PG] I don't quite understand this. As I read it, remote cq data is the
> moral equivalent of immediate data, and FI_REMOTE_CQ_DATA is the mechanism
> that causes the requester to send immediate data. The presence of this
> Remote CQ Data, in turn, causes a completion event on the remote side,
> which might be either an event posted to the completion queue, or the
> increment of a counter. (In the case of IB, it also causes the consumption
> of a RECV WQE, but that isn't the case with libfabric.)
Nit: FI_REMOTE_CQ_DATA only applies to CQ entries, not counters.
With libfabric, the use of FI_REMOTE_CQ_DATA is not required in order to generate a completion at the target. For example, an RMA write operation can increment a completion counter or generate a CQ entry at the target without remote CQ data present. Similarly, an RMA read operation can increment a completion counter or generate a CQ entry. *If* FI_REMOTE_CQ_DATA is present, then a CQ entry will always be generated at the target for a successful operation. This is the behavior that applications requested.
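To make the rules above concrete, here is a minimal sketch of a target that receives RMA writes. It is a toy model only: every `toy_*` / `TOY_*` name is hypothetical and none of them is a libfabric symbol, and the two `*_reports_rma` booleans simply stand in for whatever the target has configured. It shows three things stated in this thread: the data is placed before any notification, remote CQ data always produces a CQ entry (never a counter-only event), and a notification can also be produced without remote CQ data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: these names are hypothetical, NOT libfabric symbols. */
enum { TOY_CQ_DEPTH = 8, TOY_BUF_SIZE = 64 };

struct toy_cq_entry {
    bool     has_remote_data;  /* models an entry carrying remote CQ data */
    uint64_t data;             /* the remote CQ data, 0 when absent */
};

struct toy_target {
    uint8_t             buf[TOY_BUF_SIZE];  /* target memory region */
    struct toy_cq_entry cq[TOY_CQ_DEPTH];   /* target completion queue */
    size_t              cq_count;
    uint64_t            counter;            /* target completion counter */
    bool                cq_reports_rma;     /* target wants CQ entries for RMA */
    bool                cntr_reports_rma;   /* target wants counter events for RMA */
};

/* Deliver one inbound RMA write to the toy target. */
void toy_rma_write(struct toy_target *t, const void *src, size_t len,
                   bool has_data, uint64_t data)
{
    assert(len <= TOY_BUF_SIZE);

    /* Step 1: place the data. Notification must not precede this. */
    memcpy(t->buf, src, len);

    /* Step 2: notify. The counter only follows the target's own setting. */
    if (t->cntr_reports_rma)
        t->counter++;

    /* Remote CQ data forces a CQ entry; otherwise the target's setting decides. */
    if (has_data || t->cq_reports_rma) {
        assert(t->cq_count < TOY_CQ_DEPTH);
        t->cq[t->cq_count++] =
            (struct toy_cq_entry){ has_data, has_data ? data : 0 };
    }
}
```

With a zero-initialized `struct toy_target`, a write without remote CQ data leaves both the CQ and the counter untouched, while a write carrying remote CQ data adds exactly one CQ entry.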
> Although IB cannot generate target notification without FI_REMOTE_CQ_DATA
> (i.e. immediate data), libfabric does not require this.
> [PG] Other than a subsequent send message, how can libfabric generate a
> notification on the target side other than using FI_REMOTE_CQ_DATA?
This is provider specific. But there's nothing special about generating a CQ entry. For example, I can't think of any reason why IB hardware couldn't easily be adapted to generate a CQ entry in response to receiving an RMA write operation (without immediate data), other than that the spec doesn't define it.
> The generation of a completion entry at the target is independent of the
> completion mode selected by the initiator.
> [PG] Agreed. I am suggesting a mechanism that controls the generation of
> a completion entry at the responder side.
The target side controls whether a completion entry is generated through the use of completion flags (e.g. FI_REMOTE_WRITE, FI_REMOTE_READ) when binding a CQ or counter to an endpoint. There is not the same level of 'completion modes' on the target side as there is at the initiator. A completion entry indicates that the operation is done. There is no notification for operations that are in progress.
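The flag-based selection described above can be sketched as follows. This is an illustration under stated assumptions, not provider code: the `toy_*` names and the `TOY_REMOTE_*` bit values are hypothetical stand-ins for the real binding call and for FI_REMOTE_READ / FI_REMOTE_WRITE, and only a counter is modeled. The point is that the target opts in per operation type at bind time, and that the event fires once, when the operation is done.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; NOT the libfabric constants. */
#define TOY_REMOTE_READ  (1u << 0)
#define TOY_REMOTE_WRITE (1u << 1)

enum toy_op { TOY_OP_REMOTE_READ, TOY_OP_REMOTE_WRITE };

struct toy_cntr { uint64_t value; };

struct toy_ep {
    struct toy_cntr *cntr;        /* counter bound to this endpoint, if any */
    uint32_t         cntr_flags;  /* which inbound operations it counts */
};

/* Bind a counter to the endpoint for the selected inbound operation types. */
void toy_ep_bind_cntr(struct toy_ep *ep, struct toy_cntr *cntr, uint32_t flags)
{
    assert((flags & ~(TOY_REMOTE_READ | TOY_REMOTE_WRITE)) == 0);
    ep->cntr = cntr;
    ep->cntr_flags = flags;
}

/*
 * Called by the toy "provider" only after an inbound operation has fully
 * completed. There is deliberately no hook for in-progress operations.
 */
void toy_ep_op_done(struct toy_ep *ep, enum toy_op op)
{
    uint32_t need = (op == TOY_OP_REMOTE_WRITE) ? TOY_REMOTE_WRITE
                                                : TOY_REMOTE_READ;
    if (ep->cntr && (ep->cntr_flags & need))
        ep->cntr->value++;
}
```

A counter bound with only `TOY_REMOTE_WRITE` stays at 0 after an inbound read completes and goes to 1 after an inbound write completes.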