[ofiwg] completion flags as actually defined by OFI
jgunthorpe at obsidianresearch.com
Thu Apr 16 16:10:30 PDT 2015
On Thu, Apr 16, 2015 at 10:23:35PM +0000, Hefty, Sean wrote:
> > > *FI_DELIVERY_COMPLETE*
> > > : Indicates that a completion should not be generated until an operation
> > > has been processed by the destination endpoint(s).
> > > A completion guarantees that the targets have visibility into the
> > > results of the operation.
> > Suggest:
> > 'the results of the operation are visible to all observers, including
> > the target's CPU'
> Part of the semantic that needs to be captured is that the target
> process _may_ need to take additional steps in order to access the
TRANSMIT_COMPLETE means the data is guaranteed to be delivered, but
isn't visible to the target CPU+application.
The only improvement DELIVERY_COMPLETE can make on that guarantee is
to also guarantee that the data is visible to the target CPU.
Any other difference between the two is a hidden implementation detail
an application cannot possibly care about.
Another way to phrase it, that you might like better:
visible == the data has been transferred across the
provider/application boundary and is guaranteed to be visible to the
> For example, the process may need to provide a buffer that the data
> can be placed into. I was intentionally vague on the meaning of
> 'visible', because that is provider/operation/target specific.
I would argue if an application buffer is not immediately available
to put the data into, then DELIVERY_COMPLETE is not a possible
completion semantic for that transfer.
> may be possible even with current HW through the protocol, such as
> following a write by a read.
Sure, libfabric could detect that the peer treats a write followed by
a read as synchronizing with the CPU, and provide
DELIVERY_COMPLETE semantics that way.
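A rough sketch of that fallback, to make the idea concrete. Everything
here is hypothetical: the names (peer_caps, extra_reads_needed, the
LEVEL_* values) are made up for illustration and are not libfabric
API. It only models the decision of whether a flushing read has to be
appended after the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion levels, modelled on the two flags discussed here. */
enum completion_level {
    LEVEL_TRANSMIT_COMPLETE, /* delivered, not necessarily CPU visible */
    LEVEL_DELIVERY_COMPLETE  /* delivered and visible to the target CPU */
};

/* Hypothetical record of what the provider has learned about a peer. */
struct peer_caps {
    bool native_delivery_complete;   /* HW reports CPU visibility itself */
    bool read_after_write_syncs_cpu; /* a read flushes earlier writes to the CPU */
};

/*
 * Returns how many extra flushing reads must follow a write to reach
 * the requested level, or -1 if the level cannot be offered at all.
 */
int extra_reads_needed(const struct peer_caps *caps, enum completion_level want)
{
    assert(caps != NULL);

    if (want == LEVEL_TRANSMIT_COMPLETE)
        return 0;                 /* nothing beyond the write itself */
    if (caps->native_delivery_complete)
        return 0;                 /* hardware already gives the guarantee */
    if (caps->read_after_write_syncs_cpu)
        return 1;                 /* emulate: write, then one read */
    return -1;                    /* DELIVERY_COMPLETE is not possible */
}
```

With this model, a peer that only supports the write, read trick costs
one extra round trip per DELIVERY_COMPLETE operation, and a peer that
supports neither simply cannot be offered that semantic.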
> This isn't attempting to make any claims regarding atomic coherency.
> It's attempting to say that the results are visible at both the
> initiator and target. For example, the completion of a
> compare-swap-fetch operation is now accessible by the source and
> destination. See my comment above about what it means to be
Eh? Atomic coherency is directly connected to the concept of
visibility. If A changes a value and B cannot see it immediately then
A&B are not coherent.
I would use this completion semantic to indicate coherent or
incoherent atomics are supported. DELIVERY_COMPLETE (cpu visible) vs
TRANSMIT_COMPLETE (!cpu visible) is exactly the semantic difference
between CPU coherent and CPU incoherent atomics.
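Put as code, the mapping I am proposing is about this simple. This is
only a sketch of the proposal, not something libfabric defines: the
enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two completion semantics in this thread. */
enum atomic_completion {
    ATOMIC_TRANSMIT_COMPLETE, /* result delivered, not CPU visible */
    ATOMIC_DELIVERY_COMPLETE  /* result delivered and CPU visible */
};

/*
 * Returns true when atomics completed at this level can be treated as
 * coherent with the target CPU, i.e. the CPU sees the new value as
 * soon as the completion is reported.
 */
bool atomics_cpu_coherent(enum atomic_completion level)
{
    assert(level == ATOMIC_TRANSMIT_COMPLETE ||
           level == ATOMIC_DELIVERY_COMPLETE);
    return level == ATOMIC_DELIVERY_COMPLETE;
}
```

An application that mixes CPU loads/stores with remote atomics on the
same location would require atomics_cpu_coherent() to hold; one that
only ever touches the location through the fabric would not care.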