[ofiwg] libfabric network atomic operations, processor atomic ops, and coherence
Hefty, Sean
sean.hefty at intel.com
Tue Feb 26 14:09:13 PST 2019
> > There is no guarantee that NIC/network based atomics will be coherent with
> > CPU based atomics, or that they will be coherent between NICs, or the final
> > result will even be atomic. [...]
>
> Would I be correct in reading that last clause as “or [that] the [visibility
> of the] final result will even be atomic”, meaning that visibility of one
> subpart (byte, for example) of the result should not be taken as evidence of
> visibility of the whole? Or in other words, that the paragraph about CPU
> visibility provides the only guarantees of visibility. (So if tearing occurs,
> it will have been dealt with before that guaranteed-visible point.)
I wasn't trying to write spec language. :)
The point I was making above was related to data correctness. If two or more 'actors' are performing atomic operations on the same target memory at the same time, the result is undefined. An actor can be a NIC or a CPU.
For example, an atomic through NIC A adds 1 to each element, and an atomic through NIC B subtracts 1 from each element. Each element may end up unchanged, incremented by 1, or decremented by 1. And the change may not be the same for each element -- some may be +1, some -1, some unchanged.
The visibility discussion in the man page is describing when another actor can see the results of an atomic operation. Before a second actor can perform atomic operations on a target region, the results of the first actor must first be visible. This would ensure that the target region is updated in a consistent manner.
- Sean