[ofiwg] libfabric network atomic operations, processor atomic ops, and coherence

Hefty, Sean sean.hefty at intel.com
Tue Feb 26 14:52:19 PST 2019


> > The point I was making above was related to data correctness.  If 2 or more
> > 'actors' are both performing atomic operations on the same target memory, the
> > result is undefined.  An actor can be a NIC or CPU.
> >
> > E.g. An atomic through NIC A adds 1 to each element.  An atomic through NIC
> > B subtracts 1 from each element.  The results may end up with each element
> > unchanged, incremented by 1, or decremented by 1.  And the change may not be
> > the same for each element -- some may be +1, some -1, some unchanged.
> 
> Okay, I can work with that.  The model I have in my head is that the
> operations on a given location are divided into epochs, where within each
> epoch all the atomic ops are done by just one actor.  To switch from one epoch
> to another, and thus from one actor to another, the completions for all the
> ops in the old epoch have to have been seen before the first op in the new
> epoch is initiated.  (Where “before” has all the usual caveats and limitations
> with respect to parallel programming.)  No prob.

Yes, this is the intended model to support.  I'm actively updating the man pages to reflect this and will open a new pull request.
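
As a rough illustration of the quoted example and the epoch model, here is a toy simulation of a single element. It is not libfabric code. The names (run, mixed_outcomes, epoch_outcomes) are made up for this sketch, and modeling each atomic as a separate read step and write step that is indivisible only with respect to its own actor is an assumption, not a description of any particular NIC or CPU.

```python
from itertools import permutations

# Hypothetical actors: A adds 1, B subtracts 1 (as in the quoted example).
DELTA = {"A": +1, "B": -1}
EVENTS = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]


def run(schedule, start=0):
    """Apply a sequence of (actor, step) events to one memory element."""
    mem = start
    latched = {}  # value each actor read and has not yet written back
    for actor, step in schedule:
        if step == "read":
            latched[actor] = mem
        else:
            mem = latched[actor] + DELTA[actor]
    return mem


def is_valid(schedule):
    """Each actor must read before it writes."""
    return all(
        schedule.index((a, "read")) < schedule.index((a, "write")) for a in DELTA
    )


def mixed_outcomes():
    """Final values when the two actors' steps may interleave freely."""
    return {run(s) for s in permutations(EVENTS) if is_valid(s)}


def epoch_outcomes():
    """Final values when one actor's op completes before the other's starts."""
    a_ops, b_ops = EVENTS[:2], EVENTS[2:]
    return {run(a_ops + b_ops), run(b_ops + a_ops)}


if __name__ == "__main__":
    print("uncoordinated actors:", sorted(mixed_outcomes()))
    print("one actor per epoch: ", sorted(epoch_outcomes()))
```

Under these assumptions, the uncoordinated case can leave the element unchanged, incremented by 1, or decremented by 1, while the epoch ordering always leaves it unchanged.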

- Sean

