[libfabric-users] libfabric transaction ordering w.r.t. Chapel memory consistency model

Tue Feb 11 10:11:43 PST 2020

> 	Are you wanting the initiator or the target to know that the data is visible?
> For the target, verbs indicates that the data is visible when a completion is read at
> the target for an operation that followed the write, or if the completion is for the
> write itself (i.e. carries CQ data).
> 
> 
> The initiator.  The requirement here is that a task (Chapel's instance of sequential
> execution) must observe the results of its own regular reads and writes to the same
> address to have occurred in execution order.  I.e., when a task writes to some location
> and then reads from the same location, the value it reads must be the one that was
> written.  Similarly, when a task reads from and then writes to a location, the value
> read must be what the location held before the write.  Or more succinctly, within a
> single task, regular reads and writes to the same location cannot be reordered.  (This
> is for data-race-free programs, so assume no other task is referencing this same
> location during this period.)

This is really an ordering requirement, not a visibility one.  It does sound like you do need rxm to add support for fencing.  That should be possible, eventually...
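
To make the ordering requirement concrete, here is a minimal sketch in C of the rule a runtime would have to enforce within one task. It is not libfabric code: the `mem_op` descriptor and the `must_stay_ordered()` helper are hypothetical names introduced only for illustration. It assumes that two accesses conflict when their byte ranges overlap and at least one is a write (the RAW, WAR and WAW cases); read-after-read is treated as freely reorderable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one regular (non-atomic) access issued by a task. */
enum op_kind { OP_READ, OP_WRITE };

struct mem_op {
    enum op_kind kind;
    uint64_t addr;  /* start address at the target */
    size_t len;     /* length in bytes, must be > 0 */
};

/* True when the two byte ranges share at least one byte.
 * Assumes addr + len does not wrap around. */
bool ranges_overlap(const struct mem_op *a, const struct mem_op *b)
{
    assert(a->len > 0 && b->len > 0);
    return a->addr < b->addr + b->len && b->addr < a->addr + a->len;
}

/* True when 'next' must not be reordered ahead of 'prev', where both were
 * issued by the same task in that order: the ranges overlap and at least
 * one of the two accesses is a write (RAW, WAR or WAW). */
bool must_stay_ordered(const struct mem_op *prev, const struct mem_op *next)
{
    if (!ranges_overlap(prev, next))
        return false;
    return prev->kind == OP_WRITE || next->kind == OP_WRITE;
}
```

A runtime could use a check like this to decide when it has to wait for, or fence behind, an earlier outstanding operation; accesses that do not conflict stay free to be reordered.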

> 	> ... I also need to ensure that when a single task does an atomic op followed by a
> 	> regular load or store, the effect of the atomic op on its target object is seen
> 	> before the load or store references memory.
> 
> 
> 	ORDER_WAW orders both atomic updates and RMA write operations against each other.
> ORDER_ATOMIC_WAW and ORDER_RMA_WAW allow specifying those separately.  It sounds like
> ORDER_WAW (etc.) is what you want.
> 
> 
> Beyond saying that it doesn't support FI_FENCE (as discussed below), the fi_rxm man
> page also says that if FI_ATOMIC is specified in the hints capabilities,
> FI_ORDER_{RAR,RAW,WAR,WAW,SAR,SAW} support is disabled.  It also doesn't include
> FI_ATOMIC in the capabilities unless you specifically request it, which may well be
> because of this limitation.  So I'm pretty sure I'm going to be using processor atomics
> done via Active Messages for remote atomic ops with ofi_rxm;verbs.

Unfortunately, verbs devices have extremely limited support for atomic operations, so nearly all of them must be implemented by the CPU.  And trying to keep ordering semantics between RMA and atomics would mean using the CPU, rather than the NIC, for RMA.

> Thanks for all the feedback, Sean!  Not all the answers make me happy from a
> performance point of view, but at least it doesn't sound like I missed any better ways
> of doing things than the ones I'd come up with.

What I would look at is whether there is a better-performing option over verbs devices, given the semantics that you need.  I don't think there is one in general.  Some optimizations may be possible if you know the traffic mix between atomics and RMA.

You need to fence writes after reads to get WAR ordering.  Verbs HW can do this.  But falling back to CPU atomics would require a software-based fence.  IMO, it would be best to push that down into libfabric, so that Chapel isn't designed around a specific NIC implementation.  I don't have a good answer for maintaining ordering between HW-based RMA and CPU atomics.  If the traffic mix is heavy on atomics, or the RMA transfers are usually small, software-based RMA may end up being preferred.
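
As a rough illustration of what a software-based fence for WAR ordering might look like, here is a toy, single-threaded model in C. It is only a sketch under stated assumptions: the `sw_fence` tracker and its functions are hypothetical and do not correspond to anything in libfabric or rxm. It assumes the caller learns about read completions (for example from a completion queue) and reports them to the tracker, and that at most one write is deferred at a time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task tracker; none of these names come from libfabric. */
struct sw_fence {
    unsigned reads_in_flight;  /* reads issued but not yet completed */
    bool write_deferred;       /* a fenced write is waiting for reads to drain */
};

void sw_fence_init(struct sw_fence *f)
{
    f->reads_in_flight = 0;
    f->write_deferred = false;
}

/* Record that a read has been issued. */
void sw_read_issued(struct sw_fence *f)
{
    f->reads_in_flight++;
}

/* Ask whether a fenced write may be issued now. If earlier reads are still
 * outstanding, the write is marked as deferred and false is returned. */
bool sw_write_may_issue(struct sw_fence *f)
{
    if (f->reads_in_flight == 0)
        return true;
    f->write_deferred = true;
    return false;
}

/* Record a read completion. Returns true when this completion drained the
 * last outstanding read and a deferred write can now be issued. */
bool sw_read_completed(struct sw_fence *f)
{
    assert(f->reads_in_flight > 0);
    f->reads_in_flight--;
    if (f->reads_in_flight == 0 && f->write_deferred) {
        f->write_deferred = false;
        return true;
    }
    return false;
}
```

In this model a write following two outstanding reads is held back until both completions have been reported, which is the WAR guarantee described above. It says nothing about ordering between HW-based RMA and CPU atomics, which remains the open question.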

- Sean