[ofiwg] the problems with atomics

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Tue Jul 7 16:01:40 PDT 2015

On Tue, Jul 07, 2015 at 09:45:52PM +0000, Hefty, Sean wrote:

> > There is no implicit conversion when doing a RDMA READ/WRITE or SEND,
> > so the app is going to be exposed to all of the differences if it
> > wants to run homogeneously.
> RDMA and sends only deal with bytes.  They don't attempt to
> interpret data at the remote side.

As far as I see it, there are only three reasonable choices for atomics:
 1) They work on data in remote memory in a highly specific
    format (ie 32 bit unsigned integer in little endian)
 2) They work on data in remote memory the same way as the local app
    on local memory
 3) They work on data in remote memory in the same way as the remote
    app on its local memory

#1 is the basic building block of the other two options.

#2 can be done automatically by having libfabric detect the type set
of the local app and map things like FI_UINT64 to FI_LITTLE_UINT64
(for instance)

#3 can be done by the app by exchanging information outside of
libfabric, eg a specification may say that memory X is laid out using
64 bit big endian integers, when you RDMA READ/WRITE or ATOMIC it then
you always use BE. Or, eg, the app exchanges info about the peer at
run time.

Then on top of that, you have to answer the question: what is placed
in the local buffer?

#1 would have definitions like
[Be aware, there is an insane amount of variation for floating point
 binary representation, ARM has some exciting wackiness even for 32
 bits as well]
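
A minimal sketch of what fixed-format definitions for #1 could look like, assuming integer types only. FI_LITTLE_UINT64 and FI_BIG_UINT64 are the names used in this thread; the 32 bit variants, the enum name and the helper functions are hypothetical and are not assumed to exist in libfabric.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-format datatypes: each value names one exact
 * byte layout, independent of the host that reads or writes it. */
enum fixed_datatype {
    FI_LITTLE_UINT32,
    FI_LITTLE_UINT64,
    FI_BIG_UINT32,
    FI_BIG_UINT64,
};

/* Size in bytes of a fixed-format value. */
size_t fixed_size(enum fixed_datatype t)
{
    switch (t) {
    case FI_LITTLE_UINT32:
    case FI_BIG_UINT32:
        return 4;
    case FI_LITTLE_UINT64:
    case FI_BIG_UINT64:
        return 8;
    }
    return 0;
}

/* Non-zero when the most significant byte is stored first. */
int fixed_is_big(enum fixed_datatype t)
{
    return t == FI_BIG_UINT32 || t == FI_BIG_UINT64;
}

/* Write v into buf using exactly the layout named by t. */
void fixed_store(enum fixed_datatype t, uint64_t v, unsigned char *buf)
{
    size_t n = fixed_size(t);
    for (size_t i = 0; i < n; i++) {
        size_t pos = fixed_is_big(t) ? n - 1 - i : i;
        buf[pos] = (unsigned char)(v >> (8 * i));
    }
}

/* Read a value stored in the layout named by t. */
uint64_t fixed_load(enum fixed_datatype t, const unsigned char *buf)
{
    size_t n = fixed_size(t);
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) {
        size_t pos = fixed_is_big(t) ? n - 1 - i : i;
        v |= (uint64_t)buf[pos] << (8 * i);
    }
    return v;
}
```

Because the helpers only shift and mask, they give the same bytes on any host, which is the property #1 is after.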

#2 would have definitions like

Apps are simple: when you use an atomic and want #2 semantics, you use
the C type name.
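
As a rough sketch of the #2 mapping described earlier (a generic name such as FI_UINT64 resolving to the fixed format that matches the local app), the resolution could be done with a byte-order probe. The enum and both function names here are hypothetical, and only 64 bit unsigned integers are covered.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-format names, as used in this thread. */
enum fixed_datatype { FI_LITTLE_UINT64, FI_BIG_UINT64 };

/* Returns 1 when the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* The fixed format that a generic FI_UINT64 would resolve to locally. */
enum fixed_datatype resolve_native_uint64(void)
{
    return host_is_little_endian() ? FI_LITTLE_UINT64 : FI_BIG_UINT64;
}
```

A real implementation would more likely settle this at build time; the runtime probe just keeps the sketch free of compiler-specific macros.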

You may need funky macros:


To cover off compiler flag variations.
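
One possible shape for such macros, as a sketch only: the width of `unsigned long` changes with flags like -m32 and -m64, so the mapping from the C type name to a sized type is picked by the preprocessor. FI_ULONG is a hypothetical name, and the enum below is a local stand-in so the fragment compiles without libfabric headers.

```c
#include <limits.h>

/* Local stand-in for the sized datatype names. */
enum sized_datatype { FI_UINT32, FI_UINT64 };

/* Pick the sized type that matches what "unsigned long" means under
 * the current compiler flags. FI_ULONG is a hypothetical name. */
#if ULONG_MAX == 0xFFFFFFFFUL
#define FI_ULONG FI_UINT32
#elif ULONG_MAX == 0xFFFFFFFFFFFFFFFFUL
#define FI_ULONG FI_UINT64
#else
#error "unsupported width for unsigned long"
#endif
```

The same pattern would be repeated for each C type whose size depends on the ABI.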

#3 is harder because you also want to implicitly do type conversions
along the way, eg: the remote may be FI_LITTLE_UINT64, but the local
may be FI_BIG_UINT64. So, an app would naturally want to say:


And have the two NICs do the swapping in hardware.
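
A software model of that conversion, for illustration only: when the remote and local formats differ, the value is byte swapped, otherwise it passes through unchanged. The function names are hypothetical, and the hardware path described above would replace this code rather than call it.

```c
#include <stdint.h>

/* Hypothetical fixed-format names, as used in this thread. */
enum fixed_datatype { FI_LITTLE_UINT64, FI_BIG_UINT64 };

/* Reverse the byte order of a 64 bit value. */
uint64_t swap_bytes64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Convert a raw 64 bit value from one fixed format to another.
 * Only two formats exist here, so a mismatch always means a swap. */
uint64_t convert_uint64(uint64_t raw, enum fixed_datatype from,
                        enum fixed_datatype to)
{
    return from == to ? raw : swap_bytes64(raw);
}
```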

Take the same idea and extend it to floats,

#3 becomes:

#2 becomes:
  long double buf;

#1 is fairly unusable, but becomes:
  ??? buf
  long double usable = convert_ppc_to_local(&buf);

And now you have a sane, self-consistent API.
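
To make the #1 float case concrete, here is a hedged sketch of what a helper like convert_ppc_to_local() might do. It assumes the remote format is the big endian PowerPC double-double long double (two IEEE 754 doubles, high part first, whose sum is the value) and that the local `double` is IEEE 754 binary64 with the same byte order as local integers. The byte-array parameter and load_be_double() are choices made for this sketch, not something specified in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Decode one big endian IEEE 754 binary64 value from 8 bytes.
 * Assumes the local double is binary64. */
double load_be_double(const unsigned char *p)
{
    uint64_t bits = 0;
    double d;
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | p[i];
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Assumed remote layout: 16 bytes holding two big endian doubles,
 * high part first. The result is their sum in the local long double,
 * which may lose precision if the local type is narrower. */
long double convert_ppc_to_local(const unsigned char buf[16])
{
    return (long double)load_be_double(buf) +
           (long double)load_be_double(buf + 8);
}
```

The caller has to carry the raw 16 bytes around and convert by hand before doing any arithmetic, which is why this option is awkward to use.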

