[ofiwg] the problems with atomics
jgunthorpe at obsidianresearch.com
Tue Jul 7 16:01:40 PDT 2015
On Tue, Jul 07, 2015 at 09:45:52PM +0000, Hefty, Sean wrote:
> > There is no implicit conversion when doing a RDMA READ/WRITE or SEND,
> > so the app is going to be exposed to all of the differences if it
> > wants to run homogeneously.
> RDMA and sends only deal with bytes. They don't attempt to
> interpret data at the remote side.
As far as I see it, there are only three reasonable choices for atomics:
1) They work on data in remote memory in a highly specific
format (ie 32 bit unsigned integer in little endian)
2) They work on data in remote memory the same way as the local app
on local memory
3) They work on data in remote memory in the same way as the remote
app on its local memory
#1 is the basic building block of the other two options.
#2 can be done automatically by having libfabric detect the type set
of the local app and map things like FI_UINT64 to FI_LITTLE_UINT64
#3 can be done by the app by exchanging information outside of
libfabric, eg a specification may say that memory X is laid out using
64 bit big endian integers, when you RDMA READ/WRITE or ATOMIC it then
you always use BE. Or, eg, the app exchanges info about the peer at
Then on top of that, you have to answer the question: what is placed
in the local buffer?
#1 would have definitions like
[Be aware, there is an insane amount of variation for floating point
binary representation, ARM has some exciting wackiness even for 32
bits as well]
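
A rough sketch of what #1 style definitions could look like. Only
FI_LITTLE_UINT64 and FI_BIG_UINT64 appear in this mail; the other names,
the enum itself and fi_wire_type_size() are made up for illustration and
are not taken from the libfabric headers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wire-format datatypes: each name pins down width,
 * signedness and byte order, independent of either host. */
enum fi_wire_type {
	FI_LITTLE_UINT32,
	FI_BIG_UINT32,
	FI_LITTLE_UINT64,
	FI_BIG_UINT64,
	FI_LITTLE_IEEE754_BINARY64,	/* hypothetical float format name */
	FI_BIG_IEEE754_BINARY64,	/* hypothetical float format name */
};

/* Size in bytes of one element of the given wire format. */
static size_t fi_wire_type_size(enum fi_wire_type t)
{
	switch (t) {
	case FI_LITTLE_UINT32:
	case FI_BIG_UINT32:
		return 4;
	case FI_LITTLE_UINT64:
	case FI_BIG_UINT64:
	case FI_LITTLE_IEEE754_BINARY64:
	case FI_BIG_IEEE754_BINARY64:
		return 8;
	}
	assert(!"unknown wire type");
	return 0;
}
```

The point is that nothing in these names depends on which machine the
code is compiled for.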
#2 would have definitions like
Apps are simple: when you use an atomic and want #2 semantics, you use
the C type name.
You may need funky macros:
  #define FI_LONG_DOUBLE (sizeof(long double) == XX ? FI_LONG_DOUBLE_80BITS_ALIGNED_128 : FI_LONG_DOUBLE_80BITS_ALIGNED_96)
To cover off compiler flag variations.
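
For the integer case, the #2 mapping of the C type name onto a fixed
format might be sketched as below. This assumes a GCC or Clang style
compiler that predefines __BYTE_ORDER__, and it falls back to little
endian when that macro is missing, which is an assumption and not
something a real header should do silently. The enum and both helper
functions are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-format identifiers (the #1 building blocks). */
enum fi_wire_type { FI_LITTLE_UINT64, FI_BIG_UINT64 };

/* Map the C type name onto the host's own format at compile time.
 * __BYTE_ORDER__ is a GCC/Clang predefined macro; other compilers
 * would need a different test. Little endian is assumed otherwise. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define FI_UINT64 FI_BIG_UINT64
#else
#define FI_UINT64 FI_LITTLE_UINT64
#endif

/* Runtime probe of the same property: look at the first byte of 1. */
static enum fi_wire_type host_uint64_format(void)
{
	const uint64_t probe = 1;
	const unsigned char *p = (const unsigned char *)&probe;

	return p[0] == 1 ? FI_LITTLE_UINT64 : FI_BIG_UINT64;
}

/* Sanity check that the compile-time mapping matches the host. */
static void fi_check_mapping(void)
{
	assert(FI_UINT64 == host_uint64_format());
}
```

An app asking for #2 semantics would only ever write FI_UINT64 and never
see the fixed-format name underneath.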
#3 is harder because you also want to implicitly do type conversions
along the way, eg: the remote may be FI_LITTLE_UINT64, but the local
may be FI_BIG_UINT64. So, an app would naturally want to say:
And have the two NICs do the swapping in hardware.
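
To make the conversion concrete, here is a software stand-in for a
fetch-and-add whose target memory is FI_BIG_UINT64 while the operand and
the returned value are in the local host format. It is not atomic and
does not call into libfabric at all; the function names are invented,
and it only models the byte swapping the hardware would be asked to do:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 64-bit unsigned integer stored in big endian byte order,
 * regardless of the host's own endianness. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Store a host-format value as 8 big endian bytes. */
static void store_be64(unsigned char *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/* Hypothetical emulation: remote is big endian, operand and result
 * are host format. Only the conversion is modelled, not atomicity. */
static uint64_t emulated_fetch_add_be64(unsigned char *remote,
					uint64_t operand)
{
	assert(remote != NULL);

	uint64_t old = load_be64(remote);

	store_be64(remote, old + operand);
	return old;
}
```

The local side never touches a big endian value directly; the swap sits
entirely inside the operation.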
Take the same idea and extend it to floats,

  long double buf;

#1 is fairly unusable, but becomes:

  long double usable = convert_ppc_to_local(&buf);

And now you have a sane self consistent API.
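
For illustration only, a convert_ppc_to_local() helper could look
something like the sketch below. It assumes the PPC long double is the
IBM double-double format (two big endian IEEE 754 binary64 values whose
sum is the number) and that the local double is binary64 with the same
byte order as the local 64 bit integers. The signature and the
load_be_double() helper are guesses, not an existing API:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read one big endian IEEE 754 binary64 value into a host double.
 * Assumes the host double is binary64. */
static double load_be_double(const unsigned char *p)
{
	uint64_t bits = 0;
	double d;

	for (int i = 0; i < 8; i++)
		bits = (bits << 8) | p[i];
	memcpy(&d, &bits, sizeof d);
	return d;
}

/* Hypothetical helper: buf holds 16 bytes of PPC double-double
 * (high part first); the value is the sum of the two parts. */
static long double convert_ppc_to_local(const void *buf)
{
	const unsigned char *p = buf;

	assert(buf != NULL);
	return (long double)load_be_double(p) +
	       (long double)load_be_double(p + 8);
}
```

On a host whose long double has less precision than the double-double
pair, the sum is rounded, so this is lossy in general.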