[libfabric-users] fi_read questions

Hefty, Sean sean.hefty at intel.com
Tue Oct 20 11:57:50 PDT 2020

> First of all I think I have to explain something. In the program I build
> one connection for message-based communication and another for RMA-based
> communication, basically to be able to use a different provider for each
> communication type later (both allocate their own fabric, domain, eq and
> cqs, so no complications there).

This can be fine.  But there are no ordering guarantees between different endpoints.  So if you issue an RMA on one, then send a message on another which indicates that the RMA has been transferred, it's possible for the message to arrive before the RMA has completed.

> Both endpoints have as dest-address.
> Since I am using sockets provider for testing at the moment and message
> based communication worked on that endpoint, I tried to call the
> endpoint for message.

The sockets provider starts by implementing reliable, unconnected communication (RDM endpoints).  It layers other endpoint types, including MSG endpoints, over that.  It's goofy.  But most of the original target applications wanted RDM endpoint semantics.

The tcp provider implements only MSG endpoints over tcp sockets, which is the natural mapping.

If you can switch to using tcp, that would be preferred.  But either provider should work.  The use of MSG endpoints with the tcp provider is just more widely tested than the MSG endpoint support from the sockets provider.

> Long story short: Not everything is fine (there seems to be a problem
> with transmitted data), but most of the data transfer seems to work.
> Have you any idea why this is? Is it not possible to build such an
> infrastructure?
> A question regarding the verbs provider: The man pages mention that it
> does not support FI_SOURCE for FI_EP_MSG. Does that mean the capability
> or the flag for fi_getinfo? I assume it is the capability. If it is not,
> how to give a passive endpoint the address to listen to?

The FI_SOURCE flag for fi_getinfo() is supported.  The restriction is referring to returning source addressing from fi_cq_readfrom().  That is only defined for unconnected endpoints.  That is, fi_addr_t is associated with using address vectors, which are only used with DGRAM and RDM endpoints.

- Sean
