[libfabric-users] RDM send fails

Jörn Schumacher jorn.schumacher at cern.ch
Wed Nov 28 12:39:45 PST 2018


On 11/27/2018 05:49 PM, Hefty, Sean wrote:
>> For a couple of days I have been trying to use libfabric RDM endpoints
>> (so far I have only worked with connected endpoints) to send a single
>> message between two applications. I am using libfabric 1.6.1 with the
>> verbs/RxM provider.
>>
>> My attempt is based on the example from the tutorial here:
>> https://www.slideshare.net/dgoodell/ofi-libfabric-tutorial
>>
>> A stripped-down version of my code is here:
>> https://gist.github.com/joerns/a0059c3f591db42c88b004df6883fce9
>>
>>
>> Q1: I run into the issue that fi_send returns FI_EAGAIN, indicating
>> that some resource is not available. What could be the issue? I have
>> a QP, an AV and a registered buffer; to my understanding nothing else
>> is needed for RDM.
> 
> RxM creates the underlying connection only when the first transfer to a given peer is attempted. You will see EAGAIN until the connection completes.

That makes sense. I guess there is no way to get notified of the 
connection via an event?
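
Until then, the usual way to cope with this is to retry the send and poll
the completion queue in between. Below is a minimal sketch of that pattern,
not taken from the gist. It assumes the provider advances the pending
connection when the application polls its completion queue, as is typical
with manual progress. The names send_with_retry, try_send_fn, progress_fn
and fake_peer are hypothetical; in a real program the two callbacks would
wrap fi_send() and fi_cq_read(), and the comparison would be against
-FI_EAGAIN. The sketch uses the plain errno value EAGAIN so that it builds
without libfabric installed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback types (assumption, not libfabric API): in a real
 * program these would wrap fi_send() and fi_cq_read() on the endpoint's
 * completion queue. */
typedef ssize_t (*try_send_fn)(void *ctx);
typedef void (*progress_fn)(void *ctx);

/* Retry a send while it reports -EAGAIN, driving progress in between.
 * Returns the first result that is not -EAGAIN, or -EAGAIN if
 * max_attempts is exhausted. */
static ssize_t send_with_retry(try_send_fn try_send, progress_fn progress,
                               void *ctx, int max_attempts)
{
    assert(try_send != NULL && progress != NULL);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ssize_t ret = try_send(ctx);
        if (ret != -EAGAIN)
            return ret;      /* accepted (0) or a real error */
        progress(ctx);       /* give the provider a chance to connect */
    }
    return -EAGAIN;
}

/* Stand-in for a peer whose connection completes only after a few
 * progress calls; used here in place of a real endpoint. */
struct fake_peer {
    int progress_needed;     /* progress calls left before "connected" */
    int sends_attempted;     /* how often the send was tried */
};

static ssize_t fake_send(void *ctx)
{
    struct fake_peer *p = ctx;
    p->sends_attempted++;
    return p->progress_needed > 0 ? -EAGAIN : 0;
}

static void fake_progress(void *ctx)
{
    struct fake_peer *p = ctx;
    if (p->progress_needed > 0)
        p->progress_needed--;
}
```

Calling send_with_retry(fake_send, fake_progress, &peer, 10) on a fake_peer
that needs three progress calls succeeds on the fourth send attempt. The
attempt limit is a design choice for the sketch: it keeps a peer that never
connects from turning the loop into a hang.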

>> Q2: It is not clear to me what arguments I have to supply for
>> node/service for the fi_getinfo call on the sender side. In the
>> tutorial the destination address is used, but that does not make sense
>> if I have multiple destinations. I cannot put NULL for both as that
>> yields -61 (No data available).
> 
> You can specify the source address that you want to initiate transfers from, using the FI_SOURCE flag.  If you specify a destination address, it will be used to select the source address based on whatever routing information exists.  So, if you have multiple destinations, you can pick the one that you want used for the source address selection.

Great, thanks!

Cheers,
Jörn

