[libfabric-users] fi_domain_bind returns function not implemented + fi_read questions

Hefty, Sean sean.hefty at intel.com
Thu Oct 15 10:26:05 PDT 2020

> Not really, I was under the assumption that a domain EQ is required,
> since it is used as the default EQ for associated resources.
> Thus I assume I can safely remove the creation of the domain EQ.

Correct - it is not technically needed.  You can bind the EQ directly to the endpoints, which is apparently how all apps (including our CI tests) do it.

> I have some additional questions regarding the usage of fi_rma,
> specifically fi_read:
> Firstly: Is it needed to register the target local memory buffer as well
> (and if so why, since it is created by the same application)?

This is determined by checking the FI_MR_LOCAL mr_mode bit.  If that bit is set, then the provider requires that local data buffers be registered.  The reason for the registration is to provide a mapping from the virtual address used by the application to the physical memory address that the hardware will access.  Registration pins that mapping, so that the virtual address does not migrate to a different physical address.

The tcp provider won't need this; existing RDMA-based hardware does.  You can always register local buffers; there's just a performance cost.

>  From the mans I get that a file descriptor is required for this call. I
> don't think the mans are clear enough here (or I overlooked something).

The fi_read() call needs a memory descriptor if registration is required.  The descriptor can be obtained by calling fi_mr_desc() on the registered region.  If FI_MR_LOCAL has not been set, the descriptor can be NULL.

> Is it the local file descriptor of the locally registered memory buffer
> (case 1) or the "local" file descriptor of the remote memory region
> (case 2).
> So far I get "No such file or directory" from the call. There are 2
> connected endpoints, the access key was transferred from read target to
> calling part of the application.
> Both memory regions are registered.
> I tried case 1 and setting the file descriptor to NULL so far. The first
> one produced said result, the latter one too except for local testing.
> So I assume it is case 2, but wanted to confirm whether there is
> something I don't see.
> for fi_mr_reg:
> No special flags are used
> modes are set to FI_READ | FI_REMOTE_READ | FI_WRITE
> for fi_read:
> a memory buffer is allocated and the length of the remote memory buffer
> is known to the calling side (thus the buffer is of sufficient length)
> Offset stays at 0 (since all data of the target buffer is wanted)

Note that the peer's buffer is identified by the memory key.  The peer obtains this by calling fi_mr_key() after registering the memory.  Any memory accessed remotely through an RMA operation (read or write) must be registered -- consider it opt-in security permission.

The address (addr parameter) passed into fi_read() is the offset into the peer's buffer.  The address may either be 0-based (the default), or based on the virtual address that the peer uses to access the memory.  In the latter case, the FI_MR_VIRT_ADDR mr_mode bit will be set.

As before, the tcp provider can use a 0-based offset.  RDMA hardware, however, bases the address on the peer's virtual address.  So with such providers an offset of 0 is indicated by specifying the peer's virtual address associated with the start of the buffer.

One way to handle this is for the peer to always exchange a base address with the mr key.  If FI_MR_VIRT_ADDR is not set, the base address is set to 0.  If FI_MR_VIRT_ADDR is set, the base address should equal the virtual address of the memory buffer.  The process that initiates the RMA then just uses the key and base address that it was given, adding any offset to the base.
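
A minimal sketch of that exchange, in plain C so it does not depend on the libfabric headers.  The struct rma_target and the two helper functions are hypothetical names made up for illustration; they are not part of libfabric.  The sketch assumes the target passes in the key it got from fi_mr_key() and a flag saying whether FI_MR_VIRT_ADDR is set in the provider's mr_mode, so the base is the buffer's virtual address when the bit is set and 0 otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message the target sends to the initiator after it has
 * registered its buffer.  Not a libfabric type. */
struct rma_target {
    uint64_t key;   /* value returned by fi_mr_key() on the target */
    uint64_t base;  /* address that means "offset 0" in the target buffer */
    uint64_t len;   /* size of the registered buffer in bytes */
};

/* Target side.  virt_addr is true when FI_MR_VIRT_ADDR is set in the
 * provider's mr_mode; the caller is assumed to have checked that bit. */
struct rma_target describe_target(const void *buf, size_t len,
                                  uint64_t key, bool virt_addr)
{
    struct rma_target t;

    t.key = key;
    /* Virtual-address mode: base is the buffer address.  Otherwise 0. */
    t.base = virt_addr ? (uint64_t)(uintptr_t)buf : 0;
    t.len = (uint64_t)len;
    return t;
}

/* Initiator side.  Returns the value to pass as the addr argument of
 * fi_read() or fi_write() for xfer_len bytes starting at offset. */
uint64_t remote_addr(const struct rma_target *t, uint64_t offset,
                     uint64_t xfer_len)
{
    /* The transfer must stay inside the registered buffer. */
    assert(offset <= t->len && xfer_len <= t->len - offset);
    return t->base + offset;
}
```

With this, the initiator does not need to know which mode the target's provider uses: it passes remote_addr(&t, 0, len) as addr and t.key as key to fi_read(), and passes either the result of fi_mr_desc() or NULL as desc depending on FI_MR_LOCAL.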

- Sean
