[libfabric-users] fi_read questions

Arne Struck arnestruck at astruck.de
Thu Oct 15 13:45:13 PDT 2020


>> I have some additional questions regarding the usage of fi_rma,
>> specifically fi_read:
>>
>> Firstly: Is it necessary to register the local target memory buffer as well
>> (and if so, why, since it is created by the same application)?
> This is determined by checking the FI_MR_LOCAL mr_mode bit.  If that bit is set, then the provider requires that local data buffers be registered.  The reason for the registration is to provide a mapping from the virtual address used by the application to the physical memory address that the hardware will access.  Registration pins that mapping, so that the virtual address does not migrate to a different physical address.
>
> The tcp provider won't need this; however, existing RDMA-based hardware does.  You can always register local buffers; there's just a performance cost.


Since I am currently using the sockets and tcp providers, registering the
receiving memory is not needed.
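
To avoid hard-coding that assumption, the check described above could be
wrapped in a small helper. This is only a sketch: DEMO_FI_MR_LOCAL is a
stand-in whose value I assume here so that the snippet compiles without
libfabric, and needs_local_registration() is a hypothetical name. Real code
would include <rdma/fi_domain.h> and test FI_MR_LOCAL against the mr_mode
reported in the fi_info returned by fi_getinfo().

```c
/* Stand-in for the FI_MR_LOCAL bit. The value is an assumption made for
 * this sketch; real code should use the macro from <rdma/fi_domain.h>. */
#define DEMO_FI_MR_LOCAL (1 << 2)

/* Hypothetical helper: returns 1 if the provider's mr_mode says that
 * local data buffers must be registered, 0 otherwise. */
static int needs_local_registration(int mr_mode)
{
    return (mr_mode & DEMO_FI_MR_LOCAL) != 0;
}
```

As I understand it, a result of 0 means the local descriptor passed to
fi_read() is not required, while a result of 1 means the local buffer has to
be registered first and its descriptor obtained with fi_mr_desc().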

What about the verbs provider? I'd like to test the program over InfiniBand.


>> Is it the local file descriptor of the locally registered memory buffer
>> (case 1) or the "local" file descriptor of the remote memory region
>> (case 2)?
>>
>>
>> So far I get "No such file or directory" from the call. There are 2
>> connected endpoints, and the access key was transferred from the read
>> target to the calling part of the application.
>>
>> Both memory regions are registered.
>>
>> I tried case 1 and setting the file descriptor to NULL so far. The first
>> one produced said result, the latter one too except for local testing.
>>
>> So I assume it is case 2, but wanted to confirm whether there is
>> something I don't see.
>>
>>
>> for fi_mr_reg:
>>
>> No special flags are used
>>
>> access flags are set to FI_READ | FI_REMOTE_READ | FI_WRITE
>>
>>
>> for fi_read:
>>
>> a memory buffer is allocated and the length of the remote memory buffer
>> is known to the calling side (thus the buffer is of sufficient length)
>>
>> Offset stays at 0 (since all data of the target buffer is wanted)
> Note that the peer's buffer is identified by the memory key.  The peer obtains this by calling fi_mr_key() after registering the memory.  Any memory accessed remotely through an RMA operation (read or write) must be registered -- consider it opt-in security permission.


And the memory key is determined by the key it is given at creation.

Is it possible that a key other than the requested one is consistently
chosen, even if a 64-bit key is generated (so the chance of a collision is
close to 0)?

> The address (addr parameter) passed into fi_read() is the offset into the peer's buffer.  The address may either be 0-based (the default), or based on the virtual address that the peer uses to access the memory.  In the latter case, the FI_MR_VIRT_ADDR mr_mode bit will be set.

I assumed that the mr_modes are set by the user. Since I don't set
FI_MR_VIRT_ADDR (in fact, I set none so far), the mr_mode should be
FI_MR_SCALABLE.

Thus it should be a 0-based offset (which is set to 0 in my application).
> As before, with tcp, it can use a 0-based offset.  But RDMA hardware decided that having a base address start at the peer's virtual address made sense to them.  So an offset of 0 is indicated by specifying the peer's virtual address associated with the start of the buffer.

In the target system there is no special RDMA hardware I know of.

> One way to handle this is for the peer to always exchange a base address with the mr key.  If FI_MR_VIRT_ADDR is not set, the base address is set to 0.  If FI_MR_VIRT_ADDR is set, the base address should equal the virtual address of the memory buffer.  The process that initiates the RMA then just uses the provided key and base address that it was given.
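
The exchange described above might be sketched as follows. This is only an
illustration under assumptions: struct demo_rma_info and both helpers are
hypothetical names, and DEMO_FI_MR_VIRT_ADDR is a stand-in with an assumed
value so that the snippet compiles without libfabric (real code would use
FI_MR_VIRT_ADDR from <rdma/fi_domain.h>). It follows the earlier description
of FI_MR_VIRT_ADDR: when the bit is set, offset 0 corresponds to the peer's
virtual address of the buffer; otherwise the offset is 0-based.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for FI_MR_VIRT_ADDR. The value is an assumption made for this
 * sketch; real code should use the macro from <rdma/fi_domain.h>. */
#define DEMO_FI_MR_VIRT_ADDR (1 << 4)

/* Hypothetical message the read target sends to the initiator. */
struct demo_rma_info {
    uint64_t key;   /* value returned by fi_mr_key() on the target */
    uint64_t base;  /* address that means "offset 0" in the region */
    uint64_t len;   /* length of the registered region in bytes */
};

/* Target side: describe a registered region for the initiator. */
static struct demo_rma_info demo_describe_region(int mr_mode, const void *buf,
                                                 uint64_t len, uint64_t key)
{
    struct demo_rma_info info;

    info.key = key;
    info.len = len;
    /* Virtual addressing: offset 0 is the buffer's own address.
     * Otherwise the region is addressed with a 0-based offset. */
    info.base = (mr_mode & DEMO_FI_MR_VIRT_ADDR)
                    ? (uint64_t)(uintptr_t)buf
                    : 0;
    return info;
}

/* Initiator side: compute the value for the addr parameter of fi_read()
 * for a given byte offset into the target's region. */
static uint64_t demo_remote_addr(const struct demo_rma_info *info,
                                 uint64_t offset)
{
    assert(info != NULL);
    assert(offset <= info->len);
    return info->base + offset;
}
```

The initiator would then pass demo_remote_addr(&info, offset) as the addr
argument and info.key as the key argument of fi_read(), without needing to
know which addressing mode the target's provider uses.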

What I can do is:

Verify that the mr_mode is indeed set to FI_MR_SCALABLE or another mr_mode.

Send the key returned by the fi_mr_key function rather than the generated
one directly.

Any other idea what I could be doing wrong/check? What additional 
information could I provide to help in the search for my misunderstanding?

At least to me it seems that I am doing the steps right.



greetings, Arne.
