[libfabric-users] fi_read questions

Arne Struck arnestruck at astruck.de
Thu Oct 15 14:35:04 PDT 2020

>> 	The address (addr parameter) passed into fi_read() is the offset into the peer's
>> buffer.  The address may either be 0-based (the default), or based on the virtual
>> address that the peer uses to access the memory.  In the latter case, the
>> FI_MR_VIRT_ADDR mr_mode bit will be set.
>> I assumed that the mr_modes are set by user interaction. Since I don't set
>> FI_MR_VIRT_ADDR (in fact, I set none) so far, the mr_mode should be FI_MR_SCALABLE.
>> Thus it should be a 0-based offset (which is set to 0 in my application).
> The app is supposed to set which mr_mode bits it _can_ support.  The provider will clear the bits that it doesn't require.

Ok, so I should set FI_MR_VIRT_ADDR nevertheless. From what I read, the tcp 
and sockets providers support this mode.
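
A minimal sketch of how I understand the negotiation described above: the
application offers every mr_mode bit it can handle, and only the bits the
provider requires remain set afterwards. The EX_* constants and both
function names are hypothetical stand-ins, not libfabric definitions; real
code would include <rdma/fabric.h>, set hints->domain_attr->mr_mode before
calling fi_getinfo(), and read the surviving bits from the returned fi_info.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values for illustration only (assumption); real code uses
 * the FI_MR_* constants from <rdma/fabric.h>. */
#define EX_MR_VIRT_ADDR (1u << 0)
#define EX_MR_ALLOCATED (1u << 1)
#define EX_MR_PROV_KEY  (1u << 2)

/* Hypothetical model: of the bits the application says it can support,
 * keep only those the provider actually requires. */
static uint32_t ex_negotiate_mr_mode(uint32_t app_supported,
                                     uint32_t prov_required)
{
    return app_supported & prov_required;
}

/* Non-zero if the provider requires a bit the application did not offer.
 * In that case the two cannot work together. */
static int ex_mr_mode_unsupported(uint32_t app_supported,
                                  uint32_t prov_required)
{
    return (prov_required & ~app_supported) != 0;
}
```

With this model, an application that offers all three bits to a provider
requiring only EX_MR_VIRT_ADDR ends up with just that bit set, and an
application that offers none sees the mismatch reported instead.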

>> 	As before, with tcp, it can use a 0-based offset.  But RDMA hardware decided that
>> having a base address start at the peer's virtual address made sense to them.  So an
>> offset of 0 is indicated by specifying the peer's virtual address associated with the
>> start of the buffer.
>> In the target system there is no special RDMA hardware I know of.
>> 	One way to handle this is for the peer to always exchange a base address with the
>> mr key.  If FI_MR_VIRT_ADDR is set, the base address should equal the virtual address
>> of the memory buffer.  If FI_MR_VIRT_ADDR is 0, the base address is set to 0.  The
>> process that initiates the RMA then just uses the provided key and base address that it
>> was given.
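
A minimal sketch of the base-address exchange, assuming (as stated at the
top of this thread) that FI_MR_VIRT_ADDR means the peer's virtual address
is the base and that otherwise the address is a 0-based offset. The struct,
the EX_MR_VIRT_ADDR constant, and both function names are hypothetical; in
real code the key would come from fi_mr_key() and the resulting address
would be passed as the addr parameter of fi_read().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FI_MR_VIRT_ADDR (assumption); use <rdma/fabric.h> in real code. */
#define EX_MR_VIRT_ADDR (1u << 0)

/* Hypothetical message the target sends to the initiator out of band. */
struct ex_rma_info {
    uint64_t key;  /* memory region key */
    uint64_t base; /* address that corresponds to offset 0 of the buffer */
};

/* Target side: choose the base address from the negotiated mr_mode. */
static struct ex_rma_info ex_advertise(const void *buf, uint64_t key,
                                       uint32_t mr_mode)
{
    struct ex_rma_info info;

    info.key = key;
    info.base = (mr_mode & EX_MR_VIRT_ADDR) ? (uint64_t)(uintptr_t)buf : 0;
    return info;
}

/* Initiator side: the same arithmetic works in either mode. */
static uint64_t ex_target_addr(const struct ex_rma_info *peer, uint64_t offset)
{
    return peer->base + offset;
}
```

The initiator never needs to know which mode the target negotiated; it adds
its offset to whatever base it was given.
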
>> What I can do is:
>> Verify that the mr_mode is indeed set to FI_MR_SCALABLE or another mr_mode.
> If you're using libfabric v1.9 (or anything after 1.5), think of FI_MR_SCALABLE as no longer existing.  :)  Use the individual mr_mode bit flags.
Will do so.

>> Send the key returned by the fi_mr_key function and not directly the generated ones.
> fi_mr_key() is always usable.  Even if the app specifies the key, it will just return the app's value.
Ok, so basically 0 is a valid input and forces libfabric to generate a 
value (if it doesn't do so automatically).
>> Any other idea what I could be doing wrong/check? What additional information could I
>> provide to help in the search for my misunderstanding?
> You mentioned that you were receiving errno ENOENT (no such file or directory).  How were you receiving this?  Error code returned from a call?  As part of a completion?  Which provider were you using?
It is returned from the fi_read call.

Will update tomorrow, it is getting late around here. Thanks for the 
help so far.

Greetings, Arne
