[libfabric-users] fi_read questions
arnestruck at astruck.de
Fri Oct 16 09:30:29 PDT 2020
At the moment I am using the sockets provider for legacy reasons (I
know it is deprecated, but from my understanding it should still
support this functionality). tcp and verbs will be tested afterwards
with the respective adjustments.
I get the ENOENT from the fi_read call directly, which to me suggests
that it can't find the peer memory region.
I have tried setting the FI_MR_VIRT_ADDR bit in the mr_mode of the
hints, but nothing changed. The provider seems to clear it (at least
it is not contained in the fi_info corresponding to endpoint and domain
when checked via
(info->domain_attr->mr_mode & FI_MR_VIRT_ADDR) == FI_MR_VIRT_ADDR ).
In fact mr_mode is set to 0 by the provider.
Thus I reverted the call to its original form with buf being the
allocated memory registered in memory_region:
fi_read(endpoint, (void*)buf, (size_t)length, fi_mr_desc(memory_region),
0, 0, key, NULL)
I don't really know where the problem lies.
Peer memory region is registered via
error = fi_mr_reg(endpoint, message_data->data, message_data->length,
FI_READ|FI_REMOTE_READ|FI_WRITE, 0, key, 0, &memory_region, NULL);
I replaced the config calls with the actual values set in the config.
The key is obtained by the peer via fi_mr_key(memory_region) and
transferred in a message to the calling part of the application.
>> The address (addr parameter) passed into fi_read() is the offset into the peer's
>> buffer. The address may either be 0-based (the default), or based on the virtual
>> address that the peer uses to access the memory. In the latter case, the
>> FI_MR_VIRT_ADDR mr_mode bit will be set.
>> I assumed that the mr_modes are set by user interaction. Since I don't set
>> FI_MR_VIRT_ADDR (in fact none) so far, the mr_mode should be FI_MR_SCALABLE.
>> Thus it should be a 0-based offset (which is set to 0 in my application).
> The app is supposed to set which mr_mode bits it _can_ support. The provider will clear the bits that it doesn't require.
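The rule in the reply above can be modelled without libfabric. What follows is a toy sketch, not provider code: negotiate_mr_mode and the MODE_* constants are hypothetical names with illustrative values standing in for the real FI_MR_* bits. It is also an assumption of this sketch that negotiation fails when the provider requires a bit the app did not offer.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for FI_MR_* mode bits; values are illustrative only. */
#define MODE_VIRT_ADDR (1u << 0)
#define MODE_ALLOCATED (1u << 1)
#define MODE_PROV_KEY  (1u << 2)

/*
 * Toy model of mr_mode negotiation.
 * app_supported:     bits the application says it can handle (the hints).
 * provider_required: bits the provider actually needs.
 * On success, *result holds only the required bits, so every offered bit
 * that the provider does not need has been cleared.
 * Returns false if the provider needs a bit the application did not offer.
 */
static bool negotiate_mr_mode(unsigned app_supported,
                              unsigned provider_required,
                              unsigned *result)
{
    if (provider_required & ~app_supported)
        return false;
    *result = app_supported & provider_required;
    return true;
}
```

Under this model, offering MODE_VIRT_ADDR to a provider that requires nothing yields a result of 0, which is consistent with the mr_mode of 0 reported above.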
>> One way to handle this is for the peer to always exchange a base address with the
>> mr key. If FI_MR_VIRT_ADDR is set, the base address should equal the virtual address
>> of the memory buffer. If FI_MR_VIRT_ADDR is 0, the base address is set to 0. The
>> process that initiates the RMA then just uses the provided key and base address that it
>> was given.
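The exchange of a base address together with the key could look roughly like the sketch below. It assumes, as the first quoted paragraph says, that a set FI_MR_VIRT_ADDR bit means the peer's virtual address is used and that the offset is 0-based otherwise. struct rma_target_info, describe_region and remote_addr are hypothetical names, and virt_addr_mode stands for the result of testing FI_MR_VIRT_ADDR in the domain's mr_mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message the target sends to the RMA initiator. */
struct rma_target_info {
    uint64_t key;        /* value the target obtained from fi_mr_key() */
    uint64_t base_addr;  /* value the initiator adds its offset to */
};

/* Target side: pick the base address according to the addressing mode. */
static struct rma_target_info
describe_region(const void *buf, uint64_t key, bool virt_addr_mode)
{
    struct rma_target_info info;

    info.key = key;
    /* Virtual addressing: expose the buffer address. Otherwise: 0-based. */
    info.base_addr = virt_addr_mode ? (uint64_t)(uintptr_t)buf : 0;
    return info;
}

/* Initiator side: value to pass as the addr argument of fi_read(). */
static uint64_t remote_addr(const struct rma_target_info *info,
                            uint64_t offset)
{
    return info->base_addr + offset;
}
```

With this layout the initiator never needs to know which mode the target runs in; it passes remote_addr(&info, offset) and info.key to fi_read() in both cases.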
What do you mean by "if FI_MR_VIRT_ADDR is 0"? Isn't that just one
value set in a bitmask? If not, how do you check whether mr_mode
contains FI_MR_VIRT_ADDR (or any other mode bit)?
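For reference, testing for a mode bit is an ordinary bitmask test, and "the bit is 0" just means that the test is false. A minimal sketch, with EXAMPLE_VIRT_ADDR and EXAMPLE_PROV_KEY as hypothetical stand-ins (illustrative values) for real mode bits such as FI_MR_VIRT_ADDR, and has_all_bits / has_any_bit as helper names invented here:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for mode bits; values are illustrative only. */
#define EXAMPLE_VIRT_ADDR (1u << 0)
#define EXAMPLE_PROV_KEY  (1u << 1)

/* True if every bit in 'bits' is set in 'mode'. */
static bool has_all_bits(unsigned mode, unsigned bits)
{
    return (mode & bits) == bits;
}

/* True if at least one bit in 'bits' is set in 'mode'. */
static bool has_any_bit(unsigned mode, unsigned bits)
{
    return (mode & bits) != 0;
}
```

For a single bit both helpers give the same answer; they differ only when several bits are tested at once. A mode of 0 contains no bits, so has_all_bits(0, EXAMPLE_VIRT_ADDR) is false.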