[libfabric-users] what does -FI_EALREADY mean?

Steve Welch swelch at systemfabricworks.com
Mon Jun 29 14:37:48 PDT 2020


Hi Greg, Sean,

> On Jun 29, 2020, at 3:02 PM, Hefty, Sean <sean.hefty at intel.com> wrote:
> 
>> In running a test case at moderately large scale (64 nodes, 128 tx endpoints per node)
>> on a Cray CS system with libfabric 1.10.1 and the verbs;ofi_rxm provider, we saw a -
>> FI_EALREADY ("Operation already in progress") return value from a fi_write() call.  Can
>> anyone out there give me more information as to what that error code might indicate is
>> going wrong?  The man pages don't really contain anything except that error text.
> 
> Searching through the code, I only see FI_EALREADY in a few places, all of which should only be for internal error handling.  For example, RXM uses this to detect if a connection is already in progress, but I don't see that the error code can be returned to the user.  Similarly, verbs has a couple of assertions that FI_EALREADY isn't returned as an error when inserting items into rbtrees.  A free build could return that value back to the user.
> 
> It's possible this is coming from lower level code (e.g. verbs), but I'm skeptical of that.
> 
> Can you run with a debug build to see if you're going through one of the assert paths?  Do you know if you're using XRC for the underlying transport?   The verbs FI_EALREADY asserts are in XRC code paths.

As Sean said, the XRC asserts detect duplicate insertions into RB trees. The one in verbs_domain_xrc.c should never fire, since a find is done prior to the insert while holding the Verbs EQ lock. The second, in verbs_eq.c, is in response to an accept of a shared XRC QP connection; possibly there is an error condition that allows a duplicate there. If you could run with logging level=warn and subsys=ep_ctrl, you could catch the output indicating whether either of these occurs. Also, would you mind letting me know what kind of connectivity your application requires (e.g. all-to-all, many-to-one, ...)?
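
For reference, the logging settings above correspond to the libfabric environment variables FI_LOG_LEVEL and FI_LOG_SUBSYS, so the simplest route is to export FI_LOG_LEVEL=warn and FI_LOG_SUBSYS=ep_ctrl in the job environment. If changing the launch environment is awkward, a rough sketch of doing the same from inside the application follows. The helper name enable_ep_ctrl_warnings() is hypothetical, and the sketch assumes the variables are read when libfabric initializes, so it would have to run before the first libfabric call (e.g. before fi_getinfo()).

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: request warning-level logging restricted to the
 * ep_ctrl subsystem by setting libfabric's logging environment variables.
 *
 * Assumption: libfabric reads these variables during its own
 * initialization, so call this before any libfabric function.
 *
 * Returns 0 on success, -1 if either variable could not be set.
 */
static int enable_ep_ctrl_warnings(void)
{
    /* Third argument 1 = overwrite any value already in the environment. */
    if (setenv("FI_LOG_LEVEL", "warn", 1) != 0) {
        perror("setenv FI_LOG_LEVEL");
        return -1;
    }
    if (setenv("FI_LOG_SUBSYS", "ep_ctrl", 1) != 0) {
        perror("setenv FI_LOG_SUBSYS");
        return -1;
    }
    return 0;
}
```

Calling enable_ep_ctrl_warnings() at the top of main() on every rank should make any warning from the XRC connection paths show up on stderr, next to the failing fi_write() return value.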

Thanks,
Steve
> 
> - Sean
> _______________________________________________
> Libfabric-users mailing list
> Libfabric-users at lists.openfabrics.org
> https://lists.openfabrics.org/mailman/listinfo/libfabric-users


