[libfabric-users] RDM send fails

Ilango, Arun arun.ilango at intel.com
Fri Nov 30 12:40:29 PST 2018

RxM can wake up an application thread that is waiting on the CQ fd when connection events occur. However, fi_send is not guaranteed to succeed at that point, since the connection may take some time to be established. In that case fi_send returns EAGAIN and the app should wait on the CQ fd again.

Also, in manual progress mode, the app has to call fi_cq_read (to progress the connection) before waiting on the CQ fd.
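
A minimal sketch of that retry pattern, written against plain poll() so it stands alone. The names send_when_ready, try_send_fn, progress_fn and fake_ep are hypothetical and not part of libfabric. In a real application the try_send callback would presumably wrap fi_send, the progress callback would wrap fi_cq_read, and wait_fd would be the fd obtained from the CQ (for example via fi_control with FI_GETWAIT on a CQ opened with FI_WAIT_FD). Real code would likely also call fi_trywait before blocking on the fd.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback types (assumptions, not libfabric API). */
typedef ssize_t (*try_send_fn)(void *arg);  /* e.g. wraps fi_send(); 0 or -EAGAIN */
typedef void (*progress_fn)(void *arg);     /* e.g. wraps fi_cq_read() */

/*
 * Try to send. On -EAGAIN, drive progress once (needed in manual progress
 * mode), block on wait_fd, then retry. Gives up after max_wakeups wakeups.
 * Returns the send result, -ETIMEDOUT if the fd never became readable
 * within timeout_ms, or -EAGAIN if all attempts were used up.
 */
ssize_t send_when_ready(int wait_fd, try_send_fn try_send,
                        progress_fn progress, void *arg,
                        int timeout_ms, int max_wakeups)
{
    for (int i = 0; i <= max_wakeups; i++) {
        ssize_t ret = try_send(arg);
        if (ret != -EAGAIN)
            return ret;              /* success or a hard error */
        if (i == max_wakeups)
            break;                   /* no wakeups left */

        if (progress)
            progress(arg);           /* progress the connection before sleeping */

        struct pollfd pfd = { .fd = wait_fd, .events = POLLIN };
        int n;
        do {
            n = poll(&pfd, 1, timeout_ms);
        } while (n < 0 && errno == EINTR);
        if (n < 0)
            return -errno;
        if (n == 0)
            return -ETIMEDOUT;
        /* Woken up: a send may now succeed, but that is not guaranteed. */
    }
    return -EAGAIN;
}

/* Stand-in endpoint for trying the loop without a fabric (hypothetical). */
struct fake_ep {
    int failures_left;   /* number of sends that still return -EAGAIN */
    int progress_calls;  /* how often the progress callback ran */
};

ssize_t fake_send(void *arg)
{
    struct fake_ep *ep = arg;
    if (ep->failures_left > 0) {
        ep->failures_left--;
        return -EAGAIN;
    }
    return 0;
}

void fake_progress(void *arg)
{
    struct fake_ep *ep = arg;
    ep->progress_calls++;
}
```

In an epoll-driven event loop the blocking poll() would be replaced by registering wait_fd with the loop and retrying the send from the fd's readiness callback; the retry-on-EAGAIN logic stays the same.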


-----Original Message-----
From: Libfabric-users [mailto:libfabric-users-bounces at lists.openfabrics.org] On Behalf Of Hefty, Sean
Sent: Friday, November 30, 2018 8:24 AM
To: Jörn Schumacher <jorn.schumacher at cern.ch>; Sur, Sayantan <sayantan.sur at intel.com>; libfabric-users at lists.openfabrics.org
Cc: ofiwg at lists.openfabrics.org
Subject: Re: [libfabric-users] RDM send fails

> Thanks Sayantan. Sure we have to guard against EAGAIN, the question is 
> how. Ideally I would like to avoid polling on the send call (or any 
> other call for that matter).
> To give you background, in my application I run a central epoll-driven 
> event loop for CQ events, CM events, ... but also IO events from other 
> non-network devices. So it would be natural to also throw the RDM 
> endpoint in there and have the event loop tell me when I can send.

In most cases, waiting on the tx/rx CQ fds should work.  It won't guarantee that send won't return EAGAIN after the thread wakes up, but the thread should wake up when send is available.

Whether this is true during the connection setup period is something that I'd have to look at the implementation to tell.  I don't know which fd the underlying CM events signal, and if we can map those to the CQ fd.  But this seems like a reasonable request by the apps for dealing with EAGAIN without needing to spin.

- Sean