[libfabric-users] verbs WAIT_FD signalling at all times

Carlo Alberto Gottardo carlo.gottardo at cern.ch
Tue Jun 29 09:11:07 PDT 2021

Dear Libfabric Users & Developers,

Using the verbs provider, I use the FI_WAIT_FD wait object for the completion queue (CQ).
The resulting file descriptor (FD), associated with the libibverbs completion channel, is added to an epoll instance that waits for EPOLLIN.
When the FD signals, a callback is triggered in which fi_cq_read reads the completions in a non-blocking way.

The problem is that, as soon as a connection is established, the FD keeps signalling as fast as the CPU allows, even if there is no data transfer.
In fact, after the first call, fi_cq_read keeps returning EAGAIN, a sign that there is no CQ entry to read.

I would expect the FD not to signal if there is nothing to read in the CQ. Shouldn't this be the case?
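
As a generic illustration of the symptom, independent of libfabric: with level-triggered epoll (the default), a descriptor is reported as ready on every epoll_wait call for as long as it stays readable, and stops being reported only once it has been drained. The sketch below uses an eventfd as a stand-in for the CQ wait descriptor; the function names are hypothetical, and whether this is what happens inside the verbs provider is an assumption, not something established here.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Count how many of `rounds` zero-timeout epoll_wait calls report EPOLLIN. */
static int count_ready_rounds(int epfd, int rounds)
{
    int ready = 0;
    for (int i = 0; i < rounds; i++) {
        struct epoll_event ev;
        int n = epoll_wait(epfd, &ev, 1, 0);
        if (n > 0 && (ev.events & EPOLLIN))
            ready++;
    }
    return ready;
}

/*
 * Hypothetical demo: make an eventfd readable once, poll it `rounds` times,
 * drain it, then poll it `rounds` times again. Returns 0 on success.
 */
static int level_triggered_demo(int rounds, int *before_drain, int *after_drain)
{
    assert(before_drain && after_drain);

    int rc = -1;
    int efd = eventfd(0, EFD_NONBLOCK);
    int epfd = epoll_create1(0);
    if (efd < 0 || epfd < 0)
        goto out;

    struct epoll_event ev;
    ev.events = EPOLLIN;            /* level-triggered: no EPOLLET flag */
    ev.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
        goto out;

    uint64_t one = 1;
    if (write(efd, &one, sizeof one) != (ssize_t)sizeof one)
        goto out;

    /* The fd is readable and nobody drains it: every poll reports it. */
    *before_drain = count_ready_rounds(epfd, rounds);

    uint64_t value;
    if (read(efd, &value, sizeof value) != (ssize_t)sizeof value)
        goto out;

    /* After the read the counter is zero, so the fd is no longer readable. */
    *after_drain = count_ready_rounds(epfd, rounds);
    rc = 0;
out:
    if (epfd >= 0)
        close(epfd);
    if (efd >= 0)
        close(efd);
    return rc;
}
```

Calling level_triggered_demo(5, &before, &after) should leave before equal to the number of rounds and after equal to zero, which mirrors the behaviour I see: the descriptor keeps firing until whatever makes it readable is consumed.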

I read some libibverbs documentation, but this point remains unclear to me.

Below I post the CQ attributes and some function calls.

Thank you very much for your help,

Carlo A. Gottardo
Postdoc at Nikhef
Skype: carlogottardo

System: Libfabric 1.12.1 / Centos7 / Mellanox Connect-X5

CQ attributes

struct fi_cq_attr cq_attr;
cq_attr.size = MAX_CQ_ENTRIES;
cq_attr.flags = 0;
cq_attr.format = FI_CQ_FORMAT_DATA;
cq_attr.wait_obj = FI_WAIT_FD;
cq_attr.signaling_vector = 0;
cq_attr.wait_cond = FI_CQ_COND_NONE;
cq_attr.wait_set = NULL;

The queue is opened and bound with

fi_cq_open(rsocket->domain, &cq_attr, &rsocket->cq, NULL);
fi_ep_bind(rsocket->ep, &rsocket->cq->fid, FI_TRANSMIT | FI_RECV);

When the connection is established, the wait object is retrieved with

fi_control(&socket->cq->fid, FI_GETWAIT, &socket->cqfd);

The file descriptor is assigned a callback

socket->cq_ev_ctx.fd = socket->cqfd;
socket->cq_ev_ctx.data = socket;
socket->cq_ev_ctx.cb = on_recv_socket_cq_event;

Finally, the file descriptor is added to the main and only epoll instance of the application, which waits for EPOLLIN.

In the on_recv_socket_cq_event callback:

fi_cq_read(socket->cq, &completion_entries, N);
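
For reference, the callback registration and dispatch described above can be sketched roughly as follows, using plain Linux calls only. The struct mirrors the fd, data and cb fields of cq_ev_ctx; the callback signature, the helper names (ev_register, ev_dispatch_once, count_and_drain) and the use of epoll's data.ptr to carry the context are assumptions made for illustration, not the actual application code. The test descriptor is meant to be a pipe rather than a real CQ descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Event context: same field names as cq_ev_ctx, signature of cb is assumed. */
struct ev_ctx {
    int fd;
    void *data;
    void (*cb)(int fd, void *data);
};

/* Register ctx->fd for EPOLLIN and attach the context to the epoll entry. */
static int ev_register(int epfd, struct ev_ctx *ctx)
{
    assert(ctx != NULL);

    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.ptr = ctx;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, ctx->fd, &ev);
}

/*
 * Wait once and invoke the callback of every readable descriptor.
 * Returns the epoll_wait result: number of ready descriptors, or -1 on error.
 */
static int ev_dispatch_once(int epfd, int timeout_ms)
{
    struct epoll_event events[8];
    int n = epoll_wait(epfd, events, 8, timeout_ms);
    for (int i = 0; i < n; i++) {
        struct ev_ctx *ctx = events[i].data.ptr;
        if ((events[i].events & EPOLLIN) && ctx->cb)
            ctx->cb(ctx->fd, ctx->data);
    }
    return n;
}

/* Example callback: consume one byte and count the invocation in *data. */
static void count_and_drain(int fd, void *data)
{
    char byte;
    if (read(fd, &byte, 1) == 1)
        (*(int *)data)++;
}
```

With a pipe as the registered descriptor, ev_dispatch_once reports nothing until a byte is written, calls the callback once after the write, and reports nothing again once the callback has consumed the byte. That is the behaviour I expected from the CQ descriptor as well.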
