[libfabric-users] Cannot establish a connection with verbs; ofi_rxm

Hubert Hirtz hubert.hirtz at laposte.net
Wed Jun 17 23:39:32 PDT 2020


Hi,

I am trying to make an application developed with the UDP provider
(using datagrams) work on InfiniBand hardware (with verbs and RxM).

The application does the following:

1. Initialize libfabric structures (EQ, CQ, AV, domain, provider) this way:

    eq_attr.size = 64;
    cq_attr.format = FI_CQ_FORMAT_MSG;
    cq_attr.size = 64;
    av_attr.flags = FI_EVENT;
    av_attr.type = FI_AV_TABLE;

    hints = fi_allocinfo();
    if (!hints)
        return -FI_ENOMEM;
    hints->ep_attr->type = FI_EP_RDM;
    hints->caps = FI_MSG | FI_SOURCE;
    hints->mode = FI_CONTEXT;
    fi_getinfo(FI_VERSION(1, 5), NULL, port, port ? FI_SOURCE : 0, hints, &pv);

2. The server calls `fi_recv` and waits for a completion.

3. The client sends its socket address in a message to the server and
waits for a completion.

However, the client-side `fi_cq_read` always seems to return
`-FI_EAGAIN`.  Also, the `FI_CONTEXT` mode does not appear in the output of
`fi_tostr(pv, FI_TYPE_INFO)`.
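
For reference, the completion wait on the client can be sketched roughly as
below.  This is a minimal stand-alone sketch, not the application's actual
code: it does not include the libfabric headers, so `cq_read_fn`,
`wait_for_completion`, `struct fake_cq` and `fake_cq_read` are hypothetical
names introduced only for illustration.  In the real application the callback
would wrap `fi_cq_read(cq, &entry, 1)`; the sketch uses plain `EAGAIN`, which
is what `FI_EAGAIN` is defined as in `rdma/fi_errno.h`.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback type: returns the number of completions read (> 0),
 * -EAGAIN when the queue is empty, or another negative errno-style code.
 * In the real application this would wrap fi_cq_read(cq, &entry, 1). */
typedef ssize_t (*cq_read_fn)(void *ctx);

/* Poll until one completion arrives, an error is reported, or max_spins
 * attempts have been made.  Returns 0 on success, -ETIMEDOUT if the queue
 * stayed empty, or the negative error code returned by the callback. */
static int wait_for_completion(cq_read_fn read_once, void *ctx, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        ssize_t n = read_once(ctx);

        if (n > 0)
            return 0;               /* got a completion */
        if (n < 0 && n != -EAGAIN)
            return (int)n;          /* real error: stop polling */
        /* n == -EAGAIN (or 0): nothing yet, try again */
    }
    return -ETIMEDOUT;
}

/* Stand-in for a completion queue (assumption, for illustration only):
 * it reports "empty" for the first `empty_reads` calls, then one completion. */
struct fake_cq {
    int empty_reads;
    int calls;
};

static ssize_t fake_cq_read(void *ctx)
{
    struct fake_cq *cq = ctx;

    cq->calls++;
    if (cq->empty_reads > 0) {
        cq->empty_reads--;
        return -EAGAIN;
    }
    return 1;
}
```

With a bound on the number of attempts, a queue that never produces a
completion shows up as `-ETIMEDOUT` instead of an endless loop.  With
libfabric itself, a return value of `-FI_EAVAIL` from `fi_cq_read` means an
error entry is pending and can be retrieved with `fi_cq_readerr`, so a loop
like this would also surface a failed send rather than spinning on it.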

The logs [0] show at lines 178-187 that no verbs provider has been
found; however, `fi_getinfo` still returns a valid verbs provider.  I am
confused.  Could you please shed some light on this?

Cheers,
Hubert Hirtz

[0] https://paste.sr.ht/~taiite/3c575bd291301366cc7a6ce3474749c3a1ca420b

