[libfabric-users] Verbs provider not permitting FI_EP_MSG

Philip Davis philip.e.davis at rutgers.edu
Thu Jan 16 06:56:24 PST 2020


Hello,

I am working with a user who is running on an older InfiniBand cluster. Using libfabric with the following hints:

    hints->caps = FI_MSG | FI_SEND | FI_RECV | FI_REMOTE_READ |
                  FI_REMOTE_WRITE | FI_RMA | FI_READ | FI_WRITE;
    hints->mode = FI_CONTEXT | FI_LOCAL_MR | FI_CONTEXT2 | FI_MSG_PREFIX |
                  FI_ASYNC_IOV | FI_RX_CQ_DATA;
    hints->domain_attr->mr_mode = FI_MR_BASIC;
    hints->domain_attr->control_progress = FI_PROGRESS_AUTO;
    hints->domain_attr->data_progress = FI_PROGRESS_AUTO;
    hints->ep_attr->type = FI_EP_RDM;


No verbs providers are found. Looking through the debug output, I suspect this is the crucial line:

libfabric:verbs:fabric:fi_ibv_get_matching_info():1213<info> hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints

I take it that the underlying hardware is only compatible with the FI_PROTO_RDMA_CM_IB_XRC protocol for MSG endpoints, and it looks like I need to enable FI_SHARED_CONTEXT for these endpoints to be supported. I’m having some trouble understanding the implications of using FI_SHARED_CONTEXT. If I only ever use one endpoint, is there any functional or performance impact to setting it? I’d rather not switch to shared contexts unconditionally, so is there a good way for me to detect this situation other than doing a maximally permissive fi_getinfo and iterating through the verbs results?

Thanks,
Philip

