[libfabric-users] Sockets is writing two errors to cq on disconnects

Hefty, Sean sean.hefty at intel.com
Thu Apr 2 09:42:34 PDT 2020


> hints->domain_attr->mr_mode = FI_MR_ALLOCATED | FI_MR_VIRT_ADDR | FI_MR_PROV_KEY
> 
> libfabric:18695:tcp:ep_data:tcpx_validate_rx_rma_data():387<warn> invalid rma iov received
> libfabric:18695:tcp:domain:tcpx_get_rx_entry_op_write():544<warn> invalid rma data
> 
> I probably have to debug the internal code again to know where the error lies.

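Check the mr_mode bits being returned, not just the ones you set in the hints.  They are likely being cleared by the provider.  If FI_MR_VIRT_ADDR is not set in the returned mode, then the RMA should use a 0-based address (an offset from the start of the region) as the base, rather than the peer's virtual address for the region.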
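
As a rough illustration of the rule above, here is a minimal sketch in C.  The helper names (mr_mode_cleared, rma_target_addr) and the DEMO_MR_VIRT_ADDR constant are made up for this example so that it compiles without libfabric; real code would include <rdma/fi_domain.h> and test FI_MR_VIRT_ADDR itself.

```c
#include <stdint.h>

/* Stand-in for FI_MR_VIRT_ADDR so this sketch builds without libfabric.
 * This is an assumption for illustration only; real code should include
 * <rdma/fi_domain.h> and use FI_MR_VIRT_ADDR. */
#define DEMO_MR_VIRT_ADDR (1u << 4)

/* Hypothetical helper: return the bits that were requested in the hints
 * but are missing from the mr_mode that came back. */
static uint64_t mr_mode_cleared(uint64_t requested, uint64_t granted)
{
    return requested & ~granted;
}

/* Hypothetical helper: choose the target address for an RMA operation.
 *   granted_mr_mode  mr_mode bits that were returned (not the hints)
 *   peer_base        virtual address of the peer's registered region
 *   offset           byte offset into that region
 */
static uint64_t rma_target_addr(uint64_t granted_mr_mode,
                                uint64_t peer_base, uint64_t offset)
{
    if (granted_mr_mode & DEMO_MR_VIRT_ADDR)
        return peer_base + offset;  /* virtual addressing */
    return offset;                  /* 0-based addressing */
}
```

With libfabric, the granted value would presumably be info->domain_attr->mr_mode from the fi_info that fi_getinfo returned, and the result would be passed as the addr argument of a call such as fi_write.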

- Sean

