[libfabric-users] Sockets is writing two errors to cq on disconnects
Carsten Patzke
carsten.patzke at desy.de
Thu Apr 2 04:57:27 PDT 2020
With tcp;ofi_rxm, the call to fi_send itself reports "No route to host".
No entry is written to the queue,
which I guess is perfectly acceptable.
The reason I use sockets instead of tcp;ofi_rxm is that I ran into RMA issues with it.
I used the same arguments as with verbs;ofi_rxm and sockets:
hints->domain_attr->mr_mode = FI_MR_ALLOCATED | FI_MR_VIRT_ADDR | FI_MR_PROV_KEY
libfabric:18695:tcp:ep_data:tcpx_validate_rx_rma_data():387<warn> invalid rma iov received
libfabric:18695:tcp:domain:tcpx_get_rx_entry_op_write():544<warn> invalid rma data
I probably have to debug the internal code again to know where the error lies.
----- Original Message -----
From: "Sean Hefty" <sean.hefty at intel.com>
To: "Carsten Patzke" <carsten.patzke at desy.de>, "libfabric-users" <libfabric-users at lists.openfabrics.org>
Sent: Thursday, April 2, 2020 12:19:31 AM
Subject: RE: Sockets is writing two errors to cq on disconnects
> I am currently using sockets;rdm and I've noticed that when I try to send data to an
> already closed connection
> two error completions will be generated when called for the first time.
Can you test with tcp;rxm and see if that works for your application?
> The first error completion is just a mostly empty one with no context and an error of
> FI_EIO.
> The second error is the expected one with proper context and buffer (err is also
> FI_EIO).
>
> Is this intended?
This sounds like a bug.
> Is there a way to catch the first error without just checking if the context is set?
I would need to understand why there are 2 error entries being written.
> My plan in general is to detect whether there are stale requests and,
> after a certain timeout, send a ping packet to the other peer to check whether the
> connection is still working.
Hmm... We should look at exposing some sort of keep-alive option to apps. I'll think about this and see what could be done.
- Sean
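
For illustration, the stale-request plan quoted above could be sketched roughly as follows. This is only a minimal sketch in plain C and does not call libfabric. The names req_tracker, tracker_post, tracker_complete, tracker_needs_ping, and MAX_PENDING are hypothetical and do not come from libfabric or from this thread. It assumes the application records the operation context it passes to each send, removes that context when a completion (success or error) arrives, and supplies timestamps from a monotonic millisecond clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64 /* hypothetical upper bound on outstanding sends */

/* One outstanding request, identified by its operation context. */
struct pending_req {
    void    *context;   /* context the application passed with the send */
    uint64_t posted_ms; /* time at which the request was posted */
    bool     in_use;
};

struct req_tracker {
    struct pending_req slots[MAX_PENDING];
    uint64_t timeout_ms; /* age at which a request counts as stale */
};

static void tracker_init(struct req_tracker *t, uint64_t timeout_ms)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++)
        t->slots[i].in_use = false;
    t->timeout_ms = timeout_ms;
}

/* Record a newly posted request. Returns false if the context is NULL
 * or no free slot is left. */
static bool tracker_post(struct req_tracker *t, void *context, uint64_t now_ms)
{
    if (context == NULL)
        return false;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].context = context;
            t->slots[i].posted_ms = now_ms;
            t->slots[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Remove the request matching a completion's context. Returns false if
 * the context is unknown, e.g. a completion that carries no context. */
static bool tracker_complete(struct req_tracker *t, void *context)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (t->slots[i].in_use && t->slots[i].context == context) {
            t->slots[i].in_use = false;
            return true;
        }
    }
    return false;
}

/* True if at least one outstanding request has reached the timeout,
 * which is the point at which the application would send its ping.
 * Assumes now_ms is never earlier than any recorded posted_ms. */
static bool tracker_needs_ping(const struct req_tracker *t, uint64_t now_ms)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (t->slots[i].in_use &&
            now_ms - t->slots[i].posted_ms >= t->timeout_ms)
            return true;
    }
    return false;
}
```

In such a scheme the application would call tracker_post after each successful send, tracker_complete for every entry read from the completion queue, and tracker_needs_ping from its progress loop. A false return from tracker_complete would also flag an error completion that arrives without a known context, such as the first of the two errors described above.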