[ofa-general] disconnect implementation for rdma cm unconnected datagram service

Or Gerlitz ogerlitz at voltaire.com
Sun Jun 17 02:17:24 PDT 2007


Hi Sean,

Looking at cm_sidr_rep_handler, we see that the cm id state
is reset to IB_CM_IDLE, while on the other hand ib_send_cm_dreq
returns -EINVAL if the id state is not IB_CM_ESTABLISHED. I guess
this means that rdma_disconnect on RDMA_PS_UDP would never work?
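
To make the sequence concrete, here is a minimal stand-alone model of it.
This is only a sketch and not the kernel code: the model_* names and the
reduced set of states are made up for illustration, and the only behaviour
taken from the real code is the state reset in the SIDR REP handler and the
state check in the DREQ send path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced state set; the real enum ib_cm_state has more. */
enum model_cm_state {
	MODEL_CM_IDLE,
	MODEL_CM_SIDR_REQ_SENT,
	MODEL_CM_ESTABLISHED
};

struct model_cm_id {
	enum model_cm_state state;
};

/* Models the effect of cm_sidr_rep_handler on the id state. */
static void model_sidr_rep_handler(struct model_cm_id *id)
{
	id->state = MODEL_CM_IDLE;
}

/* Models the state check done by ib_send_cm_dreq. */
static int model_send_dreq(const struct model_cm_id *id)
{
	if (id->state != MODEL_CM_ESTABLISHED)
		return -EINVAL;
	return 0;
}

/* Resolve through SIDR, then try to disconnect; returns the DREQ result. */
static int model_udp_disconnect(void)
{
	struct model_cm_id id = { MODEL_CM_SIDR_REQ_SENT };

	model_sidr_rep_handler(&id);
	return model_send_dreq(&id);
}
```

In this model the id is never in the established state after the SIDR
exchange, so model_udp_disconnect() always returns -EINVAL.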

Now, even with that fixed, the disconnect packets can get lost, or the
remote side can reboot/etc before the CM manages to send the DREQ packet(s).

Thinking about a remote qp/lid change, the equivalent I see for UDP based apps
is that a remote qp/lid change would be caught by the local stack's
neighbouring system, since it sends a few unicast ARP probes and then re-issues
a broadcast ARP from which the new HW address (qpn / gid --> lid) would be learned.

What do you think would be the correct way to solve this for rdmacm based apps?
Is there a way for the RDMA/IB stack level to provide the solution? We were
considering a few alternatives, but they are all at the app level (eg send probes
to the remote qp/lid, add another RC connection just for the sake of knowing
the remote process is still there, etc).
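
As a rough sketch of the first app-level alternative (probes), the
bookkeeping could look like the following. Everything here is hypothetical:
the peer_probe struct, the function names and the missed-probe threshold are
made up for illustration, and the actual sending of a probe datagram to the
remote qp/lid and the timer are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-peer probe state kept by the application. */
struct peer_probe {
	unsigned int missed;     /* consecutive probes with no reply */
	unsigned int max_missed; /* threshold before the peer is declared gone */
};

static void probe_init(struct peer_probe *p, unsigned int max_missed)
{
	p->missed = 0;
	p->max_missed = max_missed;
}

/* Call when any reply (or other traffic) from the peer is seen. */
static void probe_reply_seen(struct peer_probe *p)
{
	p->missed = 0;
}

/*
 * Call when a probe times out with no reply.
 * Returns true once the peer should be treated as disconnected.
 */
static bool probe_timeout(struct peer_probe *p)
{
	if (p->missed < p->max_missed)
		p->missed++;
	return p->missed >= p->max_missed;
}
```

When probe_timeout() returns true the app would treat the peer as
disconnected and resolve the remote address again, much like the neighbour
code falls back to a broadcast ARP after the unicast probes fail.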

I guess that a remote lid change can be emulated as a disconnect if the rdmacm
would listen on IN/OUT traps, but the question is what can we do about the
remote process qp, eg in the case the process dies and then comes back again etc.

thanks,

Or.


