[ofa-general] disconnect implementation for rdma cm unconnected datagram service

Or Gerlitz or.gerlitz at gmail.com
Mon Jun 18 13:46:33 PDT 2007


On 6/18/07, Sean Hefty <mshefty at ichips.intel.com> wrote:
>
> > was that done on purpose? is there (eg implementation or spec related)
> > any problem to send DREQ through the CM?
>
> This is spec related - DREQ doesn't apply to UD QPs - only connected.


I see.

> > I did not mean to totally hide from the app (eg to the extent of no need
> > to re-create the address vector), I just wonder if the mechanics to
> > realize that an unconnected rdmacm id is not "connected" any more can be
> > fully implemented within the rdmacm.
>
> I don't see a way to do this underneath within the existing spec.  If
> the IB CM tracked SIDR lookups, maintaining state information, then we
> could make use of a DREQ type command to notify the remote side that the
> local QP is going away.  But this is outside of the spec, plus doesn't
> solve all of the issues (like a remote system reboot).
>
> I don't think there's even an existing trap that we can use.


I see.

> > Indeed, this is not easily possible in all cases for us, as we
> > are not always allowed to add a wire protocol on --this-- QP, but we are
> > looking into that. Another solution we are considering is to "invalidate"
> > the app level "address handle" (IB AH + remote QPN) every ten seconds or
> > so and then re-connect, but this is not very efficient.
>
> How does IPoIB handle this?  Does it just time out the ARP entries every
> x minutes, which requires a new lookup?


It's not IPoIB but rather the neighbour subsystem of the IP stack: it
sends unicast ARP probes every n seconds, and if m probes fail, it sends a
broadcast ARP. n and m are parameters that can be changed; I think the
defaults are n=20sec, m=3.
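
To make the probing scheme concrete, here is a rough sketch in Python. It is only a toy model of the behaviour described above, not the kernel's code; the names NeighbourEntry, next_probe, probe_result and simulate are made up for illustration, and each loop iteration stands for one timer tick (every n seconds).

```python
class NeighbourEntry:
    """Toy model of the probing described above (not the kernel code)."""

    def __init__(self, max_unicast_probes=3):
        # "m" in the text: unicast probes allowed to fail before
        # falling back to broadcast.
        self.max_unicast_probes = max_unicast_probes
        self.failed_probes = 0

    def next_probe(self):
        """Return which kind of ARP request to send on the next tick."""
        if self.failed_probes < self.max_unicast_probes:
            return "unicast"
        return "broadcast"

    def probe_result(self, answered):
        """Record whether the last probe was answered."""
        if answered:
            self.failed_probes = 0
        else:
            self.failed_probes += 1


def simulate(replies, max_unicast_probes=3):
    """Send one probe per tick; replies[i] says if tick i's probe is answered."""
    entry = NeighbourEntry(max_unicast_probes)
    sent = []
    for answered in replies:
        sent.append(entry.next_probe())
        entry.probe_result(answered)
    return sent
```

With m=3, three unanswered unicast probes in a row make the next probe a broadcast, and any answer resets the entry back to unicast probing.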

> Is there some way that you could map LIDs to QPNs, and use the
> SLID/src_qp data in the work completion to see if a remote service has
> moved QPs?


If the communication pattern is that both A sends to B and B sends to A,
then there is some path to follow here: for each packet (work
completion) A gets from B, it checks whether B's QPN has changed, and if so,
it re-connects.
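
Roughly, the check could look like the sketch below (Python, just to show the logic). It assumes one remote service per LID, and that the slid and src_qp values come from the receive work completion (the slid and src_qp fields of struct ibv_wc). PeerTable, connect, on_recv_completion and the resolve callback are hypothetical names, not part of the rdmacm or verbs API.

```python
class PeerTable:
    """Remember the QPN last resolved for each remote LID."""

    def __init__(self, resolve):
        # resolve(lid) -> qpn is a hypothetical callback that redoes the
        # lookup (e.g. a new SIDR request) and returns the current QPN.
        self.resolve = resolve
        self.qpn_by_lid = {}

    def connect(self, lid):
        """Resolve the remote QPN for lid and cache it."""
        self.qpn_by_lid[lid] = self.resolve(lid)
        return self.qpn_by_lid[lid]

    def on_recv_completion(self, slid, src_qp):
        """Call for every receive completion; returns True if we re-connected."""
        known = self.qpn_by_lid.get(slid)
        if known is None or known == src_qp:
            # Unknown peer, or the peer still uses the QP we know about.
            return False
        # The peer now sends from a different QP: redo the lookup.
        self.connect(slid)
        return True
```

This only helps when B actually sends something to A; if traffic is one-directional, A never sees a completion carrying B's new QPN.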

Or