On 6/18/07, <b class="gmail_sendername">Sean Hefty</b> <<a href="mailto:mshefty@ichips.intel.com">mshefty@ichips.intel.com</a>> wrote:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Or Gerlitz wrote:<br>> Looking at cm_sidr_rep_handler we see that the cm id state<br>> is reset to IB_CM_IDLE, and on the other hand ib_send_cm_dreq<br>> returns -EINVAL if the id state is not IB_CM_ESTABLISHED. I guess

<br>> this means that rdma_disconnect on RDMA_PS_UDP would never work?<br><br>Correct - there isn't a disconnect for UDP.</blockquote><div><br>

Was that done on purpose? Is there any problem (e.g. implementation or spec related) with sending a DREQ through the CM?<br>
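<br>
To make the two code paths quoted above concrete, here is a toy model in C. It is a sketch under stated assumptions, not the ib_cm source: toy_cm_id, toy_sidr_rep_handler and toy_send_dreq are hypothetical names, and only the two states mentioned in this thread are modelled.<br>

```c
#include <assert.h>
#include <errno.h>

/* Toy model only: the real ib_cm state machine has many more states. */
enum toy_cm_state { TOY_CM_IDLE, TOY_CM_ESTABLISHED };

struct toy_cm_id {
    enum toy_cm_state state;
};

/* Mirrors the behaviour described for cm_sidr_rep_handler:
 * once the SIDR REP has been processed, the id goes back to idle. */
static void toy_sidr_rep_handler(struct toy_cm_id *id)
{
    id->state = TOY_CM_IDLE;
}

/* Mirrors the precondition described for ib_send_cm_dreq:
 * a DREQ is only accepted on an established connection. */
static int toy_send_dreq(const struct toy_cm_id *id)
{
    if (id->state != TOY_CM_ESTABLISHED)
        return -EINVAL;
    return 0; /* the real code would build and post the DREQ here */
}
```

In this model an id that has been through the SIDR REP path can never pass the DREQ check, which matches the observation that a disconnect on RDMA_PS_UDP cannot succeed.<br>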

 </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> Thinking on remote qp/lid change, the equivalent I see for UDP based apps,<br>

> is that a remote qp/lid change would have been caught by the local stack<br>> neighbouring system since it sends a few unicast ARP probes and then re-issues<br>> a broadcast ARP from which the new HW address (qpn / gid --> lid) would be learned.

<br>><br>> What do you think would be the correct way to solve that for rdmacm based apps?<br><br>I don't know that we can do anything about a QP change.</blockquote><div><br>

Just to emphasize, the typical QP change here is when a remote server
process exits and is then spawned again, so the client now has to reconnect
or else all its packets go nowhere.<br>

 </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> is there a way for the RDMA/IB stack level to provide the solution? we were<br>

<br>Once the inform_info patches are in, we might be able to hook into that<br>to at least provide notification that the remote address has changed.  I<br>don't think there's a LID change notice, though, only GID IN/OUT.  LID

<br>changes would be difficult to hide from the app anyway, since the app<br>must re-create their address vector.</blockquote><div><br>

I did not mean hiding it from the app entirely (e.g. to the extent that
there is no need to re-create the address vector). I just wonder whether
the mechanics of realizing that an unconnected rdmacm id is no longer
"connected" can be fully implemented within the rdmacm.<br>

 </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">If we ever go as far as adding an rdma_send() call, we might be able to<br>hide it better.

</blockquote><div><br>

I don't think we want to go there.<br>

 </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> I guess that a remote lid change can be emulated as a disconnect if the rdmacm<br>> would listen for IN/OUT traps, but the question is what can we do about the

<br>> remote process qp, eg in the case the process dies and then comes back again etc.<br><br>I think the current solution is that the app must detect that they are<br>no longer getting responses from the remote side and try to

<br>re-'connect'.  I need to give this more thought to determine if there's<br>anything that we can do here.  (This seems hard without the rdma_cm<br>controlling the QP and CQs.)  Do you have any ideas?</blockquote>
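<br>
The application-side detection described in the quote above could look roughly like the following C sketch. It assumes the application already has some request/response traffic to observe; peer_liveness, peer_note_response and peer_needs_reconnect are hypothetical names, not part of the rdma_cm API.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-peer bookkeeping kept by the application. */
struct peer_liveness {
    time_t last_response;   /* when the last reply from the peer arrived */
    double timeout_seconds; /* silence tolerated before re-resolving */
};

/* Call whenever a reply from the peer is received. */
static void peer_note_response(struct peer_liveness *p, time_t now)
{
    p->last_response = now;
}

/* True when the peer has been silent for longer than the timeout,
 * i.e. the app should resolve the remote address and QPN again. */
static bool peer_needs_reconnect(const struct peer_liveness *p, time_t now)
{
    return difftime(now, p->last_response) > p->timeout_seconds;
}
```

This only works when the application protocol produces responses on the QP in question, which is the constraint raised in the reply below.<br>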

<div><br>

Indeed, this is not easily possible in all cases for us, as we
are not always allowed to add a wire protocol on --this-- QP, but we
are looking into that. Another solution we are considering is to "invalidate" the
app-level "address handle" (IB AH + remote QPN) every ten seconds or so
and then re-connect, but this is not very efficient.<br>
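<br>
A minimal sketch of the periodic invalidation idea, in C. The struct and function names (app_addr_handle, app_addr_set, app_addr_expire) are hypothetical, and the ten-second constant is only the rough figure mentioned above; the actual re-resolve step is left to the caller.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define ADDR_MAX_AGE_SECONDS 10.0 /* "every ten seconds or so" */

/* Hypothetical stand-in for the app-level handle (IB AH + remote QPN). */
struct app_addr_handle {
    uint32_t remote_qpn;
    time_t resolved_at; /* when the address was last resolved */
    bool valid;
};

/* Record a freshly resolved remote QPN. */
static void app_addr_set(struct app_addr_handle *h, uint32_t qpn, time_t now)
{
    h->remote_qpn = qpn;
    h->resolved_at = now;
    h->valid = true;
}

/* Marks the handle stale once it reaches the maximum age.
 * Returns true when the caller must re-resolve before sending. */
static bool app_addr_expire(struct app_addr_handle *h, time_t now)
{
    if (h->valid && difftime(now, h->resolved_at) >= ADDR_MAX_AGE_SECONDS)
        h->valid = false;
    return !h->valid;
}
```

The cost is visible in the sketch: the handle is thrown away on a timer even when the remote side has not changed, so every interval pays for a fresh resolve.<br>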

<br>

Or.<br>

 </div></div><br>