[openib-general] IPoIB and lid change
Michael Krause
krause at cup.hp.com
Fri Feb 10 10:17:53 PST 2006
At 09:43 AM 2/10/2006, Grant Grundler wrote:
>On Fri, Feb 10, 2006 at 11:05:34AM -0500, Hal Rosenstock wrote:
> > > Hi, Roland!
> > > One issue we have with IPoIB is that IPoIB may cache a remote node path
> > > for a long time. Remote LID may get changed e.g. if the SM is changed,
> > > and IPoIB might lose connectivity.
>
>I wonder if this is why when I reload the IB drivers on one node
>I sometimes have to reload them on other nodes too. Otherwise
>ping over IPoIB doesn't work.
If endnodes neither refresh their caches periodically nor subscribe to
event management to be told when a refresh is in order, they will fall out
of sync and will need to be restarted to re-establish communication. This
is a classic problem that showed up in various early router protocols, and
it is why today's protocols in many cases implement a two-pronged approach:
limited cache lifetime and proactive cache event updates.
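As a rough illustration of that two-pronged approach, here is a minimal
sketch of a path cache that both expires entries after a fixed lifetime and
drops them when an event arrives. It is not IPoIB code: PathCache, resolve,
ttl and on_event are hypothetical names, and resolve merely stands in for
whatever path query the endnode would really issue.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PathEntry:
    lid: int            # destination LID as last resolved
    expires_at: float   # clock value after which the entry is stale


class PathCache:
    """Hypothetical path cache: limited lifetime plus event-driven invalidation."""

    def __init__(self, resolve: Callable[[str], int], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve   # assumption: stand-in for a real path query
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, PathEntry] = {}

    def lookup(self, gid: str) -> int:
        """Return the cached LID, re-resolving if the entry is missing or expired."""
        entry = self._entries.get(gid)
        if entry is None or self._clock() >= entry.expires_at:
            lid = self._resolve(gid)
            entry = PathEntry(lid, self._clock() + self._ttl)
            self._entries[gid] = entry
        return entry.lid

    def on_event(self, gid: Optional[str] = None) -> None:
        """Drop one entry, or the whole cache when no GID is given."""
        if gid is None:
            self._entries.clear()
        else:
            self._entries.pop(gid, None)
```

The lifetime bounds how long a stale LID can survive if an event is lost;
the event hook avoids waiting out the full lifetime when a change is known.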
> > The remote LID may get changed for other reasons too without an SM
> > change (SM merge of 2 separate subnets). How can this be handled ?
>
>Isn't this just another case of the SM changing for one of the subnets?
An SM merge that involves updating LIDs is a non-trivial event. It requires
connections to be effectively restarted, because otherwise one cannot
ascertain whether all packets have been flushed from the fabric, and stale
packets can cause silent data corruption. For a subsystem such as IPoIB, a
LID update should result in an unsolicited ARP / ND exchange, which will
cause all remote endnodes to receive the new information.
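To make the unsolicited-announcement idea concrete, here is a toy in-memory
model, assuming a single broadcast domain. Fabric, Node, on_lid_change and
on_announce are invented for illustration, and the message carried here (an
IP address plus a LID) is a simplification, not the real ARP / ND payload.

```python
from typing import Dict, List


class Fabric:
    """Toy broadcast domain (assumption: stands in for the broadcast group)."""

    def __init__(self) -> None:
        self.nodes: List["Node"] = []

    def broadcast(self, sender: "Node", ip: str, lid: int) -> None:
        # Deliver the announcement to every node except the sender.
        for node in self.nodes:
            if node is not sender:
                node.on_announce(ip, lid)


class Node:
    """Hypothetical endnode with a per-neighbor cache of LIDs keyed by IP."""

    def __init__(self, fabric: Fabric, ip: str, lid: int) -> None:
        self.fabric = fabric
        self.ip = ip
        self.lid = lid
        self.neighbors: Dict[str, int] = {}
        fabric.nodes.append(self)

    def on_lid_change(self, new_lid: int) -> None:
        # Record the new LID, then announce it without waiting to be asked.
        self.lid = new_lid
        self.fabric.broadcast(self, self.ip, new_lid)

    def on_announce(self, ip: str, lid: int) -> None:
        # Overwrite whatever was cached for this neighbor.
        self.neighbors[ip] = lid
```

In this model a remote node that had cached the old LID picks up the new one
as soon as the announcement is delivered, without being restarted.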
Mike