[ofa-general] Re: Re: IPoIB-CM UC mode

Michael S. Tsirkin mst at dev.mellanox.co.il
Tue Jul 3 02:47:03 PDT 2007


> Quoting Or Gerlitz <ogerlitz at voltaire.com>:
> Subject: Re: [ofa-general] Re: Re: IPoIB-CM UC mode
> 
> Michael S. Tsirkin wrote:
> 
> >>>>I didn't follow this.  Is this just an out of band keep alive message? 
> 
> >>>Yes. Exactly.
> 
> >>You may know that for each neighbour, the Linux network stack sends 
> >>a --unicast-- ARP probe every m jiffies; if there is no ARP reply 
> >>after n jiffies, it sends a broadcast ARP.
> 
> >How does this solve the problem?
> >If the remote side has lost the connection, unicast ARPs will get dropped
> >but broadcast ARPs will get answered. We'd need to re-create the
> >connection if this happens - but is there a way to detect this?
> 
> Yes, I know that there is a way to register for kernel level neighbour 
> update events, so on each neighbour update, ipoib cm reconnects, plus 
> you can remove the fast path memcmp we do today on the remote GUID, and 
> we're done :)
> 
> This is because it covers both cases in which the unicast ARP probe is 
> not answered: either the --GID-- we have is not the correct one (e.g. 
> under an HA scheme) or the remote --QP-- is not what we think.

In the typical case (the remote side reboots), both the GID and the UD QPN stay
the same, so it seems there won't be any neighbour update, right?  If so, while
playing with neighbour update events might get us a data path speed-up, it will
not solve the problem of detecting whether the connection is still alive.
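
To illustrate the concern, here is a rough userspace sketch, not kernel code. It assumes that a neighbour update only becomes visible when the link-layer address carried in the ARP reply differs from the cached one, and it models the 20-byte IPoIB hardware address as one flags byte, a 3-byte UD QPN and a 16-byte GID. The names peer_hw_addr, make_hw_addr and neigh_update_visible are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IPOIB_HW_ADDR_LEN 20 /* 1 flags byte + 3-byte QPN + 16-byte GID */

/* Hypothetical model of the address cached for an IPoIB neighbour. */
struct peer_hw_addr {
    uint8_t bytes[IPOIB_HW_ADDR_LEN];
};

/* Build a hardware address from a UD QPN and a GID (illustration only). */
static struct peer_hw_addr make_hw_addr(uint32_t ud_qpn, const uint8_t gid[16])
{
    struct peer_hw_addr a;

    a.bytes[0] = 0;                      /* flags byte, ignored in this sketch */
    a.bytes[1] = (ud_qpn >> 16) & 0xff;  /* QPN, most significant byte first */
    a.bytes[2] = (ud_qpn >> 8) & 0xff;
    a.bytes[3] = ud_qpn & 0xff;
    memcpy(&a.bytes[4], gid, 16);
    return a;
}

/*
 * Assumption: an update is only noticed when the replied address differs
 * from the cached one. The state of the connected-mode QP is not part of
 * this address, so losing it does not change the comparison.
 */
static bool neigh_update_visible(const struct peer_hw_addr *cached,
                                 const struct peer_hw_addr *replied)
{
    return memcmp(cached->bytes, replied->bytes, IPOIB_HW_ADDR_LEN) != 0;
}
```

With the same GID and UD QPN before and after a reboot, neigh_update_visible() returns false even though the remote connected-mode QP is gone; it only returns true if the GID or the UD QPN actually changes.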


-- 
MST


