[ofa-general] Re: Re: IPoIB-CM UC mode

Michael S. Tsirkin mst at dev.mellanox.co.il
Tue Jul 3 11:37:03 PDT 2007


> Quoting Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [ofa-general] Re: Re: IPoIB-CM UC mode
> 
> >Hmm, I don't see how REQ gives you data on existing connection. Further,
> >this would need a spec extension to define private data format then?
> >LAP trick works out of the box ...
> 
> LAP keep-alives require the apps to implement the keep alive timers and 
> detection, but sends the messages out-of-band.  Why not send the 
> messages in-band?

Sure, this can be done. But that'd need ULP support, in this case an IPoIB
protocol extension.  Further, if the remote is up, it's nice to get a CM message
saying "connection was lost" directly rather than just a timeout.
What real advantages are there to doing this "in-band", as you say?
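
To make the timeout-versus-notification difference concrete, here is a rough
sketch of the per-connection bookkeeping a keep-alive user would need. It is
plain Python rather than kernel code, and every name in it (KeepAliveMonitor,
on_reply, on_connection_lost, expired) is made up for illustration; none of it
is an existing CM or IPoIB interface. The clock is injected so the logic can be
exercised without waiting.

```python
import time


class KeepAliveMonitor:
    """Hypothetical per-connection keep-alive tracker (not a real CM API)."""

    def __init__(self, interval, max_missed, clock=time.monotonic):
        self.interval = interval      # seconds between keep-alive probes
        self.max_missed = max_missed  # probes that may go unanswered
        self.clock = clock            # injectable time source
        self.last_reply = {}          # connection id -> time of last reply

    def add(self, conn_id):
        """Start tracking a connection, treating 'now' as its last reply."""
        self.last_reply[conn_id] = self.clock()

    def on_reply(self, conn_id):
        """Record that the remote answered a keep-alive probe."""
        if conn_id in self.last_reply:
            self.last_reply[conn_id] = self.clock()

    def on_connection_lost(self, conn_id):
        """Remote is up and explicitly reported the connection as gone.

        Returns True if we were still tracking it; no timeout is needed.
        """
        return self.last_reply.pop(conn_id, None) is not None

    def expired(self):
        """Return (and stop tracking) connections silent for too long."""
        now = self.clock()
        limit = self.interval * self.max_missed
        dead = [c for c, t in self.last_reply.items() if now - t > limit]
        for c in dead:
            del self.last_reply[c]
        return dead
```

With an explicit "connection was lost" message the stale connection is dropped
as soon as on_connection_lost() runs; without one, it lingers until expired()
catches it after interval * max_missed seconds.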

> Would it make more sense to implement the entire 
> keep-alive solution in the CM?

I think it doesn't matter much. Let's keep it where it's needed:
if more UC applications surface, we can rethink this decision,
and factor the code out.

> >I actually think a single working solution is enough.
> >No need to explore all of them :).
> 
> I'm not saying implement all of them, just make sure that we have the 
> best solution.  I can't think of one that I like better than using LAP, 
> but it feels like the CM protocol / MADs are being hijacked.  For 
> example, if there's only one path between two nodes, LAP doesn't really 
> make any sense, but it ends up being used.  Should we instead look at 
> adding new CM messages for just this purpose?

Sure, I agree, this would be nice. But I expect it will take a while
to get the standardization rolling. So I think we'll start with the LAP hack
and add support for the new CM message when/if it's there.

> >>For 
> >>example, event registration could be used to detect that a remote node 
> >>has gone down.
> >>We could use per node keep alive messages, rather than 
> >>per connection messages.
> >
> >No, these won't address cases such as DREQ timeout after remote
> >decides to close connection, without reboot.
> 
> Per node keep alive messages could.  It depends on what data is carried 
> in the message (e.g. all currently connected QPs to the node in 
> question).  I mentioned this because it may be more efficient under some 
> circumstances.

Yes. And with multiple connections per node, all the more so.
The CM message format does not seem like a good fit for this, though:
maybe some new kind of MAD?
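
As a rough illustration of the per-node idea, the sketch below packs a node
identifier plus the list of QPNs that node still considers connected, and lets
the receiver work out which of its own connections the sender no longer knows
about. The layout (64-bit GUID, 16-bit count, 32-bit fields holding 24-bit
QPNs) and all function names are assumptions made up for this example; it is
not an existing CM message or MAD format.

```python
import struct

# Hypothetical wire layout, network byte order, no padding:
#   node GUID (64 bits) | QPN count (16 bits) | count * QPN (32 bits each)
HEADER = struct.Struct("!QH")
QPN = struct.Struct("!I")


def pack_keepalive(node_guid, qpns):
    """Build one per-node keep-alive listing the sender's connected QPNs."""
    body = b"".join(QPN.pack(q & 0xFFFFFF) for q in sorted(qpns))
    return HEADER.pack(node_guid, len(qpns)) + body


def unpack_keepalive(data):
    """Parse a keep-alive into (node_guid, set of QPNs)."""
    node_guid, count = HEADER.unpack_from(data, 0)
    qpns = {
        QPN.unpack_from(data, HEADER.size + i * QPN.size)[0]
        for i in range(count)
    }
    return node_guid, qpns


def stale_connections(local_view, remote_reported):
    """QPNs we think are connected but the remote no longer lists."""
    return set(local_view) - set(remote_reported)
```

A connection that the remote closed, where the DREQ timed out, would show up in
stale_connections() on the next keep-alive even though the node never
rebooted. One message covers every connection to that node, which is where the
saving over per-connection messages comes from; a real MAD has a small fixed
payload, so a long QPN list would have to be split across several messages.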

-- 
MST


