[ofa-general] Re: Re: IPoIB-CM UC mode

Michael S. Tsirkin mst at dev.mellanox.co.il
Tue Jul 3 10:23:12 PDT 2007


> Quoting Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [ofa-general] Re: Re: IPoIB-CM UC mode
> 
> >That it's a general IB problem and should be addressed at IB level.
> >Which it seems to be - with CM.
> 
> I understand the simplicity that using LAP for an out-of-band keep alive 
> message can give you, but that's not the intent of the message.

I guess so - but even if the responder happens to do a modify QP as a result,
and erroneously responds with APR, that's not too bad.

> (You 
> could also use REQ/REJ or SIDR REQ/SIDR REP messages for this carrying 
> the right private data...)

Hmm, I don't see how REQ gives you data on an existing connection. Further,
this would need a spec extension to define the private data format then?
LAP trick works out of the box ...
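
As a rough illustration of the LAP trick, here is a minimal sketch of a
per-connection keep-alive loop. It is not real CM code: `KeepAlive`,
`send_probe` and `max_missed` are hypothetical names, and the sketch assumes
that any APR coming back within the timeout (even one rejecting the alternate
path) counts as proof that the remote side still knows about the connection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KeepAlive:
    """Out-of-band keep-alive state for one connection (illustrative only)."""

    # Hypothetical hook: sends one LAP on the connection and returns True
    # if an APR arrived before the timeout, False otherwise.
    send_probe: Callable[[], bool]
    max_missed: int = 3   # assumed threshold, not taken from any spec
    missed: int = 0
    alive: bool = True

    def tick(self) -> bool:
        """Run one probe interval; return whether the connection is still
        considered alive."""
        if not self.alive:
            return False  # already declared dead, stop probing
        if self.send_probe():
            self.missed = 0  # peer answered, reset the miss counter
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.alive = False  # caller should tear the connection down
        return self.alive


# Simulated peer that answers once and then goes silent.
replies = iter([True, False, False, False])
ka = KeepAlive(send_probe=lambda: next(replies))
states = [ka.tick() for _ in range(4)]
```

The miss counter is there so that a single lost MAD does not kill the
connection; how many misses to tolerate is a policy choice for the ULP.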

> If we don't want to require apps to send in-band keep alive messages, 
> then I think we should explore all potential out-of-band solutions.

I actually think a single working solution is enough.
No need to explore all of them :).

> For 
> example, event registration could be used to detect that a remote node 
> has gone down.
> We could use per node keep alive messages, rather than 
> per connection messages.

No, these won't address cases such as DREQ timeout after remote
decides to close connection, without reboot.

> We could add a new out-of-band keep alive message.
> Or clearly define that LAP is the preferred way for all 
> connections to do keep alives.

Sure, someone might need to talk at IBTA about these clarifications.

-- 
MST


