[openib-general] Re: Dual Sided RMPP Support as well as OpenSM Implications

Mon Apr 10 04:18:35 PDT 2006

Hi Sean,

On Fri, 2006-04-07 at 18:17, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > It can be added but may require an API change and possibly an ABI
> > change. It seems that user space code needs to both say and know whether
> > dual sided RMPP is supported or not so all mixes of user space and
> > kernel code could "work".
> 
> Do we really need to support these combinations?  

OpenSM still supports gen1 as well as OpenIB, so I think these
combinations are relevant if that is to continue. There is also the
possibility of running a newer OpenSM on an older kernel that does not
support this (or at least not properly).

> Does anything use the GetMulti method today?

Yes, but not by OpenIB. Once OpenIB does support it, the issue becomes
what happens if OpenSM/OpenIB is not being used and the other SMs lag
behind or decide not to support this.

> Is dual-RMPP used with anything other than MultiPathRecords?

No.

> This is my understanding of what needs to happen to support dual-sided RMPP.
> 
> Node A sends an RMPP message.  This requires normal RMPP processing.
> Node A sends an ACK of the final ACK (I'll call ACK2), giving a new window.
> Node B receives ACKs.
> Node B sends the response.  This requires normal RMPP processing.
> 
> From the perspective of node A, the RMPP code only needs to know to send ACK2.

There's more to the state machine than that: turning the direction
around means the sender becomes the receiver. I thought that this was
the "harder" direction change.

> It can do this based on the method, or per transaction if directed by the client.

Yes; I was thinking of a class/method-based approach for this.
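
As a rough sketch of what a class/method check could look like (the
struct and function names here are hypothetical and not taken from the
existing MAD code; the numeric values are the SubnAdm class and the
SubnAdmGetMulti method as I read them in the IBA spec):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as I understand them from the IBA spec; treat as assumptions. */
#define SKETCH_MGMT_CLASS_SUBN_ADM 0x03
#define SKETCH_SA_METHOD_GET_MULTI 0x14

/* Hypothetical, trimmed-down view of the MAD header fields we care about. */
struct mad_hdr_sketch {
	uint8_t mgmt_class;
	uint8_t method;
};

/*
 * Decide from class/method alone whether the sender must follow the
 * final ACK with the turnaround ACK (ACK2). Today only the SA GetMulti
 * request would qualify.
 */
static bool rmpp_is_dual_sided(const struct mad_hdr_sketch *hdr)
{
	assert(hdr != NULL);
	return hdr->mgmt_class == SKETCH_MGMT_CLASS_SUBN_ADM &&
	       hdr->method == SKETCH_SA_METHOD_GET_MULTI;
}
```

A per-transaction flag set by the client would replace the body of this
one function and leave the callers alone.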

> Node B is more complex.  It must now wait for ACK2, using timeout and retries of 
> ACK1 until ACK2 is received.  And the response that will be generated by the 
> client must be delayed until that ACK2 is received.

Yes, but isn't much of this already needed for the normal termination
case, or is that part not implemented yet?

> For node B, it may be simpler to delay handing the request up to a client until 
> ACK2 is received.

As you suggest, it makes sense not to hand it up in the dual sided case
until the turnaround ACK is received (ACK2) because if that ACK is not
received, one would want to indicate an error on the transaction (and
not send the response).
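
To make the node B side concrete, here is a minimal sketch of the
receive-side handling as I understand the proposal. All names, states
and the retry accounting are made up for illustration and are not from
the current RMPP code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum b_state  { B_RECEIVING, B_WAIT_ACK2, B_DELIVERED, B_FAILED };
enum b_event  { EV_LAST_SEGMENT, EV_ACK2, EV_TIMEOUT };
enum b_action { ACT_NONE, ACT_SEND_ACK1, ACT_HAND_UP, ACT_REPORT_ERROR };

/* Hypothetical per-transaction receive context on node B. */
struct b_recv {
	enum b_state state;
	int retries_left;          /* ACK1 retransmissions still allowed */
	uint32_t new_window_last;  /* saved from ACK2 for the response */
};

/* Feed one event in, get back what the RMPP layer should do next. */
static enum b_action b_recv_event(struct b_recv *r, enum b_event ev,
				  uint32_t nwl)
{
	assert(r != NULL);
	switch (r->state) {
	case B_RECEIVING:
		if (ev == EV_LAST_SEGMENT) {
			/* Request is complete: ACK it, then wait for ACK2. */
			r->state = B_WAIT_ACK2;
			return ACT_SEND_ACK1;
		}
		break;
	case B_WAIT_ACK2:
		if (ev == EV_ACK2) {
			/* Only now is the request handed up to the client. */
			r->new_window_last = nwl;
			r->state = B_DELIVERED;
			return ACT_HAND_UP;
		}
		if (ev == EV_TIMEOUT) {
			if (r->retries_left > 0) {
				r->retries_left--;
				return ACT_SEND_ACK1; /* retry ACK1 */
			}
			/* No ACK2 ever arrived: fail, send no response. */
			r->state = B_FAILED;
			return ACT_REPORT_ERROR;
		}
		break;
	default:
		break;
	}
	return ACT_NONE;
}
```

The point of the sketch is that the client never sees the request unless
ACK2 has arrived, so the error case needs no cleanup on the client side.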

> The only information from ACK2 that's needed when sending the
> response is NewWindowLast.  A client could be expected to give this back to the 
> RMPP layer when sending the response.  (A client that lied about NewWindowLast 
> should only lead to sending some packets that would be dropped, with the 
> transaction aborted.)

Good idea. That would eliminate the need for some context transfer from
the receive side to the send side in the RMPP code itself.

Leaving NewWindowLast (NWL) up to the client could have the effect you
mentioned, but the value is known to the RMPP core, so we needn't rely
on the client for it.

> So, if we always make the sender of an RMPP message specify NewWindowLast, with 
> a default of 1 set when the MAD is allocated, then we can keep RMPP consistent. 
> And we'd only be left handling ACK2.

This is a clever idea. I want to think about it some more. Thanks.
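
To check my understanding, a small sketch of the send side under that
scheme. The struct, the function names and the clamping of a zero
window are my own assumptions, not existing code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical send context; new_window_last travels with the MAD. */
struct rmpp_send_sketch {
	uint32_t seg_count;        /* total segments in the message */
	uint32_t new_window_last;  /* last segment we may send before an ACK */
};

/* "Allocation": every send starts with the default window of 1. */
static void rmpp_send_init(struct rmpp_send_sketch *s, uint32_t seg_count)
{
	assert(s != NULL);
	s->seg_count = seg_count;
	s->new_window_last = 1;
}

/*
 * Number of segments to send before waiting for the first ACK.
 * A bogus window of 0 is treated as 1 (assumption), and the window
 * never extends past the end of the message.
 */
static uint32_t rmpp_initial_burst(const struct rmpp_send_sketch *s)
{
	uint32_t nwl;

	assert(s != NULL);
	nwl = s->new_window_last ? s->new_window_last : 1;
	return nwl < s->seg_count ? nwl : s->seg_count;
}
```

An ordinary send leaves the default alone. A dual-sided response would
overwrite new_window_last with the value from ACK2 before sending, and
the send path itself would not need to know the difference.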

-- Hal

> - Sean