[ofa-general] iWARP peer-to-peer revised proposal

Kanevsky, Arkady Arkady.Kanevsky at netapp.com
Tue Dec 11 14:07:16 PST 2007


The goal is to implement something in IW_CM now that solves the interop
issues and that we can use as a starting point for a submission to the IETF
as an MPA "extension".

The proposal is for the initiator side to generate the first message (RTU)
in RDDP mode.
RTU stands for Ready to Use, the 3rd MPA message.
 
Initiator side:
1. IW_CM sends first MPA message (request).
 
Option A: no change for MPA request.
Option B: Steal a bit from the reserved field to indicate that the Initiator
"supports" the peer-to-peer model and wants to use it.
The default value is 0, indicating that the Initiator does not support the
peer-to-peer model; this is the same as the current MPA format.
Value 1 indicates support.
Option C: The same as Option B, but steal a bit from the private data
instead of the reserved field.

[For the quick fix, Option A is the easiest. For the MPA
extension proposal, Option B looks better.]
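
A minimal sketch of what Option B could look like on the initiator side.
The header layout and the M/C/R flag positions follow the MPA request frame
in RFC 5044; the MPA_FLAG_P2P value (the first of the five reserved bits)
and both function names are assumptions made for illustration, not part of
any agreed format.

```c
#include <stdint.h>
#include <string.h>

/* Flag byte of the MPA request/reply frame: M, C, R, then 5 reserved bits. */
#define MPA_FLAG_MARKERS 0x80
#define MPA_FLAG_CRC     0x40
#define MPA_FLAG_REJECT  0x20
/* Hypothetical: one reserved bit reused as the peer-to-peer indicator. */
#define MPA_FLAG_P2P     0x10

struct mpa_hdr {
    uint8_t  key[16];    /* "MPA ID Req Frame" or "MPA ID Rep Frame" */
    uint8_t  flags;
    uint8_t  revision;
    uint16_t pd_length;  /* private data length, network byte order */
};

/* Build an MPA request header; want_p2p != 0 sets the hypothetical bit. */
void mpa_init_request(struct mpa_hdr *h, int want_p2p)
{
    memset(h, 0, sizeof(*h));
    memcpy(h->key, "MPA ID Req Frame", sizeof(h->key));
    h->revision = 1;
    if (want_p2p)
        h->flags |= MPA_FLAG_P2P;
}

/* Returns 1 if the sender of this header advertised peer-to-peer support. */
int mpa_peer_supports_p2p(const struct mpa_hdr *h)
{
    return (h->flags & MPA_FLAG_P2P) != 0;
}
```

With want_p2p set to 0 the flags byte stays 0, so the request is
indistinguishable from the current MPA format, which is the default
behaviour described above.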

2. Responder side
The MPA response indicates whether or not the Responder can handle an RTU
message and what type of message the RTU should be.
 
Option A: Steal a couple of bits from the reserved field.
They encode the format of the RTU message the Responder can handle.
The default value of 0 means that it can NOT handle an RTU message.
(This means that the Initiator must send the 1st message as part of ULP to ULP
traffic.) The other 3 values represent 0B RR, 0B RW, and untagged 0B Send.
Option B: The same as above, but steal the bits from the private data instead
of the reserved field.
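
The two-bit encoding in Option A can be sketched as below. The bit
positions (MPA_RTU_SHIFT) and the numeric values assigned to the three RTU
formats are assumptions; the proposal only fixes that 0 means "cannot
handle RTU".

```c
#include <stdint.h>

/* RTU formats the Responder can advertise; numeric order is an assumption. */
enum rtu_type {
    RTU_NONE       = 0,  /* Responder cannot handle an RTU message */
    RTU_ZERO_READ  = 1,  /* 0B RR */
    RTU_ZERO_WRITE = 2,  /* 0B RW */
    RTU_ZERO_SEND  = 3   /* untagged 0B Send */
};

/* Hypothetical placement: two of the reserved bits in the flags byte. */
#define MPA_RTU_SHIFT 3
#define MPA_RTU_MASK  (0x3u << MPA_RTU_SHIFT)

/* Store the RTU type in the flags byte, leaving all other bits untouched. */
uint8_t mpa_set_rtu_type(uint8_t flags, enum rtu_type t)
{
    return (uint8_t)((flags & ~MPA_RTU_MASK) |
                     (((unsigned)t << MPA_RTU_SHIFT) & MPA_RTU_MASK));
}

/* Extract the RTU type the Responder advertised. */
enum rtu_type mpa_get_rtu_type(uint8_t flags)
{
    return (enum rtu_type)((flags & MPA_RTU_MASK) >> MPA_RTU_SHIFT);
}
```

A response from a Responder that does not know about this scheme leaves the
reserved bits at 0, so the Initiator decodes RTU_NONE and falls back to
sending the first message as ULP traffic.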
 
3. The ULP must use the same post-to-Send-Queue model as for IB.
That is, no posting to the SQ until you get the connection setup completion
event.
(I am not aware of any ULP which does not do that now.
If you are, please say so.)
 
4. Initiator (IW_CM):
Send RTU message based on the MPA response.
Either:
A. 0B RR (signalled?)
B. 0B RW (unsignalled?)
C. untagged 0B Send (unsignalled?)
 
If the RTU was sent signalled, then IW_CM generates the Connection Established
event when it reaps the completion.
If the RTU was sent unsignalled, then the Connection Established event is
generated when the post returns successfully.
[I prefer unsignalled, to avoid contaminating the CQ, which can be
shared.]
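
The rule for when the initiator reports Connection Established can be
written down as a small decision function. This is only a sketch of the
rule as stated above; the enum and the function name are hypothetical and
do not correspond to existing IW_CM code.

```c
/* What just happened to the RTU work request on the initiator side. */
enum rtu_outcome {
    RTU_POST_OK,        /* post to the SQ returned successfully */
    RTU_POST_FAILED,    /* post to the SQ returned an error */
    RTU_COMPLETION_OK,  /* completion reaped with success status */
    RTU_COMPLETION_ERR  /* completion reaped with error status */
};

/*
 * Returns 1 if the Connection Established event should be generated now.
 * Signalled RTU: wait for the successful completion.
 * Unsignalled RTU: a successful post is enough, no CQ entry is consumed.
 */
int initiator_report_established(int signalled, enum rtu_outcome what)
{
    if (signalled)
        return what == RTU_COMPLETION_OK;
    return what == RTU_POST_OK;
}
```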
 
5. Responder:
The responder does _not_ emit any FPDUs until it has received the RTU.
There is no need to dictate how this is done under the covers.

One way of doing it is for the iWARP vendor driver/FW/HW not to generate the
Connection Established event when the MPA response is sent.
Instead it waits for the RTU message to arrive, handles it under
the covers (vendor magic), and generates the Connection Established event
when it does.
It may temporarily use RR or RW credits, or a slot in an "under the covers" CQ.

Alternatively, an implementation _could_ pass up the ESTABLISHED event before
the RTU arrives but stall the SQ until it does arrive...
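
Both responder variants can be modelled with the same tiny state record.
The sketch below is illustrative only: the struct, its fields, and the
function names are hypothetical, and a real implementation would live in
the vendor driver/FW/HW. The invariant in both variants is that nothing
leaves the SQ before the RTU has been seen.

```c
struct responder_state {
    int early_established;    /* 1: report ESTABLISHED when the reply is sent */
    int rtu_seen;             /* 1 once the RTU message has arrived */
    int established_reported; /* 1 once the event was passed up to the ULP */
};

/* Called when the MPA response has been sent. */
void responder_reply_sent(struct responder_state *r)
{
    if (r->early_established)
        r->established_reported = 1; /* second variant: event first, SQ stalled */
}

/* Called when the RTU message arrives from the initiator. */
void responder_rtu_received(struct responder_state *r)
{
    r->rtu_seen = 1;
    r->established_reported = 1;     /* first variant reports the event here */
}

/* The SQ may emit FPDUs only after the RTU has arrived, in either variant. */
int responder_may_send(const struct responder_state *r)
{
    return r->rtu_seen;
}
```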

6. While the MPA response message does not have a timeout parameter,
the RDMA_CM connect_accept message will have a timeout parameter
if an RTU message was requested. If the RTU message does not arrive within
the timeout, the RDMA connection is torn down.
Currently, RDMA_CM has a default timeout value for IB. I suggest
we keep that as the default value for the MPA response.
It corresponds to a user-specified timeout value of 0.
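
The timeout handling amounts to two small checks, sketched here. The
function names and the millisecond units are assumptions; the actual
RDMA_CM default is not stated in this message, so it is passed in by the
caller rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>

/* A user-specified timeout of 0 selects the RDMA_CM default. */
uint32_t rtu_effective_timeout_ms(uint32_t user_ms, uint32_t default_ms)
{
    return user_ms == 0 ? default_ms : user_ms;
}

/*
 * Returns 1 if the RTU has not arrived within the timeout, in which case
 * the responder tears the RDMA connection down.
 * accept_ms: time the MPA response was sent; now_ms: current time.
 */
int rtu_wait_expired(uint64_t accept_ms, uint64_t now_ms, uint32_t timeout_ms)
{
    assert(now_ms >= accept_ms); /* the clock is assumed to be monotonic */
    return (now_ms - accept_ms) >= timeout_ms;
}
```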

7. For OFA interop, let's agree on what the RTU message type will be.
My recommendation is an unsignalled 0B Read.

Arkady Kanevsky                     email: arkady at netapp.com
Network Appliance Inc.              phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300


