[openib-general] [PATCH] added comments to ib_mad.h - minor update

Sat Aug 7 08:06:16 PDT 2004

On Fri, 2004-08-06 at 13:17, Sean Hefty wrote: 
> > ib_mad_send_wr.sg_list indicates first entry must reference a data
> > buffer of 256 bytes. Is the "base" RMPP header in it ? Which fields
> > must be filled in by the client (RMPPActive, RRespTime, and length) ?
> > Will length be in the first segment only (total length) or also in the
> > last segment ?
> 
> Because the spec positioned the RMPP header in the middle of user-data (i.e. after the standard MAD header), 
> I think that the most efficient way to handle sending a MAD is for the client to hand the access layer a buffer 
> that contains both the MAD header and RMPP header.  For most MADs, 
> this results in a send work request that uses a single sg-entry.

I'm unconvinced about so-called "zero copy" RMPP. Someone has to do the
fragmentation/reassembly. Seems to me that should be hidden by the
access layer rather than exposed to the consumer. I think this is the
fundamental issue to resolve for RMPP. 
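
To make the fragmentation point concrete, here is a minimal sketch of the bookkeeping an access layer would need if it took one contiguous payload from the consumer and segmented it itself. The function names are hypothetical, and the 220-byte figure assumes a 256-byte MAD minus a 24-byte MAD header and a 12-byte RMPP header; classes that add their own header after the RMPP header carry less data per segment.

```c
#include <stddef.h>

/* Assumed data bytes per segment: 256 - 24 (MAD hdr) - 12 (RMPP hdr). */
#define SEG_DATA_SIZE 220

/* Number of segments needed for payload_len bytes (at least one). */
static size_t rmpp_segment_count(size_t payload_len)
{
    if (payload_len == 0)
        return 1;
    return (payload_len + SEG_DATA_SIZE - 1) / SEG_DATA_SIZE;
}

/* Data bytes carried by segment seg_num (1-based); 0 if out of range. */
static size_t rmpp_segment_len(size_t payload_len, size_t seg_num)
{
    size_t count = rmpp_segment_count(payload_len);

    if (seg_num == 0 || seg_num > count)
        return 0;
    if (seg_num < count)
        return SEG_DATA_SIZE;
    /* Last segment carries whatever is left over. */
    return payload_len - (count - 1) * SEG_DATA_SIZE;
}
```

With this split, the consumer only ever sees the total length; the per-segment lengths stay inside the access layer.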

> If we agree on this, then the intent here is that we don't want the user handing the access layer the RMPP header 
> split across multiple data buffers.  The real restriction is that the first sg-entry should reference both the 
> MAD and RMPP headers.  I think that the user only needs to set RRespTime and RMPPActive flag in the RMPP header 
> if using RMPP.  If RMPP is not used, they should set all fields to 0.

I'm not sure the consumer should need to set all fields to 0 when RMPPActive
is not set. It might be better for the access layer to do this, to be sure.
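
As a rough sketch of what I mean, the access layer could clear the RMPP header itself on the send path whenever the Active flag is not set. The struct, the function name, and the constants are hypothetical (nothing here is from ib_mad.h); the layout assumes a 12-byte RMPP header directly after the 24-byte MAD header, with the Active flag in the low bit of the byte that also holds RRespTime.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAD_SIZE         256   /* fixed MAD size */
#define MAD_HDR_SIZE      24   /* standard MAD header */
#define RMPP_FLAG_ACTIVE 0x01  /* assumed bit position of RMPPActive */

/* Hypothetical view of the RMPP header (12 bytes, no padding). */
struct rmpp_hdr_sketch {
    uint8_t  version;
    uint8_t  type;
    uint8_t  rtime_flags;    /* RRespTime: upper 5 bits, flags: lower 3 */
    uint8_t  status;
    uint32_t seg_num;        /* network byte order on the wire */
    uint32_t paylen_newwin;  /* network byte order on the wire */
};

/*
 * Called by the access layer before posting a send.
 * Returns 1 if the MAD uses RMPP (header left as the consumer set it),
 * 0 if it does not (RMPP header zeroed on the consumer's behalf).
 */
static int rmpp_prepare_send(uint8_t *mad)
{
    struct rmpp_hdr_sketch hdr;

    assert(mad != NULL);
    memcpy(&hdr, mad + MAD_HDR_SIZE, sizeof hdr);

    if (hdr.rtime_flags & RMPP_FLAG_ACTIVE)
        return 1;

    memset(mad + MAD_HDR_SIZE, 0, sizeof hdr);
    return 0;
}
```

This costs one 12-byte memset per non-RMPP send and removes a class of consumer bugs where stale header bytes go out on the wire.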

> > On the receive side, we need to handle either if we have
> > to deal with non OpenIB implementations :-(
> 
> On the receive side, we control the data buffers, so this isn't an issue.  
> We should just post receive buffers of 256 + sizeof(grh).

I was thinking about the model where RMPP performs the coalescing on the
receive side, in which case I think this helps, as the segments can be
copied out and the receive buffers reused sooner.
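
A minimal sketch of that coalescing model, assuming fixed receive buffers of 256 + sizeof(grh) bytes (the GRH is 40 bytes), 1-based segment numbers, and 220 data bytes per segment. The names are hypothetical, and the sketch copies the full data area of every segment; trimming the last segment to its payload length is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRH_SIZE       40                      /* global route header */
#define MAD_SIZE       256
#define RECV_BUF_SIZE  (GRH_SIZE + MAD_SIZE)   /* size of each posted buffer */
#define SEG_HDR_SIZE   (24 + 12)               /* MAD header + RMPP header */
#define SEG_DATA_SIZE  (MAD_SIZE - SEG_HDR_SIZE)

/*
 * Copy the data area of one received segment into its slot in the
 * reassembly buffer. After this returns 0 the receive buffer can be
 * reposted immediately. Returns -1 if the segment does not fit.
 */
static int coalesce_segment(uint8_t *dst, size_t dst_len,
                            const uint8_t *recv_buf, uint32_t seg_num)
{
    size_t off;

    if (seg_num == 0)
        return -1;

    off = (size_t)(seg_num - 1) * SEG_DATA_SIZE;
    if (off > dst_len || dst_len - off < SEG_DATA_SIZE)
        return -1;

    memcpy(dst + off, recv_buf + GRH_SIZE + SEG_HDR_SIZE, SEG_DATA_SIZE);
    return 0;
}
```

The copy is the price of this model; what it buys is that the pool of posted receive buffers is not tied up for the lifetime of a multi-segment transfer.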

> > What about subsequent entries in the s-g list for send ? Are they also
> > constrained to be 256 bytes or something else ? I would presume
> > RMPP would rewrite the RMPP header based on the first header and
> > update the appropriate fields.
> 
> We can be as flexible or restrictive as we want to be, I think.  
> My request (based on feedback) is that we try to minimize the need 
> to perform any data copies.

OK assuming this model is being used. 

> > Is timeout_ms used for Ttime when it is a RMPP send ?
> 
> timeout_ms applies to sends, whereas Ttime applies to the receiver.

Right.

> We could use the default of 40 seconds as mentioned in the spec, but this seems high to me. 

Yes, that calculation is based on a set of assumptions which are documented in the spec. 
While it is easier to use a hard-coded value than a dynamically calculated one, 
it also leads to longer timeouts when an RMPP packet is dropped somewhere.

> For a received response, timeout_ms should work fine.
> 
> > I am still wondering about the RMPP direction switch (IsDS) and whether
> > this needs to be exposed somehow.
> 
> I don't think that it does.  

What is used to indicate send only vs. send with an (RMPP) response expected ? 

> I don't think anything like this was needed 
> in the sourceforge stack, and the proposed GSI implementation uses the 
> sourceforge RMPP code.  I think that we just need to know if a send requires segmentation, 
> or if a receive requires reassembly.

Did the SF RMPP code use SA GetMulti, which is where this is used ?

-- Hal