[openib-general] Re: ib_mad.c comments

Hal Rosenstock halr at voltaire.com
Fri Sep 10 17:12:16 PDT 2004


On Fri, 2004-09-10 at 19:00, Sean Hefty wrote:
> > > 	- Should we reference qp0 and qp1 with the registration?
> > Either way (struct ib_qp * (for qp0 or qp1) or enum ib_qp_type) is fine
> > with me. It seems to me that the enum approach is easier for the client.
> > Does the client need a pointer to the QP for something else?
> 
> I used struct ib_qp* inside the ib_mad_agent for QP redirection purposes,
> and it lets clients query the QP for its attributes, such as the number
> of supported SGEs.

But the client doesn't get the mad_agent pointer until it registers, so
it can't supply a QP pointer at registration time anyway.
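
In other words, the enum works for registration and the QP pointer comes
back inside the agent. Roughly like this (a sketch only; the names and
signatures here are illustrative, not a settled API):

	enum ib_qp_type {
		IB_QPT_SMI,	/* QP 0 */
		IB_QPT_GSI	/* QP 1 */
	};

	struct ib_mad_agent {
		struct ib_qp *qp;	/* filled in by registration; usable
					 * for QP redirection and for querying
					 * QP attributes such as the number of
					 * supported SGEs */
		/* ... */
	};

	/* the client passes the enum; the agent carries the QP pointer */
	struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
						   u8 port_num,
						   enum ib_qp_type qp_type
						   /* , ... */);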

> > > 	- The print level should be lowered from error.
> > There are 2 prints. Not sure which should be lowered. Both, or only
> > the one after ib_post_send?
> 
> one after ib_post_send - see below

Changed to NOTICE, as this is a normal but significant condition.
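
i.e. something along these lines (illustrative message text; the same
change applies to the ib_post_recv print discussed further down):

	if (ret)
		printk(KERN_NOTICE
		       "ib_mad: ib_post_send failed (ret %d)\n", ret);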

> > > 	- Has an extra '{'.
> > Where's the extra '{'?
> 
> My bad - the "return j" and "}" are swapped.

No, my bad... Thanks for catching this. Patch sent.

> > On the send side, it should just be the head entry. I will change that.
> > Right now on the receive side, there is one receive list, and the
> 
> I think we need separate receive lists for QP 0 and 1.  

I was thinking along these lines (separate receive lists for QPs 0 and 1)
but wasn't quite there. I will be doing this shortly. It also seems to
imply a list for every redirected QP too :-( That is a future item.

> Or the wr_id should just point to the corresponding receive information.

This would be perfect, except that I think lists (or some other
structures) are still needed in case the posted receives need to be
returned, e.g. when a QP is torn down.
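
Roughly, the wr_id carries the pointer while the per-QP list keeps the
posted receives reclaimable. A sketch with hypothetical structure names,
not the actual ib_mad.c code:

	struct ib_mad_recv_buf {		/* hypothetical name */
		struct list_head list;		/* links into the per-QP list */
		struct ib_mad	*mad;		/* the receive buffer itself */
		u64		 mapping;	/* DMA address */
	};

	struct ib_mad_qp_info {			/* hypothetical name */
		struct ib_qp	 *qp;		/* QP 0, QP 1, or redirected */
		struct list_head  recv_list;	/* all posted receives */
		spinlock_t	  lock;
	};

	/* in the post routine (sketch): wr_id points straight at the
	 * receive information ... */
	recv_wr.wr_id = (unsigned long) buf;
	/* ... but the list is still needed so outstanding receives can
	 * be returned, e.g. when the QP is torn down. */
	list_add_tail(&buf->list, &qp_info->recv_list);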

> > > 	- Print level should be lowered from error
> > There are 2 prints. Not sure which should be lowered. Both, or only
> > the one after ib_post_recv?
> 
> one after ib_post_recv - see below

Changed to NOTICE, as this is a normal but significant condition.
 
> > > 	- We can track the number of posted receives to avoid posting overruns.
> > This seems like an optimization. I will put it on my TODO list for now.
> 
> It has an effect on the implementation.  If we call ib_post_send/recv until it fails, 
> then we need to treat those failures as expected, and not true errors.

E.g., to implement a deferred send when room becomes available.
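
Tracking the counts cuts both ways. On the receive side, for instance,
it lets the pool be refilled without driving ib_post_recv to failure
(sketch; the fields and helper are hypothetical):

	/* in the refill path (sketch): */
	while (qp_info->recv_posted < qp_info->recv_queue_depth) {
		if (post_one_recv(qp_info))	/* hypothetical helper
						 * wrapping ib_post_recv() */
			break;			/* expected when low on
						 * memory; retry later,
						 * not a true error */
		qp_info->recv_posted++;
	}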

> We spoke about this some yesterday, but for others on the list, I think that the current 
> implementation of ib_post_send needs to be moved down and renamed.  A call to ib_post_send 
> could then call that routine and take whatever action is appropriate to handle an overrun case, 
> such as queuing the request, ignoring the overrun, etc.

Do you mean ib_mad_post_send rather than ib_post_send?
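
Assuming that's the intent, the shape would be something like this
(hypothetical names: the existing posting logic moves into a private
helper, and the public entry point picks the overrun policy):

	static int __ib_mad_post_send(struct ib_mad_agent *agent,
				      struct ib_send_wr *send_wr)
	{
		struct ib_send_wr *bad_wr;

		return ib_post_send(agent->qp, send_wr, &bad_wr);
	}

	int ib_mad_post_send(struct ib_mad_agent *agent,
			     struct ib_send_wr *send_wr)
	{
		if (send_queue_full(agent)) {		/* hypothetical check */
			/* hypothetical: reposted on a send completion */
			queue_deferred_send(agent, send_wr);
			return 0;
		}
		return __ib_mad_post_send(agent, send_wr);
	}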

> > > 	- We could change to using 1 PD/device, versus 1 PD/port.
> > Is there any advantage/disadvantage one way or the other?
> 
> Just a small optimization.  I'd ignore for now.

I'll put it in the revisit section of the TODO list.
 
> > > 	- Not sure if we need to support max_sge on the send queue.  This may
> > > 	  be substantially larger than what we need.  At a minimum, I think
> > > 	  that we need 2 for optimal RMPP support.  I'm not sure where the
> > > 	  trade-off between SGEs and copying into a single buffer lies.
> > I'm not following where the minimum of 2 for optimal RMPP support comes
> > from. 
> 
> The first SGE would reference the MAD/RMPP header.  The second SGE would reference the MAD data.  
> We've seen out-of-memory issues in our testing due to the number of SGEs allocated per QP, 
> so limiting the QP size is probably worthwhile.  As long as a user can get the max SGEs 
> supported by the QPs, we should be okay.

I set the QP send max SGE to 2. Patch sent for this.
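
For reference, that amounts to capping the send queue at two SGEs when
creating the QP, one for the MAD/RMPP header and one for the data
(sketch; the recv-side value is my assumption, not part of the patch):

	struct ib_qp_init_attr qp_init_attr;

	memset(&qp_init_attr, 0, sizeof qp_init_attr);
	/* 2 SGEs: one for the MAD/RMPP header, one for the payload */
	qp_init_attr.cap.max_send_sge = 2;
	qp_init_attr.cap.max_recv_sge = 1;	/* assumption: receives land
						 * in a single contiguous
						 * buffer */
	/* ... remaining attributes, then ib_create_qp() ... */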

Thanks.

-- Hal



