[openib-general] Re: ib_mad.c comments

Sean Hefty mshefty at ichips.intel.com
Fri Sep 10 16:00:19 PDT 2004


On Fri, 10 Sep 2004 17:43:36 -0400
Hal Rosenstock <halr at voltaire.com> wrote:

> > 	- Need to support registrations for "all" methods of a given class.
> > 	  (We may want the initial implementation to only do this for now,
> > 	  to shorten the development time.)
> This can be done today depending on the definition of "all". All is the
> entire method bit mask. If "all" means all the ones valid for a class, 
> that is a specific bit mask per class. Is this proposing a shortcut way
> to do this ? If so, is this much of a savings for the client ?

>From the client's perspective, I don't think it's a big deal.  I was thinking more about the implementation and avoiding checking on the method if someone wanted all MADs for a given class.  This may not be much of a savings - not sure how fast the method checks would be.
 
> Do you think it adds that much more complexity to have multiple clients
> for methods of the same class ? Wouldn't this be a problem to run SM and
> SMA concurrently if just all methods for a class was supported initially
> ?

Since you already have an implementation that operates on the methods, I wouldn't change it.

> > 	- Should we reference qp0 and qp1 with the registration?
> Either way (struct ib_qp * (for qp0 or qp1) or enum ib_qp_type) is fine
> with me. It seems to me that the enum approach is easier for the client.
> Does the client need a pointer to the QP for something else ?

I used struct ib_qp* inside the ib_mad_agent for QP redirection purposes, and allows them to query the QP for its attributes, such as the number of supported SGEs.

> > 	- The print level should be lowered from error.
> There are 2 prints. Not sure which ones should be lowered. Both or only
> the one after ib_post_send ?

one after ib_post_send - see below
 
> > 	- Has an extra '{'.
> Where's the extra { ?

My bad - the "return j" and "}" are swapped.

> On the send side, it should just be the head entry. I will change that.
> Right now on the receive side, there is one receive list, and the

I think we need separate receive lists for QP 0 and 1.  Or the wr_id should just point to the corresponding receive information.

> > 	- Print level should be lowered from error
> There are 2 prints. Not sure which ones should be lowered. Both or only
> the one after ib_post_recv ?

one after ib_post_recv - see below
 
> > 	- We can track the number of posted receives to avoid posting overruns.
> This seems like an optimization. I will put it on my TODO list for now.

It has an effect on the implementation.  If we call ib_post_send/recv until it fails, then we need to treat those failures as expected, and not true errors.

We spoke about this some yesterday, but for others on the list, I think that the current implementation of ib_post_send needs to be moved down and renamed.  A call to ib_post_send could then call that routine and take whatever action is appropriate to handle an overrun case, such as queuing the request, ignoring the overrun, etc.

> > 	- We could change to using 1 PD/device, versus 1 PD/port.
> Is there any advantage/disadvantage one way or the other ?

Just a small optimization.  I'd ignore for now.
 
> > 	- Not sure if we need to support max_sge on the send queue.  This may
> > 	  be substantially larger than what we need.  At a minimum, I think
> > 	  that we need 2 for optimal RMPP support.  I'm not sure where the
> > 	  trade-off is between SGE versus copying into a single buffer lies.
> I'm not following where the minimum of 2 for optimal RMPP support comes
> from. 

The first SGE would reference the MAD/RMPP header.  The second SGE would reference the MAD data.  We've seen out of memory issues in our testing due to the number of SGEs allocated per QP, so limiting the QP size is probably worthwhile.  As long as a user can get the max SGEs supported by the QPs, we should be okay.



More information about the general mailing list