[ofa-general] Re: Incorrect max_sge reported in mthca device query

Sun Apr 1 23:08:16 PDT 2007

> On Sun, 2007-04-01 at 09:43 +0300, Michael S. Tsirkin wrote:
> > > Quoting Tom Tucker <tom at opengridcomputing.com>:
> > > Subject: Incorrect max_sge reported in mthca device query
> > > 
> > > 
> > > Roland:
> > > 
> > > I think the max_sge reported by mthca_query_device is off by one. If you
> > > try to create a QP with the reported max, it fails with -EINVAL. I think
> > > the reason is that the mthca_alloc_wqe_buf function reserves a slot for
> > > a "bind request" and this pushes the WQE size over the 496B limit when
> > > the user requests the max (30) when allocating the QP.
> > > 
> > > Please let me know if I'm confused about what max_sge really means.
> > > 
> > > Thanks,
> > > Tom
> > 
> > Tom,
> > 	max_sge reported by mthca_query_device is the upper bound
> > 	for all QP types. I have not tested this, but I think you can
> > 	create a UD-type QP with this number of SGEs.
> > 
> > 	I'd like to add that there can be no hard guarantee that
> > 	creating a QP with a specific set of max_sge/max_wr always
> > 	succeeds even if it is within the range of values reported
> > 	by mthca_query_device: for example, for userspace QPs, the
> > 	system administrator might have limited the amount of
> > 	memory that can be locked by these QPs, in which case
> > 	QP allocation requests with large max_sge/max_wr
> > 	values will always fail. There are other examples of this.
> > 	Thus, an application that wants to use as large a number of SGEs/WRs as
> > 	possible in a robust fashion currently has no choice other than
> > 	a trial-and-error approach, handling failures gracefully.
> > 
> > 	Finally, as a side note, it is *also* inefficient to request
> > 	allocation of more SGE entries than the ULP will typically
> > 	use, for reasons such as cache utilization, among many others.
> > 	How this overhead trades off against the ULP's need to sometimes
> > 	post multiple WRs will depend on both the ULP and the hardware
> > 	used. This need to tune the ULP to a specific HCA is annoying,
> > 	and might be something that we want to try and solve at
> > 	the API level. However, max_sge/max_wr values in query device
> > 	are unlikely to be the appropriate API for this.
> > 
> > 	One way out could be to extend the API for create_qp and friends,
> > 	passing in both min and max values for some parameters,
> > 	and allowing the verbs provider to choose the optimal combination
> > 	of these. I think I floated a similar proposal once already, but there
> > 	didn't appear to be sufficient user support for such a large API
> > 	extension.
> > 
> Quoting Tom Tucker <tom at opengridcomputing.com>:
> Subject: Re: Incorrect max_sge reported in mthca device query
> 
> Michael:
> 
> Thanks for the detailed reply.
> 
> How about if we added an interface that would treat the SGE counts/WR
> counts as "requests" and then update the qp_init_attr struct with what
> was actually created? That would allow the app to request the max, but
> "settle" for what the device was capable of at the time. 
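
For illustration only, the "request, then settle" semantics proposed above could look roughly like the C sketch below. Everything in it is an assumption made for the example: the struct names, the function name create_qp_settle, and the idea that the provider simply clamps each requested value. It is not an existing verbs or mthca interface.

```c
#include <errno.h>

/* Hypothetical capability block, loosely modelled on the cap fields
 * of a qp_init_attr struct. Not a real verbs type. */
struct qp_cap_sketch {
	unsigned int max_send_wr;
	unsigned int max_send_sge;
};

/* Hypothetical description of what a provider can grant right now. */
struct provider_limits_sketch {
	unsigned int max_wr;
	unsigned int max_sge;
};

/*
 * Treat the values in *cap as requests: clamp each one to what the
 * provider can currently offer and write the granted values back
 * into *cap. Returns 0 on success, or -EINVAL if the request is
 * empty or the provider cannot grant anything at all.
 */
static int create_qp_settle(const struct provider_limits_sketch *lim,
			    struct qp_cap_sketch *cap)
{
	if (lim->max_wr == 0 || lim->max_sge == 0)
		return -EINVAL;
	if (cap->max_send_wr == 0 || cap->max_send_sge == 0)
		return -EINVAL;

	if (cap->max_send_wr > lim->max_wr)
		cap->max_send_wr = lim->max_wr;
	if (cap->max_send_sge > lim->max_sge)
		cap->max_send_sge = lim->max_sge;

	return 0;
}
```

With this shape, a caller that asks for 30 SGEs on a provider that can only grant 29 gets a successful return and reads 29 back from cap->max_send_sge, instead of seeing -EINVAL.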

I think that if we extend the API, we need to design it carefully
to cover as many use cases as possible.
Tom, could you explain what you are trying to do?
Why does your application need as many SGEs as possible?

Also - what about the out-of-resources cases described above?
Would you expect the verbs API to retry the request for you?
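
To make the retry question concrete, here is a minimal sketch of the trial-and-error loop an application (or, hypothetically, the verbs layer) would have to run. The callback type try_create_fn, the helper create_with_backoff and the stand-in fake_create are all invented for this example; a real caller would wrap its actual QP creation call in the callback. The numbers 30 and 29 mentioned below only mirror the off-by-one suspected in this thread, not a verified limit.

```c
#include <errno.h>

/* Hypothetical creation callback: returns 0 on success or a
 * negative errno, as a wrapper around a real create call might. */
typedef int (*try_create_fn)(unsigned int max_sge, void *ctx);

/*
 * Start at the advertised maximum and step down by one until
 * creation succeeds or min_sge has been tried. Returns the SGE
 * count that worked, or a negative errno if every attempt failed.
 */
static int create_with_backoff(try_create_fn try_create, void *ctx,
			       unsigned int advertised_max,
			       unsigned int min_sge)
{
	int err = -EINVAL;
	unsigned int sge;

	if (min_sge == 0 || min_sge > advertised_max)
		return -EINVAL;

	for (sge = advertised_max; sge >= min_sge; --sge) {
		err = try_create(sge, ctx);
		if (err == 0)
			return (int)sge;
	}
	return err;
}

/* Stand-in for a device that advertises more than it can deliver;
 * ctx points at the limit the device really honours. */
static int fake_create(unsigned int max_sge, void *ctx)
{
	unsigned int real_limit = *(unsigned int *)ctx;

	return max_sge <= real_limit ? 0 : -EINVAL;
}
```

Calling create_with_backoff(fake_create, &limit, 30, 1) with limit set to 29 fails once at 30 and then succeeds at 29. Note that the sketch retries on any error; whether that is sensible for the out-of-resources cases is exactly the open question.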

-- 
MST