[openib-general] Plans for libibverbs 1.0, 1.1 and beyond

Caitlin Bestler caitlinb at broadcom.com
Wed Feb 22 15:12:01 PST 2006


openib-general-bounces at openib.org wrote:
>     Gil> Roland, I believe we should add support for a resize WQ
>     Gil> command (as a part of modify QP) to enable changing the WQ
>     Gil> size.  On a very large scale cluster, with many operating
>     Gil> QPs, the work queue memory consumption might be
>     Gil> expensive. Thus the MPI implementation should tradeoff for
>     Gil> pipelining requests vs. WQ memory consumption. The resize WQ
>     Gil> will allow on-demand adaptive WQ setting instead of static
>     Gil> allocation of the memory resource, which I believe can
>     Gil> increase performance and save memory at the same time.
> 
> Does Mellanox HW support resizing a WQ after a QP is created?
>  If so would you be willing to contribute an implementation?
> 

This is an API question, not an implementation question. We can
reasonably anticipate that a) some devices could actually implement
a work queue resize and in doing so free on-chip resources, and
b) for other devices the only resources that would be freed
by changing work queue sizes would be on the host (and the
synchronization required would typically not justify the
benefit of releasing space for work requests alone).

The resources in question (the work queues) are inherently
device-specific. An implementation that resizes Brand X
work queues is of no real benefit to any other device.

We need to distinguish between the two rationales for changing
work queue sizes: reducing resource usage, and seeking hardware
assist in enforcing ULP constraints.

Freeing resources for real is trickier, and could easily be
something best done at an opportunistic time (such as the
next time that the work queue wraps around to its base).
By contrast, any resizing that is supposed to result in a
throttling effect should take effect (at least logically)
immediately.
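
To make the distinction concrete, here is a minimal host-side sketch
in C of a ring that applies the new limit immediately but only gives
up slots once the producer, and then the consumer, have wrapped to
the base. Everything in it (struct wq_model and the wq_* functions)
is hypothetical and invented for illustration; it is not libibverbs
code and does not describe any particular device. It only handles
shrinking, and it refuses a second resize while one is in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one work queue; not a libibverbs structure. */
struct wq_model {
    size_t prod_size;   /* ring size the producer wraps at */
    size_t cons_size;   /* ring size the consumer wraps at */
    size_t limit;       /* logical cap on outstanding work requests */
    size_t pending;     /* shrink target awaiting producer wrap, 0 = none */
    size_t head, tail;  /* next slot to post into / to complete */
    size_t outstanding; /* posted but not yet completed */
};

void wq_init(struct wq_model *wq, size_t slots)
{
    assert(slots > 0);
    wq->prod_size = wq->cons_size = wq->limit = slots;
    wq->pending = 0;
    wq->head = wq->tail = wq->outstanding = 0;
}

/* Shrink request: throttling is immediate, the physical change is deferred. */
bool wq_resize(struct wq_model *wq, size_t new_size)
{
    if (new_size == 0 || new_size > wq->prod_size)
        return false;               /* growing is out of scope here */
    if (wq->pending != 0 || wq->prod_size != wq->cons_size)
        return false;               /* an earlier shrink is still in flight */
    wq->limit = new_size;           /* takes effect logically right away */
    if (new_size < wq->prod_size)
        wq->pending = new_size;     /* applied when the producer next wraps */
    return true;
}

/* Returns the slot used, or -1 if the logical limit throttles the post. */
long wq_post(struct wq_model *wq)
{
    if (wq->outstanding >= wq->limit)
        return -1;
    long slot = (long)wq->head;
    wq->outstanding++;
    if (++wq->head == wq->prod_size) {
        wq->head = 0;               /* wrapped to base: the opportunistic moment */
        if (wq->pending != 0) {
            wq->prod_size = wq->pending;
            wq->pending = 0;
        }
    }
    return slot;
}

/* Completes the oldest outstanding request; false if there is none. */
bool wq_complete(struct wq_model *wq)
{
    if (wq->outstanding == 0)
        return false;
    wq->outstanding--;
    if (++wq->tail == wq->cons_size) {
        wq->tail = 0;
        wq->cons_size = wq->prod_size;  /* consumer enters the smaller ring */
    }
    return true;
}

/* Slots that must stay allocated; drops only after the consumer has wrapped. */
size_t wq_physical_slots(const struct wq_model *wq)
{
    return wq->cons_size;
}
```

With an 8-slot ring holding 6 outstanding requests, wq_resize(&wq, 4)
makes the very next wq_post() fail, while wq_physical_slots() keeps
reporting 8 until both indices have passed the base.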

So the real questions are, first, whether there are devices that
need resize support to truly allow adjustments to on-chip resources,
and second, whether applications should be given the expectation
that resizing will a) release on-chip resources such that they
can be used for another QP and b) actually throttle
applications if the new limits are exceeded.
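
One way to picture what "giving applications an expectation" could
mean at the API level is a set of capability bits that a provider
reports and an application checks before bothering to resize. The
sketch below is purely illustrative: the enum names and the
wq_resize_meets_goal() helper are assumptions made up for this
example, and nothing like them exists in libibverbs.

```c
#include <stdbool.h>

/* Hypothetical capability bits a provider might report for resize. */
enum wq_resize_caps {
    WQ_RESIZE_FREES_HOST_MEM   = 1 << 0, /* only host memory is released */
    WQ_RESIZE_FREES_DEVICE_RES = 1 << 1, /* on-chip resources become reusable */
    WQ_RESIZE_ENFORCES_LIMIT   = 1 << 2, /* posts beyond the new size fail */
};

/* The two rationales for resizing discussed above. */
enum wq_resize_goal {
    WQ_GOAL_SAVE_RESOURCES,
    WQ_GOAL_THROTTLE,
};

/* Does a resize on this (hypothetical) device serve the caller's goal? */
bool wq_resize_meets_goal(unsigned caps, enum wq_resize_goal goal)
{
    switch (goal) {
    case WQ_GOAL_SAVE_RESOURCES:
        /* Host-only savings are treated as not worth the synchronization. */
        return (caps & WQ_RESIZE_FREES_DEVICE_RES) != 0;
    case WQ_GOAL_THROTTLE:
        return (caps & WQ_RESIZE_ENFORCES_LIMIT) != 0;
    }
    return false;
}
```

Under this sketch a device that reports only WQ_RESIZE_FREES_HOST_MEM
would satisfy neither goal, which matches case b) above.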



