[openib-general] [PATCHv2] IPoIB CM Experimental support
Michael S. Tsirkin
mst at mellanox.co.il
Mon Dec 18 06:46:23 PST 2006
> I agree that adapters that don't have SRQ can consume larger amounts of memory than those with SRQ; however, that is not a good reason to prevent usage of RC or UC on those adapters. The memory consumption problem with any protocol not using SRQ and
> running over RC or UC is well documented.
But not solved.
> At the OpenFabrics meeting in Tampa one of several themes was that we need better IP performance to move into commercial customers and also help our current primarily HPC customers, some of which are not configurations with large numbers
> of endpoints. Even though other ULPs are available, good IP is still the opportunity to get more customers on IB.
That's why you need zero configuration setup that works well on anything
from back-to-back to 1000s of nodes. And this means code that's scalable by
design.
> Not all IB customers we have are deployments with a large number of endpoints, so having
> non-SRQ adapters use IPoIB-CM is still important to expanding the customer base
> for IB. You have to let the customer decide how they want to tune their system
> based on the available functions/features.
This just sounds too ugly. I do not *want* to special-case small clusters
precisely because this way big iron flows get no testing.
And people should not "tune" their systems just to
have them basically not run out of memory and crash.
> If not, you don't have equality in
> potential performance across all HCAs.
???
It's not *practical* to require equivalent performance on all HCAs.
I just try to do the best I can, and I don't think each trade-off
needs to be turned into a configuration option.
> Some guidance on memory consumption
> would be good, to guide users whether they want to run IPoIB-CM without SRQ just
> like IPoIB-CM will be selectable.
I still think falling back to UD mode is the right solution if the HCA does not support
SRQ. I just don't see an "ignore scalability issues" option in IPoIB as being
anything but a support nightmare, or as having any right to existence outside a
lab.
But - let's see this code land upstream, then code up a patch that is not ugly,
and post it. But IMO time might be better spent adding SRQ support in ehca.
--
MST