[openib-general] Local SA caching - why we need it

Or Gerlitz or.gerlitz at gmail.com
Wed Nov 29 12:05:59 PST 2006


On 11/28/06, Woodruff, Robert J <robert.j.woodruff at intel.com> wrote:
> I suppose we should start a new thread for this discussion.
> I have added Arlin and Sean, who have more details about the problems
> that we have seen on a 256 node cluster with connection scaling
> with OFED 1.1 and how the local sa cache helps solve the problem.
> There was already one thread on this issue on the list, but I
> suppose we should have the discussion again.

> I will let Sean and Arlin provide the details of why an SA cache is
> needed to allow connection establishment to scale to very large
> clusters, since the SA can only handle a limited number of queries
> per second and quickly becomes the bottleneck when trying to
> establish all-to-all communications for MPI or other applications
> that need all-to-all communications. Intel MPI already sees this
> problem on a 256-node cluster.
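
(For a rough sense of the scale being discussed, assuming an eager
all-to-all at job start and one path record lookup per RDMA CM
connection: a 256-node job means on the order of 256 x 255 = 65,280
connection setups, i.e. tens of thousands of SA queries arriving in a
short burst.)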

As I mentioned to you at the devcon, I think the discussion is not yet
at a stage where Arlin has to step in and explain how uDAPL works, nor
where Sean has to step in and state how the RDMA CM is implemented
with or without the local cache.

What needs to be done now is to share and review the ***Intel MPI***
connection establishment design. The developers have to come and say
***how*** they execute this all-to-all connection establishment at job
start, and why they think this is the optimal way to go (first, why
not an on-demand connection establishment model; second, why the
establishment pattern they use is optimal). The same goes for the Open
MPI developers (Galen, Jeff), who also mentioned at the devcon that
they are considering using the RDMA CM for connection establishment
and think there are issues with SA/CM scalability.
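
As background for that discussion, here is a minimal sketch of what
one RDMA CM connection setup involves on the client side (this is my
assumption of the usual librdmacm sequence, not the actual Intel MPI
or Open MPI code). The rdma_resolve_route() step is the one that
results in a path record query to the SA, which is why an eager
all-to-all at job start multiplies the SA load:

/* Sketch only: address/route resolution for a single peer with
 * librdmacm. QP creation (rdma_create_qp) and the CM handshake
 * (rdma_connect / RDMA_CM_EVENT_ESTABLISHED) would follow and are
 * omitted here.
 */
#include <rdma/rdma_cma.h>

static int wait_for_event(struct rdma_event_channel *ch,
                          enum rdma_cm_event_type expected)
{
        struct rdma_cm_event *event;
        int ok;

        if (rdma_get_cm_event(ch, &event))
                return -1;
        ok = (event->event == expected);
        rdma_ack_cm_event(event);
        return ok ? 0 : -1;
}

/* Resolve one peer; called once per remote rank in an all-to-all setup. */
int resolve_peer(struct rdma_event_channel *ch, struct sockaddr *peer_addr,
                 struct rdma_cm_id **id_out)
{
        struct rdma_cm_id *id;

        if (rdma_create_id(ch, &id, NULL, RDMA_PS_TCP))
                return -1;

        /* Map the peer address to a local RDMA device/port; no SA
         * involvement at this step. */
        if (rdma_resolve_addr(id, NULL, peer_addr, 2000) ||
            wait_for_event(ch, RDMA_CM_EVENT_ADDR_RESOLVED))
                goto err;

        /* This is the step that needs a path record: without a local
         * SA cache, each call here turns into a query sent to the
         * fabric SA. */
        if (rdma_resolve_route(id, 2000) ||
            wait_for_event(ch, RDMA_CM_EVENT_ROUTE_RESOLVED))
                goto err;

        *id_out = id;
        return 0;
err:
        rdma_destroy_id(id);
        return -1;
}

Whether each rank really needs to run this against every other rank at
job start, or only on demand, is exactly the design question above.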

Once these two points, and others that might pop up during the
discussion, are covered, we can define the problem (requirements) and
look for solutions. These solutions might have the local SA cache as
one building block, and they might not.

Or.
