[openib-general] SA cache design

Mon Jan 16 12:27:45 PST 2006

Eitan Zahavi wrote:
> [EZ] The scalability issues we see today are what I most worry about.

I think that we have a couple of scalability issues at the core of this problem.  
A cache can solve part of the problem, but to fully address the issues, we may 
eventually need to extend our APIs and underlying protocols.

One issue that I see is that the CMA, IB CM, and DAPL APIs support only 
point-to-point connections.  Trying to layer a many-to-many connection model 
over these leads to inefficiencies.  For example, the CMA generates one SA query 
per connection.  Another issue is that even if the number of queries were 
reduced, the fabric would still see O(n^2) connection messages.
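
To put rough numbers on the O(n^2) growth, here is a back-of-the-envelope 
sketch.  It assumes a full mesh with one connection per pair of endpoints, one 
SA query per connection (as the CMA does today), and three CM messages per 
connection (REQ, REP, RTU).  The function name and the three-message default 
are my own illustration, not anything taken from the code.

```python
def full_mesh_cost(n, msgs_per_connection=3):
    """Estimate the setup cost of an all-to-all job with n endpoints.

    Assumptions (illustrative only): one connection per unordered pair of
    endpoints, one SA path record query per connection, and a fixed
    number of CM messages per connection (REQ/REP/RTU = 3).
    """
    connections = n * (n - 1) // 2  # number of unordered endpoint pairs
    return {
        "connections": connections,
        "sa_queries": connections,
        "cm_messages": connections * msgs_per_connection,
    }


if __name__ == "__main__":
    for n in (16, 128, 1024):
        print(n, full_mesh_cost(n))
```

A cache can only shrink the sa_queries term.  The cm_messages term stays 
quadratic no matter how the path records are obtained.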

Based on the code, the only SA query of interest to most users will be a path 
record query by gids/pkey.  To speed up applications written to the current CMA, 
DAPL, and Intel's MPI (hey, I gotta eat), my actual implementation has a very 
limited path record cache in the kernel.  The cache uses an index with O(1) 
insertion, removal, and retrieval.  (I plan on re-using the index to help 
improve the performance of the IB CM as well.)
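
To make the idea concrete, here is a minimal user-space sketch of a path record 
cache keyed by gids/pkey.  It is not the kernel implementation: it assumes a 
plain hash table (a Python dict) as the index, which gives O(1) insertion, 
removal, and retrieval on average, and all of the names (PathKey, 
PathRecordCache, the record fields) are hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class PathKey(NamedTuple):
    """Hypothetical cache key: source GID, destination GID, and P_Key."""
    sgid: bytes
    dgid: bytes
    pkey: int


class PathRecordCache:
    """Toy path record cache indexed by (sgid, dgid, pkey).

    The index is a dict, so insert, remove, and lookup are O(1) on
    average.  This is an assumption for illustration; the real index
    may be built differently.
    """

    def __init__(self) -> None:
        self._index: Dict[PathKey, dict] = {}

    def insert(self, key: PathKey, record: dict) -> None:
        # Overwrites any existing record stored under the same key.
        self._index[key] = record

    def remove(self, key: PathKey) -> Optional[dict]:
        # Returns the removed record, or None if the key was not cached.
        return self._index.pop(key, None)

    def lookup(self, key: PathKey) -> Optional[dict]:
        # A miss returns None; the caller would fall back to a real SA query.
        return self._index.get(key)


if __name__ == "__main__":
    cache = PathRecordCache()
    key = PathKey(sgid=b"\x01" * 16, dgid=b"\x02" * 16, pkey=0xFFFF)
    cache.insert(key, {"dlid": 5, "sl": 0})  # made-up record fields
    print(cache.lookup(key))
```

A real cache would also need to invalidate or age out entries when the fabric 
changes.  The sketch leaves that out.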

I'm still working on ideas to address the many-to-many connection model.  One 
idea is to have a centralized connection manager to coordinate the connections 
between the various endpoints.  The drawback is that this requires defining a 
proprietary protocol.  Any implementation work in this area will be deferred for 
now though.

- Sean