[openib-general] SA cache design

Eitan Zahavi eitan at mellanox.co.il
Fri Jan 6 00:53:10 PST 2006


Hi Sean,

Please see below.

Sean Hefty wrote:
>>* Regarding the sentence: "Clients would send their queries to the sa_cache
>>  instead of the SA"
>>  I would propose that a "SA MAD send switch" be implemented in the core: Such
>>  a switch will enable plugging in the SA cache (I would prefer calling it SA
>>  local agent due to its extended functionality). Once plugged in, this "SA
>>  local agent" should be forwarded all outgoing SA queries. Once it handles
>>  the MAD it should be able to inject the response through the core "SA MAD
>>  send switch" as if they arrived from the wire.
> 
> 
> This was my thought as well.  I hesitated to refer to the cache as a local
> agent, since that's an implementation detail.  I want to allow the possibility
> for the cache to reside on another system.  For the initial implementation, the
> cache would be local however.
So if the cache is on another host, a new kind of MAD will have to be sent on behalf of
the original request?
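
The "SA MAD send switch" idea above can be sketched roughly as follows. This is only an illustration in Python, not the proposed kernel code: the names SaMadSwitch, plug, send, send_on_wire and deliver are all hypothetical, and a query is modelled as a plain dict. A plugged-in agent sees every outgoing SA query; if it can answer, the response is delivered through the same path as a response from the wire, otherwise the query goes to the SA as usual.

```python
class SaMadSwitch:
    """Hypothetical model of the core "SA MAD send switch"."""

    def __init__(self, send_on_wire, deliver):
        self._send_on_wire = send_on_wire  # sends a query to the real SA
        self._deliver = deliver            # hands a response to the client
        self._agent = None                 # optional plugged-in local agent

    def plug(self, agent):
        """Plug in an SA local agent: a callable(query) -> response or None."""
        self._agent = agent

    def unplug(self):
        self._agent = None

    def send(self, query):
        """Forward an outgoing SA query to the agent first, if one is plugged in."""
        if self._agent is not None:
            response = self._agent(query)
            if response is not None:
                # Inject the response as if it had arrived from the wire.
                self._deliver(response)
                return
        # No agent, or the agent could not answer: go to the SA.
        self._send_on_wire(query)
```

With no agent plugged in, the switch is a pass-through, so clients do not need to know whether a cache is present.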
> 
> 
>>Functional requirements:
>>* It is clear that the first SA query to cache is PathRecord.
> 
> 
> This will be the first cached query in the initial check-in.
> 
> 
>>  So if a new client wants to connect to another node a new PathRecord
>>  query will not need to be sent to the SA. However, recent work on QoS has
>>  pointed out that under some QoS schemes PathRecord should not be shared by
>>  different clients
> 
> 
> I'm not sure that QoS handling is the responsibility of the cache.  The module
> requesting the path records should probably deal with this.
In IB, QoS properties are mainly the PathRecord parameters: SL, Rate, MTU, PathBits (LMC bits).
So if traditionally we had one PathRecord requested for each Src->Dst port pair, now we will need to track at least
#Src->Dst * #QoS-levels (a non-optimal implementation will require even more: #Src->Dst * #Clients * #Servers * #Services).
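
To illustrate the point about tracking one record per Src->Dst pair and QoS level, here is a minimal Python sketch. It assumes the QoS level can be reduced to a single integer that is part of the cache key; PathKey, PathRecordCache and query_sa are hypothetical names, and the record contents are whatever the SA returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathKey:
    """Hypothetical cache key: one entry per Src->Dst pair and QoS level."""
    src_port: str
    dst_port: str
    qos_level: int  # assumption: QoS class reduced to a small integer


class PathRecordCache:
    def __init__(self, query_sa):
        self._query_sa = query_sa  # callable(PathKey) -> PathRecord
        self._records = {}

    def lookup(self, key):
        """Return the cached record, querying the SA only on a miss."""
        if key not in self._records:
            self._records[key] = self._query_sa(key)
        return self._records[key]

    def __len__(self):
        return len(self._records)
```

With this key, clients that share the same Src->Dst pair and QoS level share one record, so the cache holds at most #Src->Dst * #QoS-levels entries.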

> 
> 
>>* Forgive me for bringing the following issue - over and over to the group:
>>  Multicast Join/Leave should be reference counted. The "SA local agent" could
>>  be the right place for doing this kind of reference counting (actually if it
>>  does that it probably needs to be located in the Kernel - to enable cleanup
>>  after killed processes).
> 
> 
> I agree that this is a problem, but my preference would be for a dedicated
> kernel module to handle multicast join/leave requests.
Since we already sniff into the SA queries, it makes sense to have the same code also handle
other functionality that requires sniffing into the SA requests.
As Hal points out, this involves ServiceRecord, Multicast Join/Leave and InformInfo requests.
Multicast Join/Leave actually behaves like a cache: if a "join" to the same MGID has already taken place
(with no leave yet), then there is no need to send the new request to the SA.
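
The reference counting described above could look roughly like the following Python sketch. It is only a model of the behaviour, not kernel code: MulticastRefCounter, sa_join, sa_leave and cleanup are hypothetical names, and clients are identified by an arbitrary hashable id (for example a process id). Only the first join and the last leave for an MGID reach the SA, and cleanup drops everything held by a killed process.

```python
from collections import Counter


class MulticastRefCounter:
    def __init__(self, sa_join, sa_leave):
        self._sa_join = sa_join    # callable(mgid): sends the join to the SA
        self._sa_leave = sa_leave  # callable(mgid): sends the leave to the SA
        self._refs = {}            # mgid -> Counter mapping client -> joins

    def join(self, client, mgid):
        counter = self._refs.setdefault(mgid, Counter())
        if not counter:
            # First join for this MGID: the only one that reaches the SA.
            self._sa_join(mgid)
        counter[client] += 1

    def leave(self, client, mgid):
        counter = self._refs.get(mgid)
        if counter is None or counter[client] == 0:
            return  # unmatched leave: ignore
        counter[client] -= 1
        if counter[client] == 0:
            del counter[client]
        self._release_if_unused(mgid)

    def cleanup(self, client):
        """Drop all joins held by a client, e.g. after its process was killed."""
        for mgid in list(self._refs):
            if client in self._refs[mgid]:
                del self._refs[mgid][client]
                self._release_if_unused(mgid)

    def _release_if_unused(self, mgid):
        if not self._refs[mgid]:
            # Last reference gone: now the leave is sent to the SA.
            del self._refs[mgid]
            self._sa_leave(mgid)
```

Keeping the counts per client, rather than a single number per MGID, is what makes cleanup after a killed process possible.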
> 
> - Sean
> 



