[ofa-general] [PATCH 0/6] opensm: Unicast Routing Cache

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Mon Oct 6 16:09:46 PDT 2008


Hi Hal,

Hal Rosenstock wrote:
> Hi Yevgeny,
> 
> On Sun, Oct 5, 2008 at 9:26 PM, Yevgeny Kliteynik
> <kliteyn at dev.mellanox.co.il> wrote:
>> Hi Sasha,
>>
>> The following series of 6 patches implements unicast routing cache
>> in OpenSM.
>>
>> This implementation (v2, previous version was sent before OFED 1.3)
>> was rewritten from scratch:
>>  - no caching of existing connectivity
>>  - no caching of existing lid matrices
>>  - each switch has an LFT buffer that contains the result of
>>   the last routing engine execution (instead of one buffer
>>   in ucast_mgr)
>>  - links/ports/nodes changes are spotted during the discovery
>>  - only the links/ports/nodes that went down are cached
>>  - when a switch goes down, its lid matrices and LFT are cached
>>
>> We can use cached routing in any of the following cases:
>>  - there is no topology change
>>  - one or more CAs disappeared
>>  - one or more leaf switches disappeared
>> In these cases cached routing is written to the switches as is
>> (unless the switch doesn't exist).
>> If there is any other topology change, existing cache is invalidated
>> and the routing engine(s) run as usual.
> 
> Glad to see this!
> 
> A few comments/questions:
> 
> It seems that there is an LFT cache per switch. This seems to be a big
> memory penalty to me (in large subnets). So I have two questions
> related to this:
> Can this only be done this way when cached routing is being used ?

Actually, I was thinking about something else:
Currently the switch LFT is implemented as osm_fwd_tbl_t.
I can remove the unnecessary complexity of osm_fwd_tbl_t by replacing
it with a simple uint8_t array (same as the LFT buffer). Then, with a
simple comparison, I can check whether the recently calculated LFT
matches the switch's LFT, and if it does, lft_buf can be freed.
In this case only the switches whose LFT differs from the recently
calculated LFT will hold both tables, which would be rare and
temporary: on the next heavy sweep the LFTs would match, and
lft_buf would be freed.
Effectively, there would be no memory penalty.
This can be done in a separate patch.
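
A minimal sketch of the comparison described above, assuming both the
switch's LFT and lft_buf are plain uint8_t arrays of the same size.
The struct and function names here are hypothetical stand-ins for
illustration only; they are not the real osm_switch_t or any existing
OpenSM function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a switch object
 * (NOT the real osm_switch_t). */
struct sketch_switch {
	size_t lft_size;  /* number of LID entries in each table */
	uint8_t *lft;     /* table currently programmed on the switch */
	uint8_t *lft_buf; /* result of the last routing engine run, or NULL */
};

/*
 * Compare the recently calculated table (lft_buf) with the switch's LFT.
 * If they match, lft_buf is redundant and is released.
 * Returns 1 if lft_buf was freed, 0 otherwise.
 */
static int sketch_release_lft_buf_if_synced(struct sketch_switch *sw)
{
	assert(sw != NULL && sw->lft != NULL);

	if (sw->lft_buf == NULL)
		return 0; /* nothing to release */

	if (memcmp(sw->lft, sw->lft_buf, sw->lft_size) != 0)
		return 0; /* tables differ: keep both for now */

	free(sw->lft_buf);
	sw->lft_buf = NULL;
	return 1;
}
```

With something like this, a switch would hold two tables only while
its programmed LFT differs from the calculated one.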

> Also, when cached routing is being used, is this only needed for leaf switches ?

No, it is needed for all the switches, because the cache can also
handle a fast reset of a non-leaf switch.

> I'm wondering when there is a cached node match whether the available
> peer ports/neighbors are validated (or something equivalent) to know
> caching is valid ? It might also include whether a switch is still a
> leaf switch (which may be redundant as that should show up as a peer
> port/neighbor change). It looks like the structure is there for this
> but I didn't review the code in detail.

If I understood your question correctly, then yes, such validation
is done by the osm_ucast_cache_validate() function.
Can you describe in more detail the case that you are asking about?

> Are you sure all the memory allocation failures are handled properly
> within the routing cache code ? What I mean is that when NULL is
> returned, does this always result in caching not being used and
> routing being recalculated ? Also, in that case, should some log
> message be emitted rather than hiding this ?

I will check it.

> Nit: doc/current-routing.txt should also be updated for this feature.

OK, separate patch.

-- Yevgeny

> -- Hal
> 
>> The patches are:
>>  - patch 1/6: move lft_buf from ucast_mgr to osm_switch
>>  - patch 2/6: Add "-A" or "--ucast_cache" option to opensm
>>  - patch 3/6: adding osm_ucast_cache.{c,h} files (this is
>>   the cache implementation itself)
>>  - patch 4/6: adding new cache files to makefile
>>  - patch 5/6: integrating unicast cache into the discovery
>>   and ucast manager
>>  - patch 6/6: man entry for cached routing
>>
>> -- Yevgeny
> 




