[ofa-general] [PATCH 0/6] opensm: Unicast Routing Cache
Yevgeny Kliteynik
kliteyn at dev.mellanox.co.il
Mon Oct 6 16:09:46 PDT 2008
Hi Hal,
Hal Rosenstock wrote:
> Hi Yevgeny,
>
> On Sun, Oct 5, 2008 at 9:26 PM, Yevgeny Kliteynik
> <kliteyn at dev.mellanox.co.il> wrote:
>> Hi Sasha,
>>
>> The following series of 6 patches implements unicast routing cache
>> in OpenSM.
>>
>> This implementation (v2, previous version was sent before OFED 1.3)
>> was rewritten from scratch:
>> - no caching of existing connectivity
>> - no caching of existing lid matrices
>> - each switch has an LFT buffer that contains the result of
>> the last routing engine execution (instead of one buffer
>> in ucast_mgr)
>> - link/port/node changes are detected during discovery
>> - only the links/ports/nodes that went down are cached
>> - when a switch goes down, its lid matrices and LFT are cached
>>
>> Cached routing can be used in any of the following cases:
>> - there is no topology change
>> - one or more CAs disappeared
>> - one or more leaf switches disappeared
>> In these cases cached routing is written to the switches as is
>> (unless the switch doesn't exist).
>> If there is any other topology change, existing cache is invalidated
>> and the routing engine(s) run as usual.
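The rule quoted above can be sketched roughly as follows. This is only an
illustration of the decision, not code from the patches: the enum, its
values and cache_is_usable() are hypothetical names made up for this
example.

```c
/* Hypothetical classification of what changed since the last sweep.
 * These names are NOT taken from the OpenSM sources. */
enum topo_change_sketch {
	TOPO_CA_GONE      = 1u << 0,	/* one or more CAs disappeared */
	TOPO_LEAF_SW_GONE = 1u << 1,	/* one or more leaf switches disappeared */
	TOPO_OTHER        = 1u << 2	/* any other topology change */
};

/* Returns 1 if the cached routing may be written to the switches as is,
 * 0 if the cache must be invalidated and the routing engine(s) rerun. */
static int cache_is_usable(unsigned changes)
{
	const unsigned tolerated = TOPO_CA_GONE | TOPO_LEAF_SW_GONE;

	/* No change at all (changes == 0) also passes this check. */
	return (changes & ~tolerated) == 0;
}
```

With this sketch, cache_is_usable(0) and cache_is_usable(TOPO_CA_GONE)
both allow the cache, while any value that includes TOPO_OTHER does not.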
>
> Glad to see this!
>
> A few comments/questions:
>
> It seems that there is an LFT cache per switch. This seems to be a big
> memory penalty to me (in large subnets). So I have two questions
> related to this:
> Can this be done this way only when cached routing is being used?
Actually, I was thinking about something else.
Currently the switch LFT is implemented as osm_fwd_tbl_t.
I can remove the unnecessary complexity of osm_fwd_tbl_t by replacing
it with a simple uint8_t array (same as the LFT buffer). Then, with a
simple comparison, I can check whether the recently calculated LFT
matches the switch's LFT, and if it does, lft_buf can be freed.
In this case only the switches whose LFT differs from the recently
calculated LFT will hold both tables, which would be rare and
temporary: on the next heavy sweep the LFTs would match, and
lft_buf would be freed.
Effectively, there would be no memory penalty.
This can be done in a separate patch.
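
A minimal sketch of that idea, assuming both tables are plain uint8_t
arrays of the same size. struct sw_sketch and
sw_release_lft_buf_if_match() are hypothetical names for illustration
only; they are not the real osm_switch_t or any existing OpenSM function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of a switch object. */
struct sw_sketch {
	uint8_t *lft;		/* table currently programmed on the switch */
	uint8_t *lft_buf;	/* result of the last routing engine run, or NULL */
};

/* Free lft_buf when it is identical to the switch's LFT.
 * Returns 1 if the buffer was released, 0 if it was kept (or absent). */
static int sw_release_lft_buf_if_match(struct sw_sketch *sw, size_t lft_size)
{
	if (!sw->lft_buf)
		return 0;
	if (memcmp(sw->lft, sw->lft_buf, lft_size) != 0)
		return 0;	/* tables differ: keep both until the next sweep */

	free(sw->lft_buf);
	sw->lft_buf = NULL;
	return 1;
}
```

Only switches for which this returns 0 with a non-NULL lft_buf would
keep two tables in memory.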
> Also, when cached routing is being used, is this only needed for leaf switches ?
No, it is needed for all the switches, because the cache can also
handle a fast reset of a non-leaf switch.
> I'm wondering when there is a cached node match whether the available
> peer ports/neighbors are validated (or something equivalent) to know
> caching is valid ? It might also include whether a switch is still a
> leaf switch (which may be redundant as that should show up as a peer
> port/neighbor change). It looks like the structure is there for this
> but I didn't review the code in detail.
If I understood your question correctly, then yes, such validation
is done by the osm_ucast_cache_validate() function.
Can you describe in more detail the case that you are asking about?
> Are you sure all the memory allocation failures are handled properly
> within the routing cache code? What I mean is that when NULL is
> returned, does this always result in caching not being used and
> routing being recalculated? Also, in that case, should some log
> message be emitted rather than hiding this?
I will check it.
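
The behaviour being asked about could look roughly like the sketch
below: on allocation failure, log the problem, invalidate the cache, and
let the caller fall back to a full routing calculation. Everything here
is hypothetical (struct cache_sketch, cache_store_lft(), the injectable
allocator, and fprintf() standing in for the OpenSM log); it is not the
code from the patches.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache state for one switch. */
struct cache_sketch {
	int valid;		/* 1 if lft_copy may be used instead of rerouting */
	uint8_t *lft_copy;	/* cached copy of the switch LFT, or NULL */
};

/* The allocator is passed in so that the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t);

/* Store a copy of the LFT in the cache.
 * Returns 0 on success. On allocation failure, logs a message, drops
 * any old copy, marks the cache invalid and returns -1, so the caller
 * knows that routing has to be recalculated. */
static int cache_store_lft(struct cache_sketch *c, const uint8_t *lft,
			   size_t size, alloc_fn alloc)
{
	uint8_t *copy = alloc(size);

	if (!copy) {
		fprintf(stderr, "ucast cache sketch: failed to allocate %zu "
			"bytes, cache invalidated\n", size);
		free(c->lft_copy);
		c->lft_copy = NULL;
		c->valid = 0;
		return -1;
	}

	memcpy(copy, lft, size);
	free(c->lft_copy);
	c->lft_copy = copy;
	c->valid = 1;
	return 0;
}

/* Allocator that always fails, for testing the fallback path. */
static void *failing_alloc(size_t n)
{
	(void)n;
	return NULL;
}
```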
> Nit: doc/current-routing.txt should also be updated for this feature.
OK, separate patch.
-- Yevgeny
> -- Hal
>
>> The patches are:
>> - patch 1/6: move lft_buf from ucast_mgr to osm_switch
>> - patch 2/6: Add "-A" or "--ucast_cache" option to opensm
>> - patch 3/6: adding osm_ucast_cache.{c,h} files (this is
>> the cache implementation itself)
>> - patch 4/6: adding new cache files to makefile
>> - patch 5/6: integrating unicast cache into the discovery
>> and ucast manager
>> - patch 6/6: man entry for cached routing
>>
>> -- Yevgeny