Re: [ofa-general] [PATCH 0/6] opensm: Unicast Routing Cache

Hal Rosenstock hal.rosenstock at gmail.com
Tue Oct 7 06:22:00 PDT 2008


Hi Yevgeny,

On Mon, Oct 6, 2008 at 7:09 PM, Yevgeny Kliteynik
<kliteyn at dev.mellanox.co.il> wrote:
> Hi Hal,
>
> Hal Rosenstock wrote:
>>
>> Hi Yevgeny,
>>
>> On Sun, Oct 5, 2008 at 9:26 PM, Yevgeny Kliteynik
>> <kliteyn at dev.mellanox.co.il> wrote:
>>>
>>> Hi Sasha,
>>>
>>> The following series of 6 patches implements unicast routing cache
>>> in OpenSM.
>>>
>>> This implementation (v2, previous version was sent before OFED 1.3)
>>> was rewritten from scratch:
>>>  - no caching of existing connectivity
>>>  - no caching of existing lid matrices
>>>  - each switch has an LFT buffer that contains the result of
>>>  the last routing engine execution (instead of one buffer
>>>  in ucast_mgr)
>>>  - links/ports/nodes changes are spotted during the discovery
>>>  - only the links/ports/nodes that went down are cached
>>>  - when a switch goes down, its lid matrices and LFT are cached
>>>
>>> In any of the following cases we can use cached routing:
>>>  - there is no topology change
>>>  - one or more CAs disappeared
>>>  - one or more leaf switches disappeared
>>> In these cases cached routing is written to the switches as is
>>> (unless the switch doesn't exist).
>>> If there is any other topology change, existing cache is invalidated
>>> and the routing engine(s) run as usual.
>>
>> Glad to see this!
>>
>> A few comments/questions:
>>
>> It seems that there is a LFT cache per switch. This seems to be a big
>> memory penalty to me (in large subnets). So I have two questions
>> related to this:
>> Can this only be done this way when cached routing is being used ?
>
> Actually, I was thinking about something else:
> Currently we have switch LFT implemented as osm_fwd_tbl_t.
> I can remove the unnecessary complexity of the osm_fwd_tbl_t by replacing
> it with a simple uint8_t array (same as LFT buffer). Then by simple
> comparison I will check whether the recently calculated LFT
> matches the switch's LFT, and if there is a match, then lft_buf
> can be freed. In this case only the switches that have LFT different
> from the recently calculated LFT will have both tables, which would be
> rare and temporary - on the next heavy sweep the LFTs would match, and
> lft_buf would be freed.

Can the forwarding tables be removed ? How would paths be
calculated/walked end to end on an SA PathRecord/MultiPathRecord query
? Would that then require query of the LFTs in the switches ?

> Effectively, it won't have memory penalty.
> It can be done in a separate patch.

I think somehow eliminating the memory penalty is important.
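
To make the compare-and-free idea quoted above concrete, here is a
rough sketch. It is not OpenSM code: struct sw_sketch and
sw_sketch_drop_buf_if_same() are hypothetical names, and it assumes
both tables are plain uint8_t arrays of the same size, as proposed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a switch object; not the real osm_switch_t. */
struct sw_sketch {
	uint8_t *lft;     /* table currently programmed into the switch */
	uint8_t *lft_buf; /* result of the last routing engine run, or NULL */
	size_t lft_size;  /* number of entries in each table */
};

/*
 * If the freshly calculated table equals the switch's table, the extra
 * buffer is redundant and is released.
 * Returns 1 if lft_buf was freed, 0 if it was kept or was not present.
 */
static int sw_sketch_drop_buf_if_same(struct sw_sketch *sw)
{
	assert(sw != NULL);
	if (sw->lft == NULL || sw->lft_buf == NULL)
		return 0;
	if (memcmp(sw->lft, sw->lft_buf, sw->lft_size) != 0)
		return 0; /* tables differ: keep both until the next sweep */
	free(sw->lft_buf);
	sw->lft_buf = NULL;
	return 1;
}
```

With something like this, only switches whose tables differ would hold
two copies, which matches the "rare and temporary" case described above.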

>> Also, when cached routing is being used, is this only needed for leaf
>> switches ?
>
> No, it is needed for all the switches, because the cache can also
> handle a fast reset of a non-leaf switch.

OK; didn't realize that but it makes sense.

>> I'm wondering when there is a cached node match whether the available
>> peer ports/neighbors are validated (or something equivalent) to know
>> caching is valid ? It might also include whether a switch is still a
>> leaf switch (which may be redundant as that should show up as a peer
>> port/neighbor change). It looks like the structure is there for this
>> but I didn't review the code in detail.
>
> If I understood your question correctly, then yes, such validation
> is done by the osm_ucast_cache_validate() function.
> Can you describe in more detail the case that you are asking about?

I'm just wondering about the preconditions to determine that the
cached routing for a node is valid: Is it that the links currently in
port physical state LinkUp are a subset of the cached ones ?
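
In other words, something along these lines. This is only a sketch of
the condition I am asking about: the function name and the bitmask
representation (bit i set means port i is LinkUp, at most 64 ports) are
my own assumptions and are not taken from the patches.

```c
#include <stdint.h>

/*
 * Returns 1 when every port that is LinkUp now was also LinkUp when the
 * cache was built, i.e. the current set is a subset of the cached set.
 * Any bit set in current_up but clear in cached_up means a new link
 * appeared, which the cached routing does not know about.
 */
static int linkup_ports_are_subset(uint64_t current_up, uint64_t cached_up)
{
	return (current_up & ~cached_up) == 0;
}
```

So a node that lost links would still pass, while a node that gained
one would not.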

>> Are you sure all the memory allocation failures are handled properly
>> within the routing cache code ? What I mean is: when NULL is
>> returned, does this always result in the cache not being used and
>> routing being recalculated ? Also, in that case, should a log message
>> be emitted rather than hiding the failure ?
>
> I will check it.

Thanks.
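
For reference, the behaviour I have in mind on allocation failure is
roughly the following. It is a sketch under my own assumptions, not the
patch code: all names here are hypothetical, the allocator is passed in
only so the failure path can be exercised, and fprintf() stands in for
whatever logging call the real code would use.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum route_source { ROUTE_FROM_CACHE, ROUTE_RECALCULATED };

/* Copy a cached table; returns NULL if the allocation fails. */
static uint8_t *copy_cached_lft(const uint8_t *cached, size_t size,
				void *(*alloc_fn)(size_t))
{
	uint8_t *copy = alloc_fn(size);

	if (copy == NULL)
		return NULL;
	memcpy(copy, cached, size);
	return copy;
}

/*
 * Use the cache when the copy succeeds; otherwise log the failure and
 * report that routing has to be recalculated. *out is NULL on fallback.
 */
static enum route_source choose_routing(const uint8_t *cached, size_t size,
					void *(*alloc_fn)(size_t),
					uint8_t **out)
{
	*out = copy_cached_lft(cached, size, alloc_fn);
	if (*out == NULL) {
		fprintf(stderr, "ucast cache: allocation failed, "
			"falling back to full routing recalculation\n");
		return ROUTE_RECALCULATED;
	}
	return ROUTE_FROM_CACHE;
}

/* Allocator that always fails, used only to exercise the fallback. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Calling choose_routing() with malloc takes the cached path; calling it
with failing_alloc takes the logged fallback path.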

-- Hal

>> Nit: doc/current-routing.txt should also be updated for this feature.
>
> OK, separate patch.
>
> -- Yevgeny
>
>> -- Hal
>>
>>> The patches are:
>>>  - patch 1/6: move lft_buf from ucast_mgr to osm_switch
>>>  - patch 2/6: Add "-A" or "--ucast_cache" option to opensm
>>>  - patch 3/6: adding osm_ucast_cache.{c,h} files (this is
>>>  the cache implementation itself)
>>>  - patch 4/6: adding new cache files to makefile
>>>  - patch 5/6: integrating unicast cache into the discovery
>>>  and ucast manager
>>>  - patch 6/6: man entry for cached routing
>>>
>>> -- Yevgeny