[ofa-general] Re: opensm: a bug in heavy sweep? - no LFT re-configuration

Sasha Khapyorsky sashak at voltaire.com
Sun Jul 22 03:22:09 PDT 2007


Hi Eitan,

On 09:36 Sun 22 Jul     , Eitan Zahavi wrote:
> Hi Sasha
> 
> I am running some tests manually and apparently it looks like 
> I found a bug. Here is the sequence of things:
> 1. SM sweeps the fabric and assigns LFTs
> 2. I manually modify some LFTs (a single entry is now marked UNREACHABLE)
> 3. I force some switch change bit to 1 or issue kill -HUP
> 4. The SM reports SUBNET UP
> 5. The modified LFT entry is still UNREACHABLE and the path is broken

Right, in most cases (unless OpenSM has its own changes in the same LFT
block) OpenSM will refer to its own LFT image for the "need to update"
decision, so _manual_ changes will not trigger a new update. Rerunning
OpenSM should help, however.
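
To illustrate the decision described above, here is a minimal sketch in
Python. It is not OpenSM code: the function name and the list-based table
layout are assumptions made for illustration. It only relies on LFT
blocks holding 64 entries and on 0xFF marking an unreachable LID.

```python
LFT_BLOCK_SIZE = 64   # entries per LFT block
NO_PATH = 0xFF        # port value that marks a LID as unreachable


def blocks_to_update(cached_lft, new_lft):
    """Return indices of LFT blocks the SM would rewrite (hypothetical helper).

    The comparison is between the SM's own cached image and the newly
    computed routing. The tables actually held by the switch are never
    consulted. Both lists are assumed to have the same length.
    """
    dirty = []
    for start in range(0, len(new_lft), LFT_BLOCK_SIZE):
        end = start + LFT_BLOCK_SIZE
        if cached_lft[start:end] != new_lft[start:end]:
            dirty.append(start // LFT_BLOCK_SIZE)
    return dirty


# The SM's image after the first sweep: every LID routed through port 1.
cached = [1] * 128

# Someone edits the switch by hand; the SM's cached image does not change.
switch_hw = list(cached)
switch_hw[70] = NO_PATH

# Rerouting yields the same result, so no block is considered dirty
# and the manual change on the switch survives the sweep.
unchanged = blocks_to_update(cached, list(cached))

# If the SM itself changes an entry in the same block (entry 65 here),
# block 1 is rewritten as a whole, which would also repair entry 70.
rerouted = list(cached)
rerouted[65] = 2
same_block = blocks_to_update(cached, rerouted)
```

In this sketch `unchanged` is an empty list and `same_block` contains
only block 1, which matches the "unless OpenSM has its own changes in the
same LFT block" exception.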

> It looks to me some optimization of routing does not fully reroute
> unless some condition is met - but that condition does not include the
> above triggers listed in step 3.

Rereading all of the fabric's LFTs seems too expensive an operation to
do by default. If it is a real requirement, it could be forced
manually, for example when kill -HUP is used. Thoughts?
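
A rough sketch of what such a forced mode could look like, again in
Python and with purely hypothetical names (`blocks_to_write`,
`read_block`, `force_reread` are not existing OpenSM functions or
options). The extra cost is one read per LFT block per switch whenever
the forced mode is on.

```python
LFT_BLOCK_SIZE = 64   # entries per LFT block


def split_blocks(lft):
    """Split a flat LFT into consecutive 64-entry blocks."""
    return [lft[i:i + LFT_BLOCK_SIZE]
            for i in range(0, len(lft), LFT_BLOCK_SIZE)]


def blocks_to_write(cached_lft, new_lft, read_block=None, force_reread=False):
    """Return indices of blocks to rewrite on one switch (hypothetical).

    By default the reference is the SM's cached image. With force_reread
    set, each block is fetched from the switch through read_block(idx)
    and compared instead, so manual edits are detected. Both tables are
    assumed to have the same length.
    """
    cached_blocks = split_blocks(cached_lft)
    dirty = []
    for idx, new_block in enumerate(split_blocks(new_lft)):
        reference = read_block(idx) if force_reread else cached_blocks[idx]
        if reference != new_block:
            dirty.append(idx)
    return dirty
```

Called with `force_reread=True` (for instance from a SIGHUP handler),
this would flag a block that was edited by hand on the switch even
though the cached image and the new routing agree.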

Sasha


