[ofa-general] Re: opensm: a bug in heavy sweep? - no LFT re-configuration

Mon Jul 23 08:30:31 PDT 2007

Hi Sasha,

On 7/22/07, Sasha Khapyorsky <sashak at voltaire.com> wrote:

> On 14:59 Sun 22 Jul     , Eitan Zahavi wrote:
> > Hi Sasha
> >
> > Let's assume someone has reset a switch on the fabric.
> > What would cause the SM to re-assign the LFT of that switch?
>
> OpenSM will sweep and drop this switch, and when the switch comes back it
> will be initialized again. But if the reset was too fast (relative to
> discovery), we can be in trouble (and maybe not only with LFTs).
>
> > I assumed that there is a mechanism to do that.
>
> Not for "fast" switch reboot.
>
> Hmm, I think we could try to detect this by comparing
> SwitchInfo:LinearFDBTop with current p_sw->max_lid_ho or even by seeing
> that PortInfo:LID is not set.

Not sure about checking PortInfo:LID. Wouldn't that approach need to be
qualified by PortState (armed or active)? LinearFDBTop seems better to me,
or perhaps a combination of the two, but I may be missing something.
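
To make the combination concrete, here is a rough sketch of how the two
checks might fit together. It is only an illustration: struct
switch_snapshot, enum port_state and switch_looks_reset() are hypothetical
names, not OpenSM code, and it assumes the SM last programmed LinearFDBTop
to its own max LID. It treats an unset LID as suspicious only when port 0
is armed or active, which is one reading of the PortState qualification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IB logical port states (PortInfo:PortState encoding). */
enum port_state { PORT_DOWN = 1, PORT_INIT = 2, PORT_ARMED = 3, PORT_ACTIVE = 4 };

/* Hypothetical, simplified view of what the SM just read back from a
   switch. This is NOT an OpenSM structure. */
struct switch_snapshot {
    uint16_t linear_fdb_top;      /* SwitchInfo:LinearFDBTop from the switch */
    uint16_t port0_lid;           /* PortInfo:LID of switch port 0 */
    enum port_state port0_state;  /* PortInfo:PortState of switch port 0 */
};

/* Returns true when the switch appears to have lost the state the SM
   programmed earlier. sm_max_lid is the value the SM believes it last
   wrote to LinearFDBTop (an assumption of this sketch). */
static bool switch_looks_reset(const struct switch_snapshot *sw,
                               uint16_t sm_max_lid)
{
    assert(sw != NULL);

    /* Check 1: LinearFDBTop no longer matches what the SM programmed. */
    if (sw->linear_fdb_top != sm_max_lid)
        return true;

    /* Check 2: an unset LID, counted only when the port is armed or
       active; a port still in INIT may legitimately have no LID yet. */
    if ((sw->port0_state == PORT_ARMED || sw->port0_state == PORT_ACTIVE) &&
        sw->port0_lid == 0)
        return true;

    return false;
}
```

With this shape the LinearFDBTop comparison does most of the work and the
LID test is just a second opinion for armed/active ports.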

> Something like below:
>
>
> diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h
> index 5b2b19e..62c072f 100644
> --- a/opensm/include/opensm/osm_switch.h
> +++ b/opensm/include/opensm/osm_switch.h
> @@ -112,6 +112,7 @@ typedef struct _osm_switch
>        osm_fwd_tbl_t                           fwd_tbl;
>        osm_mcast_tbl_t                         mcast_tbl;
>        uint32_t                                discovery_count;
> +       unsigned                                update_ft;
>        void                                    *priv;
> } osm_switch_t;
> /*
> @@ -152,6 +153,10 @@ typedef struct _osm_switch
> *              during the current fabric sweep.  This number is reset
> *              to zero at the start of a sweep.
> *
> +*      update_ft
> +*              When set, fwd tables will be updated regardless of the entry
> +*              values stored locally in the fwd table images
> +*
> * SEE ALSO
> *      Switch object
> *********/
> diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
> index adece65..8bbbcac 100644
> --- a/opensm/opensm/osm_port_info_rcv.c
> +++ b/opensm/opensm/osm_port_info_rcv.c
> @@ -336,6 +336,9 @@ __osm_pi_rcv_process_switch_port(
>       break;
>     }
>   }
> +  else if (port_num == 0 && p_node->sw &&
> +           (!p_pi->base_lid || !p_pi->master_sm_base_lid))
> +    p_node->sw->update_ft = 1;
>
>   /*
>     Update the PortInfo attribute.
> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index b44a3ba..03516ae 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -811,7 +811,8 @@ osm_ucast_mgr_set_fwd_table(
>        osm_switch_get_fwd_tbl_block( p_sw, block_id_ho, block ) ;
>        block_id_ho++ )
>   {
> -    if (!memcmp(block, p_mgr->lft_buf + block_id_ho * 64, 64))
> +    if (!p_sw->update_ft &&
> +        !memcmp(block, p_mgr->lft_buf + block_id_ho * 64, 64))
>       continue;
>
>     if( osm_log_is_active( p_mgr->p_log, OSM_LOG_DEBUG ) )
> @@ -850,6 +851,7 @@ osm_ucast_mgr_set_fwd_table(
>     }
>   }
>
> +  p_sw->update_ft = 0;
>   OSM_LOG_EXIT( p_mgr->p_log );
> }
>
>
>
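
For anyone following along, the effect of the update_ft flag in the last
hunk can be shown with a small standalone sketch. This is a simplified
model, not the OpenSM code: struct lft_image and push_lft() are
hypothetical names, the table is capped at four blocks for brevity, and
"sending" a block is modelled as a memcpy into the cached image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LFT_BLOCK_SIZE 64   /* one LFT block carries 64 one-byte entries */
#define LFT_MAX_BLOCKS 4    /* arbitrary small cap for this sketch */

/* Hypothetical stand-in for the SM's cached copy of one switch's LFT.
   This is NOT osm_switch_t. */
struct lft_image {
    uint8_t  cached[LFT_MAX_BLOCKS * LFT_BLOCK_SIZE];
    size_t   num_blocks;
    unsigned update_ft;     /* when set, push every block unconditionally */
};

/* Pushes the desired table to the switch and returns how many blocks
   were sent. Without update_ft, blocks that already match the cached
   image are skipped, so a change made behind the SM's back is never
   noticed. */
static size_t push_lft(struct lft_image *img, const uint8_t *desired)
{
    size_t sent = 0;

    assert(img != NULL && desired != NULL);
    assert(img->num_blocks <= LFT_MAX_BLOCKS);

    for (size_t b = 0; b < img->num_blocks; b++) {
        uint8_t *cur = img->cached + b * LFT_BLOCK_SIZE;
        const uint8_t *want = desired + b * LFT_BLOCK_SIZE;

        if (!img->update_ft && memcmp(cur, want, LFT_BLOCK_SIZE) == 0)
            continue;   /* cached image says nothing changed */

        /* A real SM would send a Set(LinearForwardingTable) MAD here. */
        memcpy(cur, want, LFT_BLOCK_SIZE);
        sent++;
    }

    img->update_ft = 0;     /* one-shot: cleared after a full push */
    return sent;
}
```

The comparison is against the SM's cached image rather than the switch
itself, which is why the flag has to be set from some outside hint that the
switch lost its tables.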
> BTW what do you think is the best way to detect switch power up? I
> didn't really find a strong requirement for power-up initialization of
> any suitable component.

Peer switch link state change is insufficient to differentiate switch reboot
from "normal" link up/down. There is no IB standard indication of this.

> > Anyway, kill -HUP should flush out the state and restart from scratch.
>
> Thinking more about it, I'm not sure. A similar flush would be required for
> other "stored" components like pkey, sl2vl tables, etc. So it is more
> than just a "regular" heavy sweep; another signal or option could be used
> for this, but OTOH it becomes very close to restarting OpenSM.

Shouldn't this be automatic rather than requiring the admin to issue a
signal somehow?

-- Hal

> Sasha
>
> >
> >
> > Eitan
> >
> > > -----Original Message-----
> > > From: Sasha Khapyorsky [mailto:sashak at voltaire.com]
> > > Sent: Sunday, July 22, 2007 1:22 PM
> > > To: Eitan Zahavi
> > > Cc: OPENIB; hal.rosenstock at gmail.com; Yevgeny Kliteynik
> > > Subject: Re: opensm: a bug in heavy sweep? - no LFT re-configuration
> > >
> > > Hi Eitan,
> > >
> > > On 09:36 Sun 22 Jul     , Eitan Zahavi wrote:
> > > > Hi Sasha
> > > >
> > > > I am running some tests manually and apparently it looks like I found
> > > > a bug. Here is the sequence of things:
> > > > 1. SM sweeps the fabric and assigns LFTs
> > > > 2. I manually modify some LFTs (single entry now marked UNREACHABLE)
> > > > 3. I force some switch change bit to 1 or issue kill -HUP
> > > > 4. The SM reports SUBNET UP
> > > > 5. The modified LFT entry is still UNREACHABLE and the path is broken
> > >
> > > Right, in most cases (unless OpenSM has its own changes in the same LFT
> > > block) OpenSM will refer to its own LFT image for the "need to update"
> > > decision, so _manual_ changes will not trigger a new update.
> > > Rerunning OpenSM should help, however.
> > >
> > > > It looks to me like some routing optimization does not fully reroute
> > > > unless some condition is met, but that condition does not include the
> > > > triggers listed in step 3 above.
> > >
> > > Rereading all of the fabric's LFTs by default seems to be too expensive
> > > an operation. If it is a real requirement, this could be enforced
> > > manually, for example when kill -HUP is used. Thoughts?
> > >
> > > Sasha
> > >
>