[openib-general] [PATCH] osm: Routing Tables are full of UNREACHABLE instead of real route

Eitan Zahavi eitan at mellanox.co.il
Sat Dec 9 06:13:01 PST 2006


Hi Sasha,

Your proposal for moving all "dump" file generation to the end of the sweep -
just before "SUBNET UP" is reported - makes perfect sense to me.

But it is a bit lower in priority than the rest of the stuff.
Not sure if it is worth tackling right now.

Eitan

Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL

> -----Original Message-----
> From: Sasha Khapyorsky [mailto:sashak at voltaire.com]
> Sent: Friday, December 08, 2006 11:55 PM
> To: Eitan Zahavi
> Cc: Hal Rosenstock; Yevgeny Kliteynik; OPENIB GENERAL
> Subject: Re: [PATCH] osm: Routing Tables are full of UNREACHABLE instead of
> real route
> 
> Hi Eitan,
> 
> On 17:12 Thu 07 Dec     , Eitan Zahavi wrote:
> > Hi Hal,
> >
> > I resolved the mystery behind the osm.fdbs that is now full of
> > UNREACHABLE instead of correct out ports.
> >
> > The problem is a consequence of the new code that does not use the
> > switch LFT blocks for the intermediate LFT assignments:
> > The idea of having incremental updates only relies on a temporary buffer
> > that the routing algorithm fills.
> > It is then sent to the wire only if there is a diff between the switch
> > LFT tables (from the SMDB) and the temporary buffer.
> >
> > So the switch LFT tables are not being directly updated by the routing
> > algorithm - but only by the GetResp obtained as reply to the setting.
> > Until this stage of the description - everything looks right.
> >
> > But what is wrong is that the dump of LFT tables is invoked before the
> > GetResp is obtained.
> > So if only a single sweep is invoked the resulting osm.fdbs shows the
> > original state of the SMDB tables, which is full of 0xFF = UNREACHABLE.
> 
> Right.
> 
> >
> > The patch below takes the easy way and should probably be revisited.
> > Instead of having a separate algorithm step for dumping out the
> > resulting GetResp data after all LFT responses were obtained it just
> > copies the sent LFT blocks to the SMDB.
> 
> Would it not be better just to move all dumps to the end of the OpenSM heavy
> sweep? This should be simple, right?
> 
> Sasha
> 
> >
> > I think we need to have at least this simple patch until we have the
> > dump moved to a new algorithm step.
> >
> > Thanks
> > Eitan
> >
> > Signed-off-by:  Eitan Zahavi <eitan at mellanox.co.il>
> >
> > =====================================================================
> >
> > diff --git a/osm/opensm/osm_ucast_mgr.c b/osm/opensm/osm_ucast_mgr.c
> > index 5a55da8..3a62c7f 100644
> > --- a/osm/opensm/osm_ucast_mgr.c
> > +++ b/osm/opensm/osm_ucast_mgr.c
> > @@ -982,7 +982,15 @@ osm_ucast_mgr_set_fwd_table(
> >                "osm_ucast_mgr_set_fwd_table: ERR 3A05: "
> >                "Sending linear fwd. tbl. block failed (%s)\n",
> >                ib_get_err_str( status ) );
> > -    }
> > +    } else {
> > +       /*
> > +         HACK: for now we will assume we succeeded to send
> > +         and set the local DB based on it. This should allow
> > +         us to immediately dump out our routing
> > +       */
> > +       osm_switch_set_ft_block(
> > +          p_sw, p_mgr->lft_buf + block_id_ho * 64, block_id_ho);
> > +        }
> >   }
> >
> >   OSM_LOG_EXIT( p_mgr->p_log );
> >



