[openib-general] [PATCH] osm: Routing Tables are full of UNREACHABLE instead of real route
Eitan Zahavi
eitan at mellanox.co.il
Sat Dec 9 06:13:01 PST 2006
Hi Sasha,
Your proposal to move all "dump" file generation to the end of the sweep -
just before "SUBNET UP" is reported - makes perfect sense to me.
But it is a bit lower in priority than the rest of the stuff.
Not sure if it is worth tackling right now.
Eitan
Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL
> -----Original Message-----
> From: Sasha Khapyorsky [mailto:sashak at voltaire.com]
> Sent: Friday, December 08, 2006 11:55 PM
> To: Eitan Zahavi
> Cc: Hal Rosenstock; Yevgeny Kliteynik; OPENIB GENERAL
> Subject: Re: [PATCH] osm: Routing Tables are full of UNREACHABLE instead of
> real route
>
> Hi Eitan,
>
> On 17:12 Thu 07 Dec , Eitan Zahavi wrote:
> > Hi Hal,
> >
> > I resolved the mystery behind the osm.fdbs that is now full of
> > UNREACHABLE instead of correct out ports.
> >
> > The problem is a consequence of the new code that does not use the
> > switch LFT blocks for the intermediate LFT assignments:
> > The idea of having incremental updates only relies on a temporary buffer
> > that the routing algorithm fills.
> > Then it is sent to the wire only if there is a diff between the switch
> > LFT tables (from the SMDB) and the temporary buffer.
> >
> > So the switch LFT tables are not being directly updated by the routing
> > algorithm - but only by the GetResp obtained as reply to the setting.
> > Until this stage of the description - everything looks right.
> >
> > But what is wrong is that the dump of LFT tables is invoked before the
> > GetResp is obtained.
> > So if only a single sweep is invoked, the resulting osm.fdbs shows the
> > original state of the SMDB tables, which is full of 0xFF = UNREACHABLE.
>
> Right.
>
> >
> > The patch below takes the easy way and should probably be revisited.
> > Instead of having a separate algorithm step for dumping out the
> > resulting GetResp data after all LFT responses were obtained, it just
> > copies the sent LFT blocks to the SMDB.
>
> Would it not be better just to move all dumps to the end of the OpenSM heavy
> sweep? This should be simple, right?
>
> Sasha
>
> >
> > I think we need to have at least this simple patch until the dump is
> > moved to a new algorithm step.
> >
> > Thanks
> > Eitan
> >
> > Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>
> >
> > =====================================================================
> >
> > diff --git a/osm/opensm/osm_ucast_mgr.c b/osm/opensm/osm_ucast_mgr.c
> > index 5a55da8..3a62c7f 100644
> > --- a/osm/opensm/osm_ucast_mgr.c
> > +++ b/osm/opensm/osm_ucast_mgr.c
> > @@ -982,7 +982,15 @@ osm_ucast_mgr_set_fwd_table(
> > "osm_ucast_mgr_set_fwd_table: ERR 3A05: "
> > "Sending linear fwd. tbl. block failed (%s)\n",
> > ib_get_err_str( status ) );
> > - }
> > + } else {
> > + /*
> > + HACK: for now we will assume we succeeded to send
> > + and set the local DB based on it. This should allow
> > + us to immediately dump out our routing
> > + */
> > + osm_switch_set_ft_block(
> > + p_sw, p_mgr->lft_buf + block_id_ho * 64, block_id_ho);
> > + }
> > }
> >
> > OSM_LOG_EXIT( p_mgr->p_log );
> >
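
For readers following the thread, the ordering problem described above can be
reproduced with a small standalone model. This is only a sketch under stated
assumptions, not OpenSM code: struct sketch_switch, sketch_set_fwd_table() and
sketch_dump_port() are hypothetical names, the table is shrunk to four blocks,
and the send is reduced to a counter. The block size of 64 and the 0xFF
"UNREACHABLE" value are taken from the patch and the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LFT_BLOCK_SIZE 64    /* LIDs per LFT block, as in the patch */
#define LFT_BLOCKS     4     /* illustrative table size (assumption) */
#define UNREACHABLE    0xFF

/* Hypothetical stand-in for a switch entry in the SM database (SMDB). */
struct sketch_switch {
    uint8_t smdb_lft[LFT_BLOCKS * LFT_BLOCK_SIZE];  /* what the dump reads */
};

/* A freshly discovered switch has every LID marked UNREACHABLE. */
void sketch_switch_init(struct sketch_switch *sw)
{
    memset(sw->smdb_lft, UNREACHABLE, sizeof(sw->smdb_lft));
}

/*
 * Compare the routing engine's temporary buffer with the SMDB copy and
 * "send" only the blocks that differ. Returns the number of blocks sent.
 *
 * copy_on_send == 0: the SMDB is left untouched; it would be updated
 *                    later, when the GetResp for the block arrives.
 * copy_on_send != 0: the sent block is copied into the SMDB right away,
 *                    which is what the patch does.
 */
int sketch_set_fwd_table(struct sketch_switch *sw, const uint8_t *lft_buf,
                         int copy_on_send)
{
    int sent = 0;

    for (int block = 0; block < LFT_BLOCKS; block++) {
        uint8_t *db = sw->smdb_lft + block * LFT_BLOCK_SIZE;
        const uint8_t *buf = lft_buf + block * LFT_BLOCK_SIZE;

        if (memcmp(db, buf, LFT_BLOCK_SIZE) == 0)
            continue;               /* no diff: nothing goes on the wire */

        sent++;                     /* real code would send the block here */
        if (copy_on_send)
            memcpy(db, buf, LFT_BLOCK_SIZE);
    }
    return sent;
}

/* The out port a dump taken right now would show for the given LID. */
uint8_t sketch_dump_port(const struct sketch_switch *sw, unsigned lid)
{
    assert(lid < LFT_BLOCKS * LFT_BLOCK_SIZE);
    return sw->smdb_lft[lid];
}
```

With copy_on_send set to 0, a dump taken right after the call still reports
UNREACHABLE for a LID the routing engine has just assigned, because the SMDB
copy has not changed yet. With copy_on_send set to 1, the dump shows the
assigned port, at the cost of assuming the send succeeded.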