[openib-general] [PATCH] osm: Routing Tables are full of UNREACHABLE instead of real route
Sasha Khapyorsky
sashak at voltaire.com
Fri Dec 8 13:55:23 PST 2006
Hi Eitan,
On 17:12 Thu 07 Dec , Eitan Zahavi wrote:
> Hi Hal,
>
> I resolved the mystery behind the osm.fdbs that is now full of
> UNREACHABLE instead of correct out ports.
>
> The problem is a consequence of the new code that does not use the
> switch LFT blocks for the intermediate LFT assignments:
> The idea of having incremental updates relies only on a temporary buffer
> that the routing algorithm fills.
> Then it is sent to the wire only if there is a diff between the switch
> LFT tables (from the SMDB) and the temporary buffer.
>
> So the switch LFT tables are not being directly updated by the routing
> algorithm - but only by the GetResp obtained as
> reply to the setting. Until this stage of the description - everything
> looks right.
>
> But what is wrong is that the dump of LFT tables is invoked before the
> GetResp is obtained.
> So if only a single sweep is invoked the resulting osm.fdbs show the
> original state of the SMDB tables, which is full of 0xFF = UNREACHABLE.
Right.
>
> The patch below takes the easy way and should probably be revisited.
> Instead of having a separate algorithm step for dumping out the
> resulting GetResp data after all LFT responses were obtained it just
> copies the sent LFT blocks to the SMDB.
Wouldn't it be better just to move all the dumps to the end of the OpenSM
heavy sweep? This should be simple, right?
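
To illustrate why the ordering matters, here is a minimal toy model in C.
It is only a sketch and not OpenSM code: every name in it (toy_switch,
toy_send_if_changed, toy_handle_response, toy_count_unreachable) is
hypothetical, and the table is shrunk to a single 64-entry block. It
assumes, as described above, that the local table changes only when the
response arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LFT_SIZE    64    /* toy table: a single 64-entry block */
#define UNREACHABLE 0xFF  /* value of an entry that has no route yet */

/* Hypothetical stand-in for a switch record in the SM database. */
struct toy_switch {
    uint8_t lft[LFT_SIZE];      /* local copy, updated only by a response */
    uint8_t pending[LFT_SIZE];  /* block sent on the wire, not yet answered */
    int     has_pending;
};

static void toy_switch_init(struct toy_switch *sw)
{
    assert(sw != NULL);
    memset(sw->lft, UNREACHABLE, sizeof(sw->lft));
    memset(sw->pending, UNREACHABLE, sizeof(sw->pending));
    sw->has_pending = 0;
}

/* Send the routing result only if it differs from the local copy.
 * Returns 1 if something was sent, 0 if the tables already matched. */
static int toy_send_if_changed(struct toy_switch *sw, const uint8_t *new_lft)
{
    if (memcmp(sw->lft, new_lft, LFT_SIZE) == 0)
        return 0;
    memcpy(sw->pending, new_lft, LFT_SIZE);
    sw->has_pending = 1;
    return 1;
}

/* Model the response arriving: only now does the local copy change. */
static void toy_handle_response(struct toy_switch *sw)
{
    if (!sw->has_pending)
        return;
    memcpy(sw->lft, sw->pending, LFT_SIZE);
    sw->has_pending = 0;
}

/* Number of entries a dump of the local copy would show as UNREACHABLE. */
static int toy_count_unreachable(const struct toy_switch *sw)
{
    int n = 0;
    for (int i = 0; i < LFT_SIZE; i++)
        if (sw->lft[i] == UNREACHABLE)
            n++;
    return n;
}
```

In this model, calling toy_count_unreachable() between toy_send_if_changed()
and toy_handle_response() still reports all 64 entries as unreachable, while
calling it after the response reports none. Moving the dump to the end of
the sweep corresponds to the second case.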
Sasha
>
> I think we need to have at least this simple patch until we have the
> dump moved to a new algorithm step.
>
> Thanks
> Eitan
>
> Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>
> =====================================================================
>
> diff --git a/osm/opensm/osm_ucast_mgr.c b/osm/opensm/osm_ucast_mgr.c
> index 5a55da8..3a62c7f 100644
> --- a/osm/opensm/osm_ucast_mgr.c
> +++ b/osm/opensm/osm_ucast_mgr.c
> @@ -982,7 +982,15 @@ osm_ucast_mgr_set_fwd_table(
> "osm_ucast_mgr_set_fwd_table: ERR 3A05: "
> "Sending linear fwd. tbl. block failed (%s)\n",
> ib_get_err_str( status ) );
> - }
> + } else {
> + /*
> + HACK: for now we will assume we succeeded to send
> + and set the local DB based on it. This should allow
> + us to immediately dump out our routing
> + */
> + osm_switch_set_ft_block(
> + p_sw, p_mgr->lft_buf + block_id_ho * 64, block_id_ho);
> + }
> }
>
> OSM_LOG_EXIT( p_mgr->p_log );
>
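
For reference, the block arithmetic in the hunk above can be sketched in
isolation. This is a toy model under stated assumptions, not the OpenSM
implementation: toy_fwd_tbl, toy_set_ft_block and toy_mirror_sent_block are
hypothetical names, NUM_BLOCKS is an arbitrary size picked for the example,
and only the 64-entries-per-block offset is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  64    /* entries per block, matching the "* 64" in the patch */
#define NUM_BLOCKS  4     /* toy table size, chosen only for illustration */
#define UNREACHABLE 0xFF

/* Hypothetical local forwarding table: one out-port byte per LID. */
struct toy_fwd_tbl {
    uint8_t port[NUM_BLOCKS * BLOCK_SIZE];
};

static void toy_tbl_init(struct toy_fwd_tbl *t)
{
    assert(t != NULL);
    memset(t->port, UNREACHABLE, sizeof(t->port));
}

/* Copy one 64-entry block into the local table.
 * Returns 0 on success, -1 if block_id is out of range. */
static int toy_set_ft_block(struct toy_fwd_tbl *t, const uint8_t *block,
                            uint32_t block_id)
{
    assert(t != NULL && block != NULL);
    if (block_id >= NUM_BLOCKS)
        return -1;
    memcpy(t->port + (size_t)block_id * BLOCK_SIZE, block, BLOCK_SIZE);
    return 0;
}

/* Mirror a block into the local table only when the send succeeded,
 * i.e. assume the switch will accept what was sent.
 * lft_buf is the full temporary routing buffer; the block to copy
 * starts at offset block_id * BLOCK_SIZE inside it. */
static int toy_mirror_sent_block(struct toy_fwd_tbl *t, const uint8_t *lft_buf,
                                 uint32_t block_id, int send_ok)
{
    if (!send_ok || block_id >= NUM_BLOCKS)
        return -1;
    return toy_set_ft_block(t, lft_buf + (size_t)block_id * BLOCK_SIZE,
                            block_id);
}
```

The trade-off is visible in toy_mirror_sent_block(): the local table can be
dumped right away, but it reflects what was sent rather than what the switch
confirmed.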