[openib-general] Question
Hal Rosenstock
halr at voltaire.com
Mon Feb 28 15:32:05 PST 2005
On Mon, 2005-02-28 at 18:10, Ronald G. Minnich wrote:
> I do get a bunch of these:
> Bad LinearFDBTop value = 0xC000 on switch 0x2c90108d19820.
That comes from the following in osm_sw_info_rcv.c::osm_si_rcv_process:
  /*
    Hack for bad value in Mellanox switch
  */
  if( cl_ntoh16( p_si->lin_top ) > IB_LID_UCAST_END_HO )
  {
    osm_log( p_rcv->p_log, OSM_LOG_ERROR,
             "osm_si_rcv_process: ERR 3610: "
             "\n\t\t\t\tBad LinearFDBTop value = 0x%X "
             "on switch 0x%" PRIx64 "."
             "\n\t\t\t\tForcing correction to 0x%X.\n",
             cl_ntoh16( p_si->lin_top ),
             cl_ntoh64( osm_node_get_node_guid( p_node ) ),
             0 );
    p_si->lin_top = 0;
  }
where include/iba/ib_types.h has:

  #define IB_LID_UCAST_END_HO 0xBFFF
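
To make the check concrete: 0xC000 is the first LID in the multicast range, so it is one past the largest valid unicast LID (0xBFFF) and trips the test above. Below is a minimal standalone sketch of the same comparison, not the OpenSM code itself. The names sanitize_lin_top and LID_UCAST_END_HO are made up for illustration, and ntohs() stands in for cl_ntoh16() on the assumption that the field arrives in network byte order, as it does in the excerpt.

```c
#include <stdint.h>
#include <arpa/inet.h>  /* ntohs(), htons() */

/* Same value as IB_LID_UCAST_END_HO; renamed here to mark this as a sketch. */
#define LID_UCAST_END_HO 0xBFFF

/*
 * Hypothetical helper: takes LinearFDBTop in network byte order and
 * returns the value to keep, also in network byte order. Anything above
 * the unicast LID range is forced to 0, mirroring the workaround.
 */
static uint16_t sanitize_lin_top(uint16_t lin_top_be)
{
    if (ntohs(lin_top_be) > LID_UCAST_END_HO)
        return 0;           /* out of range, e.g. 0xC000 */
    return lin_top_be;      /* valid unicast LID, keep as reported */
}
```

With this helper, a reported value of 0xC000 comes back as 0, while 0xBFFF and anything below it pass through unchanged.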
So this looks like a workaround for a bug. I'm not sure what the other
symptoms are, but I'm really curious now. Can someone comment further on
this?
At a minimum, the SMA is reporting an invalid value for
SwitchInfo::LinearFDBTop. I wonder if it is also incapable of forwarding
DR MADs. That would explain this.
OpenSM needs a better way of dealing with failures. This need has been
documented. Sounds like this needs to be a high priority item. I will
start to work on this.
Thanks.
-- Hal
>
> ron