[ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of routing engine/algorithm used

Sasha Khapyorsky sashak at voltaire.com
Thu Dec 13 11:23:09 PST 2007


Hi Al,

On 10:42 Thu 13 Dec     , Al Chu wrote:
> > > 
> > > There are several ways this could be fixed.  I decided to easiest was to
> > > stick a new routed_name field + lock into struct osm_routing_engine, and
> > > set/use this new field respectively.
> > 
> > Do we need special lock for this?
> 
> For this particular implementation, I added it to protect atleast the
> thread in the osm console.  The osm console does:
> 
> fprintf(out, "   Routing Engine     : %s\n",
>         p_osm->routing_engine.name ? 
>         p_osm->routing_engine.name : "null (minhop)");
> 
> So we can race while the 'name' could be changed between NULL to non-
> NULL and vice versa.

Sure. I thought about using osm.lock instead or even better sort of
lockless operation where NULL comparison is not needed.

> Another idea I had was to create an enumeration to list all of the
> possible routing algorithms that could be chosen, something like:
> 
> typedef enum {
>    OSM_ROUTING_NOT_ROUTED = 0,
>    OSM_ROUTING_MINHOP = 1,
>    OSM_ROUTING_UPDN = 2,
>    ... etc. ...
> } osm_routing_algorithm_t;
> 
> then put an osm_routing_algorithm_t variable into osm_opensm_t to
> signify what routing algorithm was last used to route the subnet.
> 
> Since no pointers are used, this method would eliminate the need to
> check for a non-null pointer.  The p_osm->lock pointer could be used as
> needed rather than creating a new lock.  It would remove a number of
> string comparisons used throughout the code.

That is great. :)

> Also, this method would be
> useful for when the routing engine chains is developed, since there will
> need to be something to indicate what routing algorithm in the chain was
> used.  All we would have to add is the enumeration plus several
> enum2string and string2enum functions.
> 
> For some reason, I believed that this would have been more invasive than
> the current patch.  But thinking about it now, it seems like it would be
> less invasive.  Would this be a preferred fix to the problem?

I agree with you, it looks like a better solution.

> > > +
> > > +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
> > > +        p_routing_eng->routed_name = routed_name;
> > > +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);
> > 
> > Basically this section is already protected by p_mgr->p_lock (which is
> > reference to &osm.lock).
> 
> Ahh, I wasn't aware it pointed to the same lock.

Right, it is not always easy to figure out - there is a huge pointer
duplication mess over various sub-structures in OpenSM. I think we will
need to clean it up. Hopefully after this OFED...

Sasha



More information about the general mailing list