[ofa-general] Re: [PATCH] opensm/opensm/osm_state_mgr.c: Add check for valid physical port before using pointer.
Ira Weiny
weiny2 at llnl.gov
Tue Nov 18 14:06:08 PST 2008
I am not sure this will fix my bug.
The stack trace in my bug ended with:
#0 osm_vendor_get (h_bind=0x0, mad_size=256, p_vw=0x69bbe8) at
The h_bind was being extracted from the osm_physp_t object. Would this fix
ensure that the h_bind pointer was valid in the osm_physp_t object returned?
I used the "osm_physp_is_valid" function because the port_guid in osm_physp_t
object was only set after port_info returned valid data which would also ensure
that h_bind was set up correctly. That happens through the call path:
osm_pi_rcv_process->osm_physp_init->osm_dr_path_init
Currently osm_node_t->node_info is set when __osm_ni_rcv_process_new calls
osm_node_new.
osm_node_new calls osm_node_init_physp->osm_physp_init->osm_dr_path_init; but
only on the portnum which in the NodeInfo SMP. Perhaps osm_physp_init needs to
be called again as in this patch:
diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
index 20b16d1..5749a66 100644
--- a/opensm/opensm/osm_node_info_rcv.c
+++ b/opensm/opensm/osm_node_info_rcv.c
@@ -785,6 +785,9 @@ __osm_ni_rcv_process_existing(IN osm_sm_t * sm,
break;
}
+ p_node->node_info = *p_ni;
+ osm_node_init_physp(p_node, p_madw);
+
__osm_ni_rcv_set_links(sm, p_node, port_num, p_ni_context);
OSM_LOG_EXIT(sm->p_log);
Thoughts?
Ira
On Tue, 18 Nov 2008 14:30:00 +0200
Sasha Khapyorsky <sashak at voltaire.com> wrote:
> Hi,
>
> On 20:54 Wed 12 Nov , Sasha Khapyorsky wrote:
> > >
> > > I was wondering if it would return invalid ports ever. It would be easy for it
> > > to return only valid ports but perhaps that should be another function to
> > > preserve functionality?
>
> Looked at this. Another problematic place where this function is used is
> osm_sa_link_record.c - there when "any" port becomes invalid (which is
> possible case) it starts an endless recursion :(. So we will need to fix
> the function behavior.
>
> One option is to scan all ports and to return valid one. Another solution
> would be to update locally stored in OpenSM NodeInfo on each receive
> (something like below). Then osm_node_get_any_physp_ptr() will return a
> port where this node was accessed last.
>
> In this way it also could catch potential OtherLocalSetting changes (in
> NodeInfo, such as SystemImageGUID, etc.).
>
> Could anybody see any downsides with such approach?
>
> Sasha
>
>
> diff --git a/opossum/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
> index 20b16d1..7d41cab 100644
> --- a/opensm/opensm/osm_node_info_rcv.c
> +++ b/opensm/opensm/osm_node_info_rcv.c
> @@ -785,6 +785,8 @@ __osm_ni_rcv_process_existing(IN osm_sm_t * sm,
> break;
> }
>
> + p_node->node_info = *p_ni;
> +
> __osm_ni_rcv_set_links(sm, p_node, port_num, p_ni_context);
>
> OSM_LOG_EXIT(sm->p_log);
More information about the general
mailing list