[openib-general] [PATCH] osm: fix segfault due to unprotected access to InformInfo DB
Eitan Zahavi
eitan at mellanox.co.il
Mon Jun 19 12:12:07 PDT 2006
Hi Hal
I have added InformInfo requests to the osmStress simulator flow.
Running it overnight exposed a bug: OpenSM segfaulted in
osm_report_notice. Debugging showed that the following two flows were
missing a lock, so under stress the InformInfo DB could be altered
while the code in osm_report_notice was accessing it.
I have verified that the other flows calling osm_report_notice already
hold the lock.
The fixed code has been running for a while with no crash so far.
Eitan
Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>
Index: opensm/osm_state_mgr.c
===================================================================
--- opensm/osm_state_mgr.c (revision 8113)
+++ opensm/osm_state_mgr.c (working copy)
@@ -1709,6 +1709,7 @@ __osm_state_mgr_report_new_ports(
OSM_LOG_ENTER( p_mgr->p_log, __osm_state_mgr_report_new_ports );
+ CL_PLOCK_ACQUIRE( p_mgr->p_lock );
p_port =
( osm_port_t
* ) ( cl_list_remove_head( &p_mgr->p_subn->new_ports_list ) );
@@ -1759,6 +1760,7 @@ __osm_state_mgr_report_new_ports(
( osm_port_t
* ) ( cl_list_remove_head( &p_mgr->p_subn->new_ports_list ) );
}
+ CL_PLOCK_RELEASE( p_mgr->p_lock );
OSM_LOG_EXIT( p_mgr->p_log );
}
Index: opensm/osm_trap_rcv.c
===================================================================
--- opensm/osm_trap_rcv.c (revision 8113)
+++ opensm/osm_trap_rcv.c (working copy)
@@ -652,7 +652,10 @@ __osm_trap_rcv_process_request(
p_ntci->issuer_gid.unicast.interface_id = p_port->guid;
}
+ /* we need a lock here as the InformInfo DB must be stable */
+ CL_PLOCK_ACQUIRE( p_rcv->p_lock );
status = osm_report_notice(p_rcv->p_log, p_rcv->p_subn, p_ntci);
+ CL_PLOCK_RELEASE( p_rcv->p_lock );
if( status != IB_SUCCESS )
{
osm_log( p_rcv->p_log, OSM_LOG_ERROR,