[openib-general] Re: [PATCH] Opensm - race in opensm signalling

Hal Rosenstock halr at voltaire.com
Mon Oct 31 12:45:44 PST 2005


On Mon, 2005-10-31 at 08:49, Yael Kalka wrote:
> Hi Hal,
> 
> During our Windows testing we've encountered a case where for some
> reason the opensm changes the state of its port to down, and then
> brings it back up.
> After debugging it, we found out that the reason for that is a
> possible race when signaling "OSM_SIGNAL_NO_PENDING_TRANSACTIONS" to
> the osm_state_mgr_process.
> The qp0_mads_outstanding counter is decremented, and only later is it
> checked for reaching zero. So if 2 threads decrement
> qp0_mads_outstanding while running simultaneously, they can both signal
> OSM_SIGNAL_NO_PENDING_TRANSACTIONS!
> This, of course, results in a big mess in the osm_state_mgr_process
> flow.
> The following patch fixes this issue.
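
For readers without the patch at hand, the race described above can be
sketched roughly as follows. This is an illustration in plain C11 atomics,
not the applied patch: struct sm_stats, mad_done_racy and mad_done_atomic
are hypothetical names, and only the qp0_mads_outstanding counter is taken
from the report.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the stats structure; only the counter matters. */
struct sm_stats {
    atomic_uint qp0_mads_outstanding;
};

/*
 * Racy pattern: decrement, then re-read the shared counter.
 * Returns true if the caller should raise the "no pending transactions"
 * signal. A second thread may decrement between the two statements, so
 * both threads can read zero and both return true.
 */
bool mad_done_racy(struct sm_stats *s)
{
    atomic_fetch_sub(&s->qp0_mads_outstanding, 1);
    return atomic_load(&s->qp0_mads_outstanding) == 0;
}

/*
 * Safer pattern: decide from the value returned by the decrement itself.
 * atomic_fetch_sub returns the value held before the subtraction, so
 * exactly one caller sees 1 and only that caller returns true.
 */
bool mad_done_atomic(struct sm_stats *s)
{
    unsigned prev = atomic_fetch_sub(&s->qp0_mads_outstanding, 1);

    assert(prev > 0); /* a zero here would mean an unbalanced decrement */
    return prev == 1;
}
```

With the second form, a caller would raise the signal only when the function
returns true, so two MADs completing at the same time cannot both report
that the count reached zero.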

I saw this at staging in the Linux version too:
Oct 29 18:19:36 894556 [B6F63BB0] -> __osm_state_mgr_signal_error: ERR 3303: Invalid signal OSM_SIGNAL_NO_PENDING_TRANSACTIONS(3) in state OSM_SM_STATE_IDLE.

Thanks. Applied.

-- Hal



