[openib-general] [PATCH] osm: fix a bug in ignoring pending transactions of Light Sweep

Eitan Zahavi eitan at mellanox.co.il
Sat Dec 16 10:56:39 PST 2006


Hi Hal

This patch fixes an issue discovered by the nightly regression.
The OpenSM state machine got stuck because a pending SwitchInfo
transaction was being ignored after one of the SwitchInfo queries
failed (due to a bad link).
The patch below simply avoids aborting the wait for all SwitchInfo 
requests to return.
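
To illustrate the intended behaviour, here is a rough, simplified model of
"wait for every SwitchInfo reply, even if one fails". This is not the actual
OpenSM code: the struct, the enum and on_query_done() are all names made up
for this sketch, and the real state manager tracks outstanding SMPs
differently.

```c
/* Hypothetical, simplified model; none of these names exist in OpenSM. */
enum sig {
    SIG_NONE,                    /* keep waiting */
    SIG_NO_PENDING_TRANSACTIONS  /* every query has returned */
};

struct sweep {
    int pending;          /* SwitchInfo queries still outstanding */
    int change_detected;  /* set once any switch reports a change */
    int failed;           /* queries that returned with an error */
};

/*
 * Called once per query completion, whether it succeeded or not.
 * A failure or a detected change is only recorded here; neither one
 * ends the wait. Completion is signalled only when the last
 * outstanding query has returned.
 */
static enum sig on_query_done(struct sweep *s, int ok, int change_bit)
{
    if (!ok)
        s->failed++;
    else if (change_bit)
        s->change_detected = 1;

    s->pending--;
    return s->pending == 0 ? SIG_NO_PENDING_TRANSACTIONS : SIG_NONE;
}
```

With three queries outstanding, a change reported by the first and a failure
on the second both return SIG_NONE in this model; only the third completion
returns SIG_NO_PENDING_TRANSACTIONS.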

I think this issue might have hurt us in other situations too, since it 
aborted the wait on CHANGE_DETECTED as well.
CHANGE_DETECTED is fired on the first switch that reports the "Change Bit".

It is possible that the issue is showing up now that we have added
incremental support (e.g. for routing), since only if there are no other
SMPs sent during the heavy sweep will we get the
"NO_PENDING_TRANSACTIONS" signal caused by the SwitchInfo requests.

Eitan

Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>

 osm/opensm/osm_state_mgr.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/osm/opensm/osm_state_mgr.c b/osm/opensm/osm_state_mgr.c
index 9eac038..91d9dbd 100644
--- a/osm/opensm/osm_state_mgr.c
+++ b/osm/opensm/osm_state_mgr.c
@@ -2075,11 +2075,10 @@ osm_state_mgr_process(
          case OSM_SIGNAL_CHANGE_DETECTED:
             /*
              * Nothing to do here.  One subnet change typcially
-             * begets another....
+             * begets another.... But needs to wait for all transactions to
+             * complete
              */
-            signal = OSM_SIGNAL_NONE;
             break;
-
          case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
             /*
              * A change was detected on the subnet.