Hi Sasha,<br><br>

<div><span class="gmail_quote">On 7/22/07, <b class="gmail_sendername">Sasha Khapyorsky</b> <<a href="mailto:sashak@voltaire.com">sashak@voltaire.com</a>> wrote:</span></div>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">On 14:59 Sun 22 Jul     , Eitan Zahavi wrote:<br>> Hi Sasha<br>><br>> Let's assume someone has reset a switch on the fabric.

<br>> What would cause the SM to re-assign the LFT of that switch?<br><br>OpenSM will sweep and drop this switch, and when the switch comes back it will<br>be initialized again. But if the reset was too fast (relative to<br>

discovery), we can be in trouble (and maybe not only with LFTs).<br><br>> I assumed that there is a mechanism to do that.<br><br>Not for "fast" switch reboot.<br><br>Hmm, I think we could try to detect this by comparing

<br>SwitchInfo:LinearFDBTop with current p_sw->max_lid_ho or even by seeing<br>that PortInfo:LID is not set.</blockquote>

<div> </div>

<div>Not sure about checking PortInfo:LID. Wouldn't that approach need to be qualified by PortState (armed or active)? LinearFDBTop seems better to me, or perhaps a combination of the two, but I may be missing something.</div>
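
<div>A minimal sketch of how the two checks could be combined, under one possible reading of "qualified by PortState": a zero LID counts only when the port is armed or active. Everything here is hypothetical and standalone (the struct, the enum, and the function names are not OpenSM code); sm_max_lid merely stands in for the p_sw->max_lid_ho value mentioned above.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IB PortState encodings (Down=1, Initialize=2, Armed=3, Active=4). */
enum port_state { PORT_DOWN = 1, PORT_INIT = 2, PORT_ARMED = 3, PORT_ACTIVE = 4 };

/* Hypothetical snapshot of what the SM read back from a switch,
 * next to what the SM itself last programmed. Not an OpenSM type. */
struct sw_snapshot {
    uint16_t base_lid;       /* PortInfo:LID of switch port 0 (host order) */
    uint8_t  port_state;     /* PortInfo:PortState of that port */
    uint16_t linear_fdb_top; /* SwitchInfo:LinearFDBTop as read back */
    uint16_t sm_max_lid;     /* SM's stored value, cf. p_sw->max_lid_ho */
};

/* LID check, qualified by PortState: a zero LID is treated as a
 * reset hint only when the port is armed or active (assumption). */
static bool lid_looks_cleared(const struct sw_snapshot *s)
{
    assert(s != NULL);
    bool qualified = s->port_state == PORT_ARMED ||
                     s->port_state == PORT_ACTIVE;
    return qualified && s->base_lid == 0;
}

/* LinearFDBTop check: the switch reports a different top than the
 * SM believes it programmed. */
static bool fdb_top_mismatch(const struct sw_snapshot *s)
{
    assert(s != NULL);
    return s->linear_fdb_top != s->sm_max_lid;
}

/* Combination of the two: either hint forces a full LFT rewrite. */
static bool needs_lft_rewrite(const struct sw_snapshot *s)
{
    return fdb_top_mismatch(s) || lid_looks_cleared(s);
}
```

<div>In the proposed patch, a true result would correspond to setting p_sw->update_ft so that the memcmp() shortcut in osm_ucast_mgr_set_fwd_table() is bypassed. Whether the two hints should be OR-ed or AND-ed is exactly the open question.</div>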

<div><br> </div>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid"> Something like below:<br><br><br>diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h

<br>index 5b2b19e..62c072f 100644<br>--- a/opensm/include/opensm/osm_switch.h<br>+++ b/opensm/include/opensm/osm_switch.h<br>@@ -112,6 +112,7 @@ typedef struct _osm_switch<br>       osm_fwd_tbl_t                           fwd_tbl;

<br>       osm_mcast_tbl_t                         mcast_tbl;<br>       uint32_t                                discovery_count;<br>+       unsigned                                update_ft;<br>       void                                    *priv;

<br>} osm_switch_t;<br>/*<br>@@ -152,6 +153,10 @@ typedef struct _osm_switch<br>*              during the current fabric sweep.  This number is reset<br>*              to zero at the start of a sweep.<br>*<br>+*      update_ft

<br>+*              When set fwd tables will be updated regardless to entry<br>+*              values locally stored in fwd tables images<br>+*<br>* SEE ALSO<br>*      Switch object<br>*********/<br>diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c

<br>index adece65..8bbbcac 100644<br>--- a/opensm/opensm/osm_port_info_rcv.c<br>+++ b/opensm/opensm/osm_port_info_rcv.c<br>@@ -336,6 +336,9 @@ __osm_pi_rcv_process_switch_port(<br>      break;<br>    }<br>  }<br>+  else if (port_num == 0 && p_node->sw &&

<br>+           (!p_pi->base_lid || !p_pi->master_sm_base_lid))<br>+    p_node->sw->update_ft = 1;<br><br>  /*<br>    Update the PortInfo attribute.<br>diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c

<br>index b44a3ba..03516ae 100644<br>--- a/opensm/opensm/osm_ucast_mgr.c<br>+++ b/opensm/opensm/osm_ucast_mgr.c<br>@@ -811,7 +811,8 @@ osm_ucast_mgr_set_fwd_table(<br>       osm_switch_get_fwd_tbl_block( p_sw, block_id_ho, block ) ;

<br>       block_id_ho++ )<br>  {<br>-    if (!memcmp(block, p_mgr->lft_buf + block_id_ho * 64, 64))<br>+    if (!p_sw->update_ft &&<br>+        !memcmp(block, p_mgr->lft_buf + block_id_ho * 64, 64))<br>      continue;

<br><br>    if( osm_log_is_active( p_mgr->p_log, OSM_LOG_DEBUG ) )<br>@@ -850,6 +851,7 @@ osm_ucast_mgr_set_fwd_table(<br>    }<br>  }<br><br>+  p_sw->update_ft = 0;<br>  OSM_LOG_EXIT( p_mgr->p_log );<br>}<br><br>

<br><br>BTW what do you think is the best way to detect switch power up? I<br>didn't really find a strong requirement for power-up initialization of<br>any suitable component.</blockquote>

<div> </div>

<div>Peer switch link state change is insufficient to differentiate switch reboot from "normal" link up/down. There is no IB standard indication of this. </div>

<div><br> </div>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">> Anyway, kill -HUP should flush out the state and restart from scratch.<br><br>Thinking more about it I'm not sure. Similar flush will be required for

other "stored" components like pkey and sl2vl tables, etc. So it is more than just a "regular" heavy sweep. Another signal or option could be used for this, but OTOH it becomes very close to restarting OpenSM.

</blockquote>

<div> </div>

<div>Shouldn't this be automatic rather than requiring the admin to issue a signal?</div>

<div> </div>

<div>-- Hal</div>

<div> </div><br>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">Sasha<br><br>><br>><br>> Eitan<br>><br>> > -----Original Message-----<br>> > From: Sasha Khapyorsky [mailto:

<a href="mailto:sashak@voltaire.com">sashak@voltaire.com</a>]<br>> > Sent: Sunday, July 22, 2007 1:22 PM<br>> > To: Eitan Zahavi<br>> > Cc: OPENIB; <a href="mailto:hal.rosenstock@gmail.com">hal.rosenstock@gmail.com

</a>; Yevgeny Kliteynik<br>> > Subject: Re: opensm: a bug in heavy sweep? - no LFT re-configuration<br>> ><br>> > Hi Eitan,<br>> ><br>> > On 09:36 Sun 22 Jul     , Eitan Zahavi wrote:<br>> > > Hi Sasha

<br>> > ><br>> > > I am running some tests manually and apparently it looks like I found<br>> > > a bug. Here is the sequence of things:<br>> > > 1. SM sweeps the fabric and assigns LFTs

<br>> > > 2. I manually modify some LFTs (single entry now marked UNREACHABLE)<br>> > > 3. I force some switch change bit to 1 or issue kill -HUP<br>> > > 4. The SM reports SUBNET UP<br>> > > 5. The modified LFT entry is still UNREACHABLE and

<br>> > > the path is broken<br>> ><br>> > Right, in most cases (unless OpenSM has its own changes in the same LFT<br>> > block) OpenSM will refer to its own LFT image for the "need to update"

<br>> > decision, so _manual_ changes will not trigger new update.<br>> > Rerunning OpenSM should help however.<br>> ><br>> > > It looks to me some optimization of routing does not fully reroute

<br>> > > unless some condition is met - but that condition does not include the<br>> > > above triggers listed in step 3.<br>> ><br>> > Rereading all of the fabric's LFTs by default seems to be too

<br>> > expensive an operation. If it is a real<br>> > requirement, this could be enforced manually, for example when<br>> > kill -HUP is used. Thoughts?<br>> ><br>> > Sasha<br>

> ><br></blockquote><br>