<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3132" name=GENERATOR></HEAD>
<BODY>
<DIV><SPAN class=872535617-23072007><FONT face="Palatino Linotype"
color=#0000ff><STRONG>Hi Sasha, Hal,</STRONG></FONT></SPAN></DIV>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff></FONT></STRONG></SPAN> </DIV>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff>I think I have an idea:</FONT></STRONG></SPAN></DIV>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff></FONT></STRONG></SPAN> </DIV>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff>Since it is a specific switch that reported the ChangeBit or Trap,
why can't we just verify that nothing has changed in that switch's
setup?</FONT></STRONG></SPAN></DIV>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff>We could send PortInfo, SwitchInfo, LFT, MFT, SL2VL, VLArb and PKey
queries and make sure nothing has changed from the previous state. Or we could
simply enforce the last state by sending it over again ...</FONT></STRONG></SPAN></DIV>
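<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff>For the LFT part, the "no change from previous state" check could
look roughly like the sketch below. It assumes the SM's cached table and the
table read back from the switch are both available as flat arrays of 64-entry
blocks (the block size used in the quoted patch). The names
lft_find_stale_blocks and LFT_BLOCK_SIZE are made up for illustration and are
not OpenSM APIs.</FONT></STRONG></SPAN></DIV>
<PRE>
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LFT_BLOCK_SIZE 64	/* LFT entries per block, as in the quoted patch */

/*
 * Hypothetical helper (not an OpenSM function): compare the SM's cached
 * LFT image with the table read back from the switch, block by block.
 * Indices of blocks that differ are written to 'stale', which must have
 * room for num_blocks entries. Returns the number of stale blocks, i.e.
 * the blocks that would have to be sent to the switch again.
 */
static size_t lft_find_stale_blocks(const uint8_t *cached,
				    const uint8_t *readback,
				    size_t num_blocks, size_t *stale)
{
	size_t i, n = 0;

	assert(cached && readback && stale);

	for (i = 0; i < num_blocks; i++) {
		if (memcmp(cached + i * LFT_BLOCK_SIZE,
			   readback + i * LFT_BLOCK_SIZE, LFT_BLOCK_SIZE))
			stale[n++] = i;
	}
	return n;
}
```
</PRE>
<DIV><SPAN class=872535617-23072007><STRONG><FONT face="Palatino Linotype"
color=#0000ff>A return value of 0 would mean the switch still holds what the SM
last wrote. The alternative of simply enforcing the last state skips the
read-back and re-sends every block.</FONT></STRONG></SPAN></DIV>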
<DIV> </DIV><!-- Converted from text/rtf format -->
<P><SPAN lang=en-gb><B><I><FONT face="Monotype Corsiva" color=#0000ff
size=6>Eitan Zahavi</FONT></I></B><I></I></SPAN> <BR><SPAN lang=en-gb><FONT
face=Tahoma size=2>Senior Engineering Director, Software Architect</FONT></SPAN>
<BR><SPAN lang=en-gb><FONT face=Tahoma size=2>Mellanox Technologies
LTD</FONT></SPAN> <BR><SPAN lang=en-gb><FONT face=Tahoma
size=2>Tel:+972-4-9097208<BR>Fax:+972-4-9593245</FONT></SPAN> <BR><SPAN
lang=en-gb><FONT face=Tahoma size=2>P.O. Box 586 Yokneam 20692
ISRAEL</FONT></SPAN> </P>
<DIV> </DIV><BR>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Hal Rosenstock
[mailto:hal.rosenstock@gmail.com] <BR><B>Sent:</B> Monday, July 23, 2007 6:31
PM<BR><B>To:</B> Sasha Khapyorsky<BR><B>Cc:</B> Eitan Zahavi; OPENIB; Yevgeny
Kliteynik<BR><B>Subject:</B> Re: opensm: a bug in heavy sweep? - no LFT
re-configuration<BR></FONT><BR></DIV>
<DIV></DIV>Hi Sasha,<BR><BR>
<DIV><SPAN class=gmail_quote>On 7/22/07, <B class=gmail_sendername>Sasha
Khapyorsky</B> <<A
href="mailto:sashak@voltaire.com">sashak@voltaire.com</A>>
wrote:</SPAN></DIV>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">On
14:59 Sun 22 Jul , Eitan Zahavi wrote:<BR>> Hi
Sasha<BR>><BR>> Let's assume someone has reset a switch on the fabric.
<BR>> What would cause the SM to re-assign the LFT of that
switch?<BR><BR>OpenSM will sweep and drop this switch, and when the switch comes
back it will<BR>be initialized again. But if the reset was too fast
(relative to<BR>discovery), we can be in trouble (and maybe not only with
LFTs).<BR><BR>> I assumed that there is a mechanism to do
that.<BR><BR>Not for "fast" switch reboot.<BR><BR>Hmm, I think we could try
to detect this by comparing <BR>SwitchInfo:LinearFDBTop with current
p_sw->max_lid_ho or even by seeing<BR>that PortInfo:LID is not
set.</BLOCKQUOTE>
<DIV> </DIV>
<DIV>Not sure about checking PortInfo:LID. Wouldn't that approach need to be
qualified by PortState (armed or active) ? LFTTop seems better to me or
perhaps a combination of the two but I may be missing something.</DIV>
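<DIV>One possible reading of the combined check, as a sketch only: treat a
LinearFDBTop mismatch as a sign of reset, and treat LID 0 as a sign of reset
only when the port is armed or active. The function name, the parameters and
the exact rule are assumptions for illustration, not OpenSM code; the state
values follow the PortInfo:PortState encoding.</DIV>
<PRE>
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PortInfo:PortState values (Down, Initialize, Armed, Active). */
enum port_state {
	PORT_DOWN = 1,
	PORT_INIT = 2,
	PORT_ARMED = 3,
	PORT_ACTIVE = 4
};

/*
 * Hypothetical check (not an OpenSM function): guess whether a switch
 * lost its configuration behind the SM's back.
 *
 * reported_lft_top - LinearFDBTop returned by the switch
 * expected_lft_top - value the SM believes it programmed
 * port0_lid        - LID reported in PortInfo for port 0
 * port0_state      - port state used to qualify the LID check
 */
static bool switch_looks_reset(uint16_t reported_lft_top,
			       uint16_t expected_lft_top,
			       uint16_t port0_lid,
			       enum port_state port0_state)
{
	/* A top value the SM did not program suggests the LFT was lost. */
	if (reported_lft_top != expected_lft_top)
		return true;

	/*
	 * An unset LID is expected before the port is configured, so it
	 * only counts as suspicious for an armed or active port.
	 */
	if (port0_lid == 0 &&
	    (port0_state == PORT_ARMED || port0_state == PORT_ACTIVE))
		return true;

	return false;
}
```
</PRE>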
<DIV><BR> </DIV>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">Something
like below:<BR><BR><BR>diff --git a/opensm/include/opensm/osm_switch.h
b/opensm/include/opensm/osm_switch.h <BR>index 5b2b19e..62c072f
100644<BR>--- a/opensm/include/opensm/osm_switch.h<BR>+++
b/opensm/include/opensm/osm_switch.h<BR>@@ -112,6 +112,7 @@ typedef struct
_osm_switch<BR>
osm_fwd_tbl_t
fwd_tbl; <BR>
osm_mcast_tbl_t
mcast_tbl;<BR>
uint32_t discovery_count;<BR>+
unsigned update_ft;<BR>
void *priv;
<BR>} osm_switch_t;<BR>/*<BR>@@ -152,6 +153,10 @@ typedef struct
_osm_switch<BR>* during
the current fabric sweep. This number is
reset<BR>* to
zero at the start of a
sweep.<BR>*<BR>+* update_ft
<BR>+* When
set fwd tables will be updated regardless to
entry<BR>+* values
locally stored in fwd tables images<BR>+*<BR>* SEE
ALSO<BR>* Switch
object<BR>*********/<BR>diff --git a/opensm/opensm/osm_port_info_rcv.c
b/opensm/opensm/osm_port_info_rcv.c <BR>index adece65..8bbbcac 100644<BR>---
a/opensm/opensm/osm_port_info_rcv.c<BR>+++
b/opensm/opensm/osm_port_info_rcv.c<BR>@@ -336,6 +336,9 @@
__osm_pi_rcv_process_switch_port(<BR> break;<BR> }<BR> }<BR>+ else
if (port_num == 0 && p_node->sw &&
<BR>+
(!p_pi->base_lid ||
!p_pi->master_sm_base_lid))<BR>+ p_node->sw->update_ft
= 1;<BR><BR> /*<BR> Update the PortInfo
attribute.<BR>diff --git a/opensm/opensm/osm_ucast_mgr.c
b/opensm/opensm/osm_ucast_mgr.c <BR>index b44a3ba..03516ae 100644<BR>---
a/opensm/opensm/osm_ucast_mgr.c<BR>+++ b/opensm/opensm/osm_ucast_mgr.c<BR>@@
-811,7 +811,8 @@
osm_ucast_mgr_set_fwd_table(<BR>
osm_switch_get_fwd_tbl_block( p_sw, block_id_ho, block ) ;
<BR> block_id_ho++
)<BR> {<BR>- if (!memcmp(block,
p_mgr->lft_buf + block_id_ho * 64, 64))<BR>+ if
(!p_sw->update_ft
&&<BR>+ !memcmp(block,
p_mgr->lft_buf + block_id_ho * 64,
64))<BR> continue;
<BR><BR> if( osm_log_is_active( p_mgr->p_log,
OSM_LOG_DEBUG ) )<BR>@@ -850,6 +851,7 @@
osm_ucast_mgr_set_fwd_table(<BR> }<BR> }<BR><BR>+ p_sw->update_ft
= 0;<BR> OSM_LOG_EXIT( p_mgr->p_log );<BR>}<BR><BR><BR><BR>BTW
what do you think is the best way to detect switch power up? I<BR>didn't
really find a strong requirement for power-up initialization of<BR>any
suitable component.</BLOCKQUOTE>
<DIV> </DIV>
<DIV>Peer switch link state change is insufficient to differentiate switch
reboot from "normal" link up/down. There is no IB standard indication of this.
</DIV>
<DIV><BR> </DIV>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">>
Anyway, kill -HUP should flush out the state and restart from
scratch.<BR><BR>Thinking more about it, I'm not sure. A similar flush would be
required for <BR>other "stored" components like pkey, sl2vl tables, etc.
So it is more<BR>than just a "regular" heavy sweep. Another signal or option
could be used<BR>for this, but OTOH it becomes very close to restarting
OpenSM. </BLOCKQUOTE>
<DIV> </DIV>
<DIV>Shouldn't this be automatic rather than requiring the admin to issue a
signal somehow ?</DIV>
<DIV> </DIV>
<DIV>-- Hal</DIV>
<DIV> </DIV><BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">Sasha<BR><BR>><BR>><BR>>
Eitan<BR>><BR>> > -----Original Message-----<BR>> > From:
Sasha Khapyorsky [mailto: <A
href="mailto:sashak@voltaire.com">sashak@voltaire.com</A>]<BR>> >
Sent: Sunday, July 22, 2007 1:22 PM<BR>> > To: Eitan Zahavi<BR>>
> Cc: OPENIB; <A
href="mailto:hal.rosenstock@gmail.com">hal.rosenstock@gmail.com </A>;
Yevgeny Kliteynik<BR>> > Subject: Re: opensm: a bug in heavy sweep? -
no LFT re-configuration<BR>> ><BR>> > Hi Eitan,<BR>>
><BR>> > On 09:36 Sun 22 Jul , Eitan Zahavi
wrote:<BR>> > > Hi Sasha <BR>> > ><BR>> > > I am
running some tests manually and apparently it looks<BR>> > like I
found<BR>> > > a bug. Here is the sequence of things:<BR>> >
> 1. SM sweeps the fabric and assigns LFTs<BR>> > > 2. I manually
modify some LFTs (a single entry is now marked UNREACHABLE)<BR>> > > 3. I
force some switch change bit to 1 or issue kill -HUP<BR>> > > 4. The SM
reports SUBNET UP<BR>> > > 5. The modified LFT entry is still UNREACHABLE and
the path is broken<BR>> ><BR>> > Right, in most cases (unless OpenSM has its
own changes in<BR>> > the same LFT<BR>> > block) OpenSM will
refer to its own LFT image for the "need to update" <BR>> >
decision, so _manual_ changes will not trigger a new update.<BR>> >
Rerunning OpenSM should help however.<BR>> ><BR>> > > It
looks to me some optimization of routing does not fully reroute <BR>>
> > unless some condition is met - but that condition does not<BR>>
> include the<BR>> > > above triggers listed in step 3.<BR>>
><BR>> > Rereading all of the fabric's LFTs by default seems to be too
<BR>> > expensive an operation. If it is a
real<BR>> > requirement, this could be enforced manually, for example
when<BR>> > kill -HUP is used. Thoughts?<BR>> ><BR>> >
Sasha<BR>> ><BR></BLOCKQUOTE><BR></BLOCKQUOTE></BODY></HTML>