Fwd: Re: [openib-general] static LID computation with TS_HOST_DRIVER

Michael Krause krause at cup.hp.com
Thu Sep 30 10:34:29 PDT 2004


At 07:24 PM 9/29/2004, Fab Tillier wrote:
> > From: Michael Krause [mailto:krause at cup.hp.com]
> > Sent: Wednesday, September 29, 2004 5:50 PM
> >
> > The SM is the only entity that is supposed to assign LIDs as well as the
> > subnet prefix.  The SM should not trust any CA / switch configuration that
> > it has not itself configured, and thus should wipe it out and replace it with what
> > it deems best.  As for the subnet merge problem, until the M_Key is sorted
> > out, reassignment isn't an issue per se.
>
>In the case where the SM crashes or is stopped and then restarted and there
>is no failover SM, resetting all LIDs seems rather drastic.  Even the case
>where the SM is stopped, upgraded, and then restarted needs to account for
>situations where the fabric as configured by the previous SM, while fully
>functional, followed a different algorithm than the updated SM code.  I
>don't see how an SM can distinguish between a LID assigned by a "micro-SM"
>embedded on every host system and one assigned by a previous incarnation.
>Resetting every assigned LID just because it can't be trusted would be quite
>disruptive IMO.  If a CA/switch configuration does not cause problems, the
>SM should do its best to keep things from changing so as to minimize the
>impact of SM disruptions on overall fabric operation.

Examine the purpose and associated specification text regarding the 
M_Key.  The SM can distinguish between a locally assigned value and one it 
assigned through the use of the M_Key.  If the SM is reasonably 
robust, it would also implement an SM database to track which CAs / 
switches exist in the fabric, how addressing was assigned, SL / VL 
arbitration, etc.  The SM is supposed to be smart and thus should enable 
recovery.
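
As a rough sketch of that decision, assuming a much simplified model: the
names PortState, SmRecord, and decide_lid are hypothetical and are not part
of any real SM or MAD API, and real M_Key handling also involves protection
bits and a lease period that are ignored here.  The SM keeps a port's LID
only when both the M_Key and the LID match what its own database recorded;
otherwise it treats the configuration as untrusted and assigns a fresh LID.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PortState:
    """What the SM reads back from a port during a sweep (simplified)."""
    guid: int
    lid: int
    m_key: int


@dataclass(frozen=True)
class SmRecord:
    """What the SM database recorded when it last configured the port."""
    lid: int
    m_key: int


def decide_lid(port: PortState, db: Dict[int, SmRecord], fresh_lid: int) -> int:
    """Return the LID the port should end up with.

    Keep the current LID only if the SM database has a record for this
    port and both the M_Key and the LID match that record.  Anything
    else is treated as untrusted and gets fresh_lid instead.
    """
    record = db.get(port.guid)
    if record is not None and port.m_key == record.m_key and port.lid == record.lid:
        return port.lid
    return fresh_lid


# Hypothetical database left behind by the previous SM incarnation.
db = {0x1: SmRecord(lid=5, m_key=0xABC)}

kept = decide_lid(PortState(guid=0x1, lid=5, m_key=0xABC), db, fresh_lid=9)
reset = decide_lid(PortState(guid=0x1, lid=5, m_key=0x0), db, fresh_lid=9)
```

Under these assumptions a restarted SM that still has its database keeps
the LIDs it assigned earlier, while a LID set locally without the SM's
M_Key is replaced.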

As for resetting because it cannot be trusted, that is exactly what the 
IBTA intended.  If in doubt, then reset the values so that one does not 
violate the objective of partitioning and the defined trust domains.  This 
is no different in intent from the growing use of 802.1X for Ethernet 
fabric login.  Customers want to know that the components that are 
communicating have been properly identified and configured to communicate 
within a defined partition / trust domain.  If any component is blindly 
trusted, then that puts the fabric and other components at risk.

Mike 