Fwd: Re: [openib-general] static LID computation with TS_HOST_DRIVER
Michael Krause
krause at cup.hp.com
Thu Sep 30 10:34:29 PDT 2004
At 07:24 PM 9/29/2004, Fab Tillier wrote:
> > From: Michael Krause [mailto:krause at cup.hp.com]
> > Sent: Wednesday, September 29, 2004 5:50 PM
> >
> > The SM is the only entity that is supposed to assign LID as well as the
> > subnet prefix. The SM should not trust any CA / switch configuration if
> > it has not configured it, and thus should wipe it out and replace it with what
> > it deems best. As for the subnet merge problem, until the M_Key is sorted
> > out, reassignment isn't an issue per se.
>
>In the case where the SM crashes or is stopped and then restarted and there
>is no failover SM, resetting all LIDs seems rather drastic. Even the case
>where the SM is stopped, upgraded, and then restarted needs to account for
>situations where the fabric as configured by the previous SM, while fully
>functional, followed a different algorithm than the updated SM code. I
>don't see how an SM can distinguish between a LID assigned by a "micro-SM"
>embedded on every host system and one assigned by a previous incarnation.
>Resetting every assigned LID just because it can't be trusted would be quite
>disruptive IMO. If a CA/switch configuration does not cause problems, the
>SM should do its best to keep things from changing so as to minimize the
>impact of SM disruptions on overall fabric operation.
Examine the purpose and associated specification text regarding the
M_Key. The SM can distinguish between a locally assigned value and one it
assigned through the use of the M_Key. If the SM is reasonably robust, it
would also implement an SM database to track which CAs / switches exist in
the fabric, how addressing was assigned, SL / VL arbitration, etc. The SM is
supposed to be smart and thus should enable recovery.
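
A minimal sketch of the decision described above, in Python. It assumes a
deliberately simplified model in which the SM can compare the M_Key it finds on
a port against its own, and keeps a database keyed by port GUID. The names
PortState, SmRecord, and lid_action are hypothetical and do not come from any
real SM implementation or from the specification.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PortState:
    """What the SM observes on a port during a sweep (simplified, hypothetical)."""
    guid: int
    lid: int
    m_key: int


@dataclass
class SmRecord:
    """What the SM database remembers having assigned to a port (hypothetical)."""
    lid: int


def lid_action(port: PortState, sm_m_key: int, db: Dict[int, SmRecord]) -> str:
    """Return "keep" if the LID can be trusted, otherwise "reassign".

    Assumption: a port whose M_Key matches the SM's own key was configured
    by this SM (or a previous incarnation sharing the key and database).
    """
    # A foreign or missing M_Key means the SM did not configure this port.
    if port.m_key != sm_m_key:
        return "reassign"
    # The key matches, but the database must also agree on the LID.
    record = db.get(port.guid)
    if record is None or record.lid != port.lid:
        return "reassign"
    return "keep"


# Example: one port the SM database already knows about.
db = {0x1: SmRecord(lid=5)}
decision = lid_action(PortState(guid=0x1, lid=5, m_key=0xABC), 0xABC, db)
```

With this shape, a restarted SM that still holds its M_Key and database keeps
existing LIDs, while a LID set by anything else is replaced.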
As for resetting because it cannot be trusted, that is exactly what the
IBTA intended. If in doubt, then reset the values so that one does not
violate the objective of partitioning and the defined trust domains. This
is no different in intent from the growing use of 802.1X for Ethernet
fabric login. Customers want to know that the components that are
communicating have been properly identified and configured to communicate
within a defined partition / trust domain. If any component is blindly
trusted, then that puts the fabric and other components at risk.
Mike