[ofa-general] opensm: Unsupported attribute = 0xFF02

Hal Rosenstock hrosenstock at xsigo.com
Thu Nov 1 06:24:00 PDT 2007


On Thu, 2007-11-01 at 05:56 +0200, Sasha Khapyorsky wrote:
> On 20:41 Wed 31 Oct     , Jason Gunthorpe wrote:
> > 
> > On Thu, Nov 01, 2007 at 02:24:10AM +0200, Sasha Khapyorsky wrote:
> > 
> > > What are the reasons? I think compliant SMs should be able to
> > > inter-operate, though of course not where proprietary extensions are
> > > involved. At least I am able to run OpenSM with Voltaire SM on one
> > > subnet.
> > 
> > At a minimum how hand off is supposed to work is very vaguely
> > specified in the IBA.
> 
> It is at least basically described in the IBA - with exchanging SMInfo.
> 
> > Besides, even if hand off wasn't a problem the two SMs would have to
> > have very similar ideas on routing, multicast, QOS, services, etc
> 
> In the worst case the routing tables and QoS setups could be reconfigured
> from scratch (just as if it were the first SM run), and all SA related
> things could be re-requested with the ClientReregistration bit.

As mentioned in the past, client reregistration is a rather large
hammer. There have been discussions on utilizing this mechanism in more
scenarios (which FWIW is not a good thing IMO). This approach (and it is
optional) pushes the burden back on the end nodes rather than the SM.
Scalability is certainly an issue with it. It was begrudgingly put into
the spec. It was intended only as a stopgap measure.
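
To make the scalability concern concrete, here is a rough back-of-the-envelope
sketch (not from the spec or from OpenSM). It assumes that when the SM sets the
ClientReregistration bit for a port, that port re-sends every multicast join,
event subscription, and service registration it holds to the SA. The function
name and the per-port counts are hypothetical, chosen only for illustration.

```python
def reregistration_load(num_ports, mcast_joins=2, event_subs=1, service_recs=0):
    """Estimate SA requests triggered by asking every port to reregister.

    All per-port counts are illustrative assumptions, not measured values.
    """
    per_port = mcast_joins + event_subs + service_recs
    return num_ports * per_port


# Compare the estimated request count for a few hypothetical fabric sizes.
for ports in (100, 1000, 10000):
    print(ports, "ports ->", reregistration_load(ports), "SA requests")
```

Under these assumptions the number of requests grows linearly with the number
of end ports, and all of that work is generated by the end nodes rather than
recovered by the SM itself.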

There was informative text put into the spec alluding to the
"appropriate" use of this option:

"A reason for the SM doing this might be that the SM suffered a failure
and as a result lost its own records of such subscriptions."
This is referring to a single SM (although that is not the recommended
deployment topology) crashing and being restarted.

IMO a civil SM would not rely on this mechanism.

-- Hal

> And sure, some configurations (partitions, QoS, routing, etc.) may not
> be synchronized between SMs, but then differences in the fabric setup
> should be the expected result.
> 
> And I'm not talking about "how fast and efficient it is", nor about
> "interoperability" bugs in various implementations.
> 
> > or
> > the fabric will be badly disrupted after hand off. Without extensions
> > to transfer this live data over before hand off it is unlikely to
> > be non-disruptive except in very constrained situations.
> > 
> > It seems to me the main benefit of the whole standardized mechanism
> > (in an interoperability context) is just to help make it so that a new
> > sm starting up doesn't just trash the fabric accidentally, and provide
> > at least some sensible behavior when two separate subnets are combined
> > into one.
> > 
> > If you want to test hand over interop, joining two operating networks
> > is a good way to do it - that is really hard to get right in all of
> > the cases :) This was the area where I felt the spec was weakest since
> > it really didn't say exactly when during the hand over exchanges each
> > SM was in control of the nodes, and exactly what should happen when
> > things go wrong was not specified.
> 
> OK, so we are not talking about it being "impossible" to do this... It is
> just that the current lack of standardization makes it hard to do handover
> properly?
> 
> Sasha


