[openib-general] Re: Failed multicast join with new multicast module

Hal Rosenstock halr at voltaire.com
Fri Jun 9 06:52:02 PDT 2006


On Fri, 2006-06-09 at 06:43, Hal Rosenstock wrote:
> On Thu, 2006-06-08 at 18:00, Sean Hefty wrote:
> > Hal Rosenstock wrote:
> > > 2. There is lazy deletion of MC groups allowed so the reclamation may be
> > > difficult.
> > 
> > I'm not familiar with the switch programming.
> 
> Note the MGRPs are MGIDs and switches are programmed with MLIDs and
> these can be 1:1 or many:1 depending on the implementation. Most do not
> do the many:1 but this is allowed by the spec. Also, note that switches
> know nothing about the groups themselves (only MLIDs and which ports) so
> most of the information is in the SM.
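
A rough way to picture that split, as a minimal Python sketch. The class
and method names are hypothetical and are not taken from OpenSM or any
other SM; the only fixed value is 0xC000, the start of the multicast LID
range. The table lives in the SM only, and a switch would see nothing
but the resulting MLID.

```python
MLID_BASE = 0xC000  # first LID in the multicast range


class SmGroupTable:
    """Hypothetical SM-side table mapping MGID -> MLID.

    Switches never see this table; they are programmed per MLID only.
    """

    def __init__(self):
        self.mlid_of = {}          # MGID -> MLID
        self.next_mlid = MLID_BASE

    def create(self, mgid, share_mlid_with=None):
        """Assign an MLID to a group.

        By default every MGID gets its own MLID (1:1). Passing
        share_mlid_with reuses another group's MLID (many:1).
        """
        if mgid in self.mlid_of:
            return self.mlid_of[mgid]
        if share_mlid_with is not None:
            mlid = self.mlid_of[share_mlid_with]
        else:
            mlid = self.next_mlid
            self.next_mlid += 1
        self.mlid_of[mgid] = mlid
        return mlid

    def mgids_on(self, mlid):
        """All groups that currently map onto one MLID."""
        return sorted(m for m, l in self.mlid_of.items() if l == mlid)
```

With many:1, several MGIDs end up behind one MLID, so the switch tables
alone cannot say which groups exist.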
> 
> > Does the SM set the entire 
> > MulticastForwardingTable for a switch every time a new group is created, or a 
> > new member joins?
> 
> No. It only needs to program the affected block(s) of the MFT based on
> the MLID and the portmask (ports for replication).
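
To make "affected block(s)" concrete, here is a small Python sketch of
the arithmetic. It assumes the MFT layout as I understand it from the
spec: MLIDs start at 0xC000, each MFT block holds 32 MLID entries, and
each entry is a 16-bit portmask covering one group of 16 ports (the
"position"). The function name and the tuple it returns are made up for
illustration; check the constants against the spec before relying on
them.

```python
MLID_BASE = 0xC000        # first multicast LID
MLID_MAX = 0xFFFE         # last multicast LID (0xFFFF is the permissive LID)
ENTRIES_PER_BLOCK = 32    # MLID entries per MFT block (assumed layout)
PORTS_PER_MASK = 16       # ports covered by one 16-bit portmask (assumed)


def mft_updates(mlid, ports):
    """Return the MFT writes needed for one MLID on one switch.

    Each result is (block, position, entry, portmask):
      block    - which MFT block holds this MLID
      position - which group of 16 ports the mask covers
      entry    - index of the MLID inside the block
      portmask - bit set for every port that gets a copy
    """
    if not MLID_BASE <= mlid <= MLID_MAX:
        raise ValueError("not a multicast LID: 0x%04x" % mlid)
    block, entry = divmod(mlid - MLID_BASE, ENTRIES_PER_BLOCK)
    masks = {}
    for port in ports:
        position, bit = divmod(port, PORTS_PER_MASK)
        masks[position] = masks.get(position, 0) | (1 << bit)
    return [(block, position, entry, mask)
            for position, mask in sorted(masks.items())]
```

For example, mft_updates(0xC001, [1, 3, 17]) touches only block 0, in
two positions, rather than the whole table.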
> 
> > If the SM loses track of all multicast groups, how are the 
> > stale groups on the switches deleted?
> 
> There are different strategies for dealing with this. It could clear out
> all the MFTs in all the switches but that is expensive. It could also
> wait for multicast registrations and then program the needed MFT blocks
> in the affected switches only caring about those. In this case, packets
> on those MLIDs would still be forwarded until the MLID is reclaimed.
> 
> > > The endport SMAs are claiming they do support client reregistration but
> > > it does take more than that for the endport/node to behave properly.
> > 
> > My original plan was to have the ib_multicast module rejoin all groups, but 
> > since the MLIDs can change I can't see any way to handle reregistration safely 
> > without involving the application.
> 
> Because the application needs to modify the QP for this ? As I said, I'm
> not sure IPoIB was handling this before. I'm sure Roland knows for sure.

It does look to me like the pre-multicast-module IPoIB does leave and
then rejoin on receipt of a client reregister from the SM.

-- Hal

> > My latest changes are just to report errors 
> > on existing multicast groups on an affected port.
> 
> How ?
> 
> > > I know it is a conceptual rather than actual compliance. One issue would
> > > be defining what it means to respect all existing communication. Then we
> > > would need to look at whether that was feasible or not and perhaps
> > > rescope what it means to a set of things achievable. Another issue would
> > > be defining where it is possible or not. If that is totally vendor
> > > dependent, then this would have no substance to it. It is largely a
> > > matter of being a "better" SM.
> > 
> > We could use the phrase, "except where such communication is no longer 
> > realizable" instead of "where possible".  Where unrealizable means impossible 
> > because the communication uses properties that are physically impossible to 
> > achieve given the hardware configuration of the subnet.  (See bottom of page 910 
> > of the spec.)
> 
> That specific text is defined there for the case of unrealizable joins
> which is very different from the case being discussed. The specific
> property mismatches are listed. Still not sure what determines this in
> the case we are discussing. 
> 
> > If an SM could just query switches for their MulticastForwardingTables or the 
> > end nodes,
> 
> It can.
> 
> >  would we be able to avoid these issues?
> 
> How ? Not all the group information is in the switches.
> 
> -- Hal
> 
> > - Sean
> 
> 