[openib-general] Re: Failed multicast join with new multicast module

Hal Rosenstock halr at voltaire.com
Fri Jun 9 13:12:28 PDT 2006


On Fri, 2006-06-09 at 12:46, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > Note the MGRPs are MGIDs and switches are programmed with MLIDs and
> > these can be 1:1 or many:1 depending on the implementation. Most do not
> > do the many:1 but this is allowed by the spec. Also, note that switches
> > know nothing about the groups themselves (only MLIDs and which ports) so
> > most of the information is in the SM.
> 
> Is there any chance that someone using an "old" join can receive data on a group 
> that was created after an SM restart?  

I think so. One can also view this as another aspect of lazy deletion.
Actually the deletion can be so slow as to never occur.

> My guess is that the QP would discard the 
> message unless both the MLID and MGIDs matched, 

That would be my guess too but I'm not sure.

> so there's probably not a real issue here.

> > How? Not all the group information is in the switches.
> 
> It's likely that the end nodes have the mcmember records from previous joins. 
> Isn't that along with the switch information enough to reconstruct the group 
> information?

No. The MCMemberRecord joins don't match the MGIDs to the MLIDs. You
would need more info than that, although it is available.

The other issue is whether you trust the state of the network or not
when the SM comes up. That's sometimes a dangerous proposition.
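
To illustrate the many:1 point, here is a rough sketch in Python. It is a
toy model only, not OpenSM code: the MGID strings, MLID values, port
numbers and function names are all made up for the example. It assumes
the SM holds the MGID-to-MLID assignment while a switch holds only an
MLID-to-ports table.

```python
from collections import defaultdict

# Hypothetical SM-side state: the MLID assigned to each MGID.
# Two groups sharing MLID 0xC001 models the many:1 case.
sm_mgid_to_mlid = {
    "ff12:401b::1": 0xC000,
    "ff12:401b::2": 0xC001,
    "ff12:401b::3": 0xC001,
}

# Hypothetical switch-side state: MLID -> set of output ports.
# Note that no MGID appears anywhere in this table.
switch_mft = {
    0xC000: {1, 4},
    0xC001: {2, 3, 4},
}


def groups_per_mlid(mgid_to_mlid):
    """Invert the SM mapping: MLID -> set of MGIDs that use it."""
    inverse = defaultdict(set)
    for mgid, mlid in mgid_to_mlid.items():
        inverse[mlid].add(mgid)
    return dict(inverse)


def ambiguous_mlids(mgid_to_mlid):
    """MLIDs shared by more than one MGID, in sorted order."""
    inverse = groups_per_mlid(mgid_to_mlid)
    return sorted(mlid for mlid, mgids in inverse.items() if len(mgids) > 1)


if __name__ == "__main__":
    for mlid in ambiguous_mlids(sm_mgid_to_mlid):
        ports = sorted(switch_mft[mlid])
        mgids = sorted(groups_per_mlid(sm_mgid_to_mlid)[mlid])
        print(f"MLID {mlid:#x}: ports {ports} serve groups {mgids}")
```

In this model the switch entry for the shared MLID carries a single port
set for both groups, so the per-group membership cannot be read back out
of the switch table alone.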

-- Hal

> - Sean
