[openib-general] Re: Failed multicast join with new multicast module

Sean Hefty mshefty at ichips.intel.com
Wed Jun 7 17:19:37 PDT 2006


Hal Rosenstock wrote:
>>  This 
>>leads to a race where NonMembers and SendOnlyNonMembers will fail to re-join 
>>until one of the FullMembers joins.
> 
> Might also be true with joins (not creates) from FullMembers too. I
> would presume in such cases, the join would be retried. SendOnlyMembers
> (at least for IPoIB) do this if not joined every time a packet is sent.

Correct.  But every client trying to rejoin a group must be aware of this, and 
must delay and retry until its group has been re-created.
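
As a rough illustration of the delay / retry behaviour described above, here is 
a minimal sketch in C.  It does not use the real verbs or SA interfaces: 
fake_sa, fake_sa_join and join_with_retry are hypothetical names, the SA is 
modelled as a simple countdown, and the 100 ms starting delay and 1600 ms cap 
are assumptions chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SA: the group reappears only after some
 * FullMember has re-created it, modelled here as a countdown of failed joins. */
struct fake_sa {
    int joins_until_group_exists;
};

/* Hypothetical non-creating join (NonMember / SendOnlyNonMember):
 * fails for as long as the group does not exist. */
static bool fake_sa_join(struct fake_sa *sa)
{
    assert(sa != NULL);
    if (sa->joins_until_group_exists > 0) {
        sa->joins_until_group_exists--;
        return false;
    }
    return true;
}

/* Retry the join up to max_attempts times.  Returns the number of attempts
 * used on success, or -1 if the group never appeared.  A real client would
 * sleep for the computed delay between attempts; this sketch only records
 * the delay that would have followed the most recent failure. */
static int join_with_retry(struct fake_sa *sa, int max_attempts,
                           int *last_delay_ms)
{
    int delay_ms = 100;                 /* assumed initial delay */

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (fake_sa_join(sa))
            return attempt;
        *last_delay_ms = delay_ms;      /* where a real client would wait */
        if (delay_ms < 1600)            /* assumed cap on the backoff */
            delay_ms *= 2;
    }
    return -1;
}
```

A caller would initialise a struct fake_sa, call join_with_retry() with an 
attempt limit, and treat -1 as "group still missing, try again later".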

Let me know if I'm off here, but it also appears that clients can't rely on an 
existing QP attachment or address handle to send to the new group.  Even if a 
group is re-created, there's no guarantee that the SA didn't assign a different 
MLID to the group.

So, the only safe thing to do is for all multicast clients to detach from all 
multicast groups, destroy all address handles, possibly wait for a new group to 
be created, and then start all over again.  Is this correct?
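
To make the sequence in question concrete, here is a minimal sketch in C of the 
"tear everything down, then rejoin" approach.  It is only a model: struct 
mcast_group, teardown_all, rejoin_all and fake_join are hypothetical names, not 
part of any existing API, and the MLID values are made up.  The point it shows 
is that the MLID is taken fresh from each successful join and never reused from 
before the teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-group client state. */
struct mcast_group {
    uint16_t mlid;          /* MLID last reported by the SA */
    bool     qp_attached;   /* QP currently attached to the group */
    bool     ah_valid;      /* address handle built from the current MLID */
};

/* Hypothetical join callback: returns true and fills *new_mlid once the
 * SA has the group again, false if the group has not been re-created. */
typedef bool (*join_fn)(size_t index, uint16_t *new_mlid);

/* Step 1: detach from every group and destroy every address handle. */
static void teardown_all(struct mcast_group *g, size_t n)
{
    assert(g != NULL);
    for (size_t i = 0; i < n; i++) {
        g[i].qp_attached = false;
        g[i].ah_valid = false;
    }
}

/* Step 2: rejoin each group; on success adopt the MLID the SA returned,
 * then rebuild the address handle and re-attach.  Groups that cannot be
 * joined yet stay torn down.  Returns the number of groups rejoined. */
static size_t rejoin_all(struct mcast_group *g, size_t n, join_fn join)
{
    size_t joined = 0;

    for (size_t i = 0; i < n; i++) {
        uint16_t mlid;

        if (!join(i, &mlid))
            continue;               /* group not re-created yet: retry later */
        g[i].mlid = mlid;           /* may differ from the old MLID */
        g[i].ah_valid = true;
        g[i].qp_attached = true;
        joined++;
    }
    return joined;
}

/* Example callback for the sketch: group 1 does not exist yet, every other
 * group comes back with a different (made-up) MLID. */
static bool fake_join(size_t index, uint16_t *new_mlid)
{
    if (index == 1)
        return false;
    *new_mlid = (uint16_t)(0xC100 + index);
    return true;
}
```

A client would call teardown_all() when it learns the groups may have been 
lost, then call rejoin_all() repeatedly until it returns the full group count.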

- Sean



