[openib-general] multicast code/merge status

Sean Hefty mshefty at ichips.intel.com
Wed Jan 10 09:40:34 PST 2007


> OK, I understand that adding a send only join param changes the 
> librdmacm/ucma ABI and further that you might be somehow busy to fully 
> implement the sendonly scheme at the multicast code for the 2.6.21 time 
> frame.
> 
> How about adding a sendonly param to the ABI and having the ucma kernel 
> code return -EINVAL if someone tries to set it to true? Such code can 
> be pushed for 2.6.21, and you can complete the implementation when you 
> have the time.

I don't think adding this is a huge deal; I just haven't gotten to it yet. 
However, I'd like to make sure there's enough time once the change is made to 
verify that we have the right result before pushing it upstream.

> Is this what Dotan reported? I recall that the test uses neither librdmacm 
> nor IPoIB, so how does it exercise the kernel ib_sa API at all? I 
> guess it uses libibmad or libibumad to send the joins, etc.

Woody has also seen this issue.  And of course, I can't reproduce it on my 
systems, but I'm actively looking into the problem.  It looks like some sort of 
issue with ipoib trying to join a non-existent multicast group.

> Looking at the code, I understand that if a multicast consumer attempts 
> to join a group that another consumer has already joined, it just gets 
> the group params. That is, the mgid is your discriminator (with the 
> exception of an all-zeros mgid, which is treated differently), which 
> makes much sense to me.

Not exactly.  The rdma_cm consumer gets the group parameters for the ipoib 
broadcast group.  It uses this information as a template for joining new groups.
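
A minimal sketch of the template idea, in plain userspace C. The struct mc_rec
and the function mc_rec_from_template() are hypothetical names made up for
illustration; they are not the rdma_cm or ib_sa definitions, and the choice of
fields is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an MCMemberRecord. */
struct mc_rec {
    uint8_t  mgid[16];
    uint8_t  port_gid[16];
    uint32_t qkey;
    uint16_t pkey;
    uint8_t  mtu;
    uint8_t  rate;
    uint8_t  sl;
    uint8_t  join_state;
};

/*
 * Build a join request for new_mgid: copy every parameter from the
 * broadcast group's record, then replace only the mgid.
 */
void mc_rec_from_template(struct mc_rec *out, const struct mc_rec *bcast,
                          const uint8_t new_mgid[16])
{
    assert(out && bcast && new_mgid);
    *out = *bcast;
    memcpy(out->mgid, new_mgid, sizeof out->mgid);
}
```

With this shape, the new group inherits the qkey, pkey, mtu, rate and sl of the
broadcast group, and the broadcast record itself is left untouched.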

> Going forward with this idea, a cma consumer that wants to use the ipv4 
> broadcast group qkey can join the group and learn the qkey.

One issue is that an rdma_cm consumer can first allocate a UD QP to use with UD 
traffic.  When it later joins a multicast group, the qkey must be the same.  How 
does ipoib handle this?
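
The constraint can be sketched as a simple check. The types ud_qp_info and
mc_group_info and the function check_join_qkey() are hypothetical and exist
only for this illustration; they are not part of the verbs or rdma_cm APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the verbs or rdma_cm types. */
struct ud_qp_info    { uint32_t qkey; };  /* qkey set when the QP was created */
struct mc_group_info { uint32_t qkey; };  /* qkey of the group being joined   */

/*
 * Return 0 if the already-allocated UD QP can be used with the group,
 * or -EINVAL if the two qkeys differ.
 */
int check_join_qkey(const struct ud_qp_info *qp,
                    const struct mc_group_info *grp)
{
    assert(qp && grp);
    return qp->qkey == grp->qkey ? 0 : -EINVAL;
}
```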

> Since for our apps' needs we do intend to join the 224.0.0.1 group, 
> resolving a) above is fine for us: we will join 224.0.0.1, provide 
> the qkey to the rdma cm, and it will join the other group (e.g. 
> 224.5.5.5) with this qkey.
> 
> what do you think?

I'm not completely following you on this yet.

>>     } else {
>>         memset(rec, 0, sizeof *rec);
>>         ib_get_cached_gid(device, port_num, 0, &rec->port_gid);
>>         rec->pkey = 0xFFFF;
>>         get_random_bytes(&rec->qkey, sizeof rec->qkey);
>>         rec->join_state = 1;
> 
> 
> Can you remind me what the idea/trick is here? Aren't you supposed to 
> generate an mgid for this case?

This either returns an existing MCMemberRecord that this node has joined, or it 
fills out an MCMemberRecord that can be used to join a new group.  If the mgid 
is zero, the SA will assign one.
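
A userspace sketch of that either/or logic, mirroring the else branch quoted
above. The struct mc_rec and the function mc_rec_get_or_init() are hypothetical
names for illustration only. The caller passes in the random qkey because
get_random_bytes() and ib_get_cached_gid() are kernel calls that are not
available here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an MCMemberRecord. */
struct mc_rec {
    uint8_t  mgid[16];
    uint8_t  port_gid[16];
    uint32_t qkey;
    uint16_t pkey;
    uint8_t  join_state;
};

/*
 * If this node has already joined a group (joined != NULL), return a copy
 * of that record. Otherwise fill out a record for a new group: the mgid is
 * left all zeros so that the SA assigns one.
 */
void mc_rec_get_or_init(struct mc_rec *rec, const struct mc_rec *joined,
                        const uint8_t port_gid[16], uint32_t random_qkey)
{
    assert(rec && port_gid);
    if (joined) {
        *rec = *joined;
    } else {
        memset(rec, 0, sizeof *rec);          /* mgid stays zero */
        memcpy(rec->port_gid, port_gid, sizeof rec->port_gid);
        rec->pkey = 0xFFFF;                   /* as in the quoted snippet */
        rec->qkey = random_qkey;              /* randomness supplied by caller */
        rec->join_state = 1;                  /* as in the quoted snippet */
    }
}
```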

- Sean



