[openib-general] FW: [PATCH] [RFC] librdmacm: expose device list to users

Tue Jul 25 17:07:11 PDT 2006

Sean Hefty wrote:
> The IB SA does something like this.  If a user creates a multicast group 
> with an MGID of 0, the SA will assign an MGID to the group.  That MGID 
> then somehow needs to be distributed to everyone wanting to join that 
> group.

That's exactly what I want, as long as it's certain the assigned MGID is 
not already in use.

> I'm not aware of any equivalent functionality for IP multicast groups, 
> so I'm not sure if it makes sense to try to provide this functionality 
> through the RDMA CM or expose a raw IB multicast interface.

I'd prefer a raw IB interface - like you said, this isn't really 
analogous to IP, and I'd like to avoid the other non-multicast issue I 
have with RDMA CM.  Also, when I first started looking at IB multicast I 
was expecting this to be part of the ibverbs interface, not a CM.

>>> I would then use ObtainMulticastAddress, then pass the returned address
>>> via OOB to all the peers I want to be in that multicast group.  All the
>>> peers would then use JoinMulticastGroup.
> 
> 
> In order to use what's there, is there any way that the processes can 
> create unique addresses to use?  Maybe map the server port numbers into 
> the address?

Not sure I understand what you're asking.  Addresses to use with what?

> Your interface looks good.  I just need to think more about the details 
> of what approach to use.  I want to find the balance between being easy 
> to use, but provide the necessary capabilities.

Always good :)

> I like the idea that the group is automatically deleted when the last 
> user leaves, since this matches with the IB implementation.  We might 
> also be able to remove the ObtainMulticastAddress, by letting 
> JoinMulticastGroup take a wildcard address.  Can you take a look at the 
> kernel interface in ib_multicast.h, and let me know if exposing that to 
> userspace would work for you?

Part of why I defined ReturnMulticastAddress the way I did was because I 
thought it would be useful to hold on to multicast groups without having 
any peers joined.  These could be kept in a pool for re-use, and have 
peers join/leave them as needed.  The MVAPICH group wrote a paper on a 
similar idea, where they keep a pool of groups with all peers joined; 
when a group is pulled from the pool, any peers not interested in the 
communication leave it.  But if the time cost is in the join and not 
the initial creation, this doesn't solve anything.
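
To make the pooling idea concrete, here is a rough sketch of the 
bookkeeping only.  Everything in it is hypothetical (struct mgid, 
struct mgid_pool, pool_put, pool_get, POOL_CAP); it does not call any 
verbs or SA interface, and it assumes the caller has already created 
the groups and only wants to remember idle MGIDs for re-use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAP 16            /* hypothetical pool capacity */

/* An MGID is a 128-bit GID; this struct is a stand-in for illustration. */
struct mgid {
    uint8_t raw[16];
};

/* Hypothetical pool of idle multicast groups, kept as a simple stack. */
struct mgid_pool {
    struct mgid idle[POOL_CAP];
    size_t n_idle;
};

/* Hand an idle group back to the pool.  Returns false if the pool is
 * full, in which case the caller would release the group instead. */
bool pool_put(struct mgid_pool *p, const struct mgid *g)
{
    assert(p != NULL && g != NULL);
    if (p->n_idle == POOL_CAP)
        return false;
    p->idle[p->n_idle++] = *g;
    return true;
}

/* Take an idle group from the pool.  Returns false if the pool is
 * empty, in which case the caller would have to create a new group. */
bool pool_get(struct mgid_pool *p, struct mgid *out)
{
    assert(p != NULL && out != NULL);
    if (p->n_idle == 0)
        return false;
    *out = p->idle[--p->n_idle];
    return true;
}
```

Whether this buys anything still depends on where the time goes: the 
pool only saves group creation, not the per-peer join.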

ib_multicast.h looks good: lots of functionality packed into very few 
functions.  I don't see any problems with it... yet :)

I like the callback on join completion, as opposed to polling somewhere.

The comments don't say anything about passing an MGID of 0 in - I assume 
this functionality will be there.  Would I pass an MLID of 0 as well, or 
do I need to come up with a valid MLID from somewhere?

Just to make sure, if I pass in an MGID of 0, an MGID will not only be 
allocated, but joined as well?

Again to be clear, ib_free_multicast() will leave the multicast group in 
question?

Is ib_get_mcmember_rec the interface you mentioned for determining 
whether a port is already in a multicast group?

Just thought of a feature that would be nice.  As-is, I have no idea 
when all peers intending to join a multicast group have done so.  What 
would be nice is some sort of notification mechanism - say the ability 
to provide a callback that is called each time a peer joins a multicast 
group.  I already know which peers I expect to join, so I can keep a 
list of which ones have/haven't joined, and mark the multicast group as 
useable when all the expected peers are joined.

Would this be reasonable?  The alternative for me would be for each peer 
to send messages OOB to every other peer in the multicast group when it 
has successfully joined.
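
As a rough sketch of the bookkeeping such a callback would drive: the 
names below (struct join_tracker, tracker_init, tracker_peer_joined, 
MAX_PEERS) are made up for illustration, peers are identified by 
application-assigned integer IDs, and the per-join notification itself 
is assumed, since no such interface exists today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PEERS 64           /* hypothetical upper bound on group size */

/* Hypothetical tracker: one slot per peer expected to join the group. */
struct join_tracker {
    int    expected[MAX_PEERS]; /* application-assigned peer IDs */
    bool   joined[MAX_PEERS];
    size_t n_expected;
    size_t n_joined;
};

void tracker_init(struct join_tracker *t, const int *peers, size_t n)
{
    assert(t != NULL && n <= MAX_PEERS);
    t->n_expected = n;
    t->n_joined = 0;
    for (size_t i = 0; i < n; i++) {
        t->expected[i] = peers[i];
        t->joined[i] = false;
    }
}

/* Would be called from the (assumed) per-join notification callback.
 * Duplicate and unexpected notifications are ignored.  Returns true
 * once every expected peer has joined, i.e. the group is usable. */
bool tracker_peer_joined(struct join_tracker *t, int peer)
{
    for (size_t i = 0; i < t->n_expected; i++) {
        if (t->expected[i] == peer && !t->joined[i]) {
            t->joined[i] = true;
            t->n_joined++;
            break;
        }
    }
    return t->n_joined == t->n_expected;
}
```

The same tracker would work for the OOB alternative; only the source 
of the "peer joined" event changes.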

Andrew