[openib-general] RDMA CM multicast

Sean Hefty sean.hefty at intel.com
Tue Jan 23 15:00:09 PST 2007


Posting to openib-general list...

>RDMA CM has multicast of course, though it seems no means of preventing
>address collisions (to me, that means two separate MPI jobs using the
>same multicast address).  I know that part of the new multicast support
>you had developed a few months ago was the ability to specify a '0'
>MGID/MLID to indicate that an unused multicast address should be used
>and returned.
>
>How hard would it be to add this functionality to RDMA CM?

I looked into this, and it seems doable.  I hacked the kernel rdma_cm to join a
multicast group with an mgid of 0, and it seemed to work as far as I could test
it without more extensive changes.  (My test didn't actually transfer data, but
the join succeeded, the MGID/MLID was exported to userspace, and different
applications joined different groups.)

What would be needed is a way for the user to indicate that they need a unique
address.  An obvious way to accomplish this is for the user to specify an IP
address of 0.0.0.0 when calling rdma_join_multicast().  The user would first
need to bind to a specific device by calling rdma_bind_addr() with a local IP
address.

If more than one group is joined this way, then rdma_leave_multicast() would
need some way to distinguish between the different groups joined by a single
user.  (rdma_leave_multicast takes the IP address of the group to leave.)
Providing a "port number" with the sockaddr would work.  The port number would
need to match when joining/leaving, but is not part of the multicast address,
essentially making it a join index specified by the user.
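
As a rough illustration of the join-index idea, here is a toy model in C of the
bookkeeping it implies: each "unique address" join is recorded under the
user-supplied port number, and a leave looks the group up by that same number.
Everything in it (join_table, join_unique, leave_unique, the counter that hands
out group ids) is hypothetical and stands in for whatever the rdma_cm and the SA
would really do; it is not existing rdma_cm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_JOINS 8

/* Hypothetical record of one "assign me an unused group" join. */
struct join_entry {
    int in_use;
    uint16_t index;     /* the "port number" supplied in the sockaddr */
    uint32_t group_id;  /* stand-in for the MGID/MLID that would be returned */
};

struct join_table {
    struct join_entry entries[MAX_JOINS];
    uint32_t next_group_id;  /* toy allocator, not real SA behavior */
};

void join_table_init(struct join_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_JOINS; i++)
        t->entries[i].in_use = 0;
    t->next_group_id = 1;
}

/*
 * Model of a join on 0.0.0.0 with the given index.  Returns 0 and stores the
 * assigned group id, or -1 if the index is already joined or the table is full.
 */
int join_unique(struct join_table *t, uint16_t index, uint32_t *group_id)
{
    struct join_entry *free_slot = NULL;

    assert(t != NULL && group_id != NULL);
    for (size_t i = 0; i < MAX_JOINS; i++) {
        struct join_entry *e = &t->entries[i];

        if (e->in_use && e->index == index)
            return -1;          /* index must be unique per user */
        if (!e->in_use && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return -1;

    free_slot->in_use = 1;
    free_slot->index = index;
    free_slot->group_id = t->next_group_id++;
    *group_id = free_slot->group_id;
    return 0;
}

/* Model of a leave: the index alone identifies which group to drop. */
int leave_unique(struct join_table *t, uint16_t index)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_JOINS; i++) {
        struct join_entry *e = &t->entries[i];

        if (e->in_use && e->index == index) {
            e->in_use = 0;
            return 0;
        }
    }
    return -1;                  /* nothing was joined under this index */
}
```

Joining with index 0 and index 1 gives two distinct group ids in this model, and
leaving with index 0 removes only the first group, which is the behavior the
port number is meant to provide.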

Your code would look something like this:

rdma_bind_addr(local IP address)
rdma_join_multicast(0.0.0.0, port 0)	<- exchange group info out of band
rdma_join_multicast(0.0.0.0, port 1)	<- exchange group info out of band
send data to a lot of nodes at once
rdma_leave_multicast(0.0.0.0, port 0)
rdma_leave_multicast(0.0.0.0, port 1)
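
For the address argument itself, a minimal sketch using only the standard
sockets headers might look like the following.  The helper names
make_unique_join_addr and is_unique_join_request are made up for illustration,
and treating 0.0.0.0 as "pick an unused group" is only the behavior proposed
here, not something the rdma_cm is known to implement.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Build the sockaddr for the proposed request: IP 0.0.0.0 plus a
 * user-chosen join index carried in the port field (hypothetical helper).
 */
struct sockaddr_in make_unique_join_addr(uint16_t join_index)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);  /* 0.0.0.0 */
    sin.sin_port = htons(join_index);         /* join index, not a real port */
    return sin;
}

/* Returns nonzero if the address asks for an unused group under the proposal. */
int is_unique_join_request(const struct sockaddr_in *sin)
{
    assert(sin != NULL);
    return sin->sin_family == AF_INET &&
           sin->sin_addr.s_addr == htonl(INADDR_ANY);
}
```

The resulting structure would be cast to struct sockaddr * and passed as the
address to rdma_join_multicast(), with the same address and index passed again
to rdma_leave_multicast().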

If this sounds like it would work for you, let me know, and I can create a patch
to test this idea more.

- Sean



