[openib-general] RDMA CM multicast
Sean Hefty
sean.hefty at intel.com
Tue Jan 23 15:00:09 PST 2007
Posting to openib-general list...
>RDMA CM has multicast of course, though it seems no means of preventing
>address collisions (to me, that means two separate MPI jobs using the
>same multicast address). I know that part of the new multicast support
>you had developed a few months ago was the ability to specify a '0'
>MGID/MLID to indicate that an unused multicast address should be used
>and returned.
>
>How hard would it be to add this functionality to RDMA CM?
I looked into this, and it seems doable. I hacked the kernel rdma_cm to join a
multicast group with an mgid of 0, and it seemed to work as far as I could test
it without more extensive changes. (My test didn't actually transfer data, but
the join succeeded, the MGID/MLID was exported to userspace, and different
applications joined different groups.)
What would be needed is a way for the user to indicate that they need a unique
address. An obvious way to accomplish this is for the user to specify an IP
address of 0.0.0.0 when calling rdma_join_multicast(). The user would first
need to bind to a specific device by calling rdma_bind_addr() with a local IP
address.
If more than one group is joined this way, then rdma_leave_multicast() would
need some way to distinguish between the different groups joined by a single
user. (rdma_leave_multicast takes the IP address of the group to leave.)
Providing a "port number" with the sockaddr would work. The port number would
need to match when joining/leaving, but is not part of the multicast address,
essentially making it a join index specified by the user.
Your code would look something like this:
    rdma_bind_addr(local IP address)
    rdma_join_multicast(0.0.0.0, port 0)    <- exchange group info out of band
    rdma_join_multicast(0.0.0.0, port 1)    <- exchange group info out of band
    send data to a lot of nodes at once
    rdma_leave_multicast(0.0.0.0, port 0)
    rdma_leave_multicast(0.0.0.0, port 1)
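To make the "port number as join index" idea concrete, here is a rough sketch of how the sockaddr for such a join might be built and matched on leave. It uses only the standard sockets headers. The helper names (make_unique_mcast_addr, is_unique_mcast_request, same_join) are made up for illustration and are not part of the RDMA CM, and the 0.0.0.0 convention is only the proposal described above, not existing behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: build the proposed "pick an unused group for me"
 * address, i.e. IPv4 0.0.0.0 with a caller-chosen join index stored in the
 * port field. The index is not part of the multicast address itself. */
void make_unique_mcast_addr(struct sockaddr_in *out, uint16_t join_index)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_addr.s_addr = htonl(INADDR_ANY);  /* 0.0.0.0 */
    out->sin_port = htons(join_index);         /* join index, network order */
}

/* Hypothetical helper: nonzero if the address asks for a unique group. */
int is_unique_mcast_request(const struct sockaddr_in *addr)
{
    return addr->sin_family == AF_INET &&
           addr->sin_addr.s_addr == htonl(INADDR_ANY);
}

/* Hypothetical helper: a leave would have to present the same wildcard
 * address and the same join index that were used for the join. */
int same_join(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_family == b->sin_family &&
           a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}
```

The sockaddr built with index 0 would be passed to the first join and later to the matching leave; one built with index 1 would identify the second group. The actual MGID/MLID assigned to each group would still have to be exchanged out of band.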
If this sounds like it would work for you, let me know, and I can create a patch
to test this idea more.
- Sean