[openib-general] multicast

Sean Hefty sean.hefty at intel.com
Wed Jul 12 21:17:49 PDT 2006


>I'm concerned about how rdma_cm abstracts HCAs.  It looks like I can use
>the src_addr argument to rdma_resolve_addr() to select which IP
>address/HCA (assuming one IP per HCA), but how can I enumerate the
>available HCAs?

The HCA / RDMA device abstraction is there for device hotplug, but the verb call
to enumerate HCAs is still usable if you want to get a list of all HCAs in the
system.
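
For reference, the verb call in question is ibv_get_device_list() in
libibverbs.  As a rough, library-free sketch of the same idea, the device names
can also be read from sysfs, since Linux lists registered RDMA devices under
/sys/class/infiniband.  Note this swaps a directory scan in for the verb call;
the function name list_rdma_devices() and the fixed 64-byte name buffers are
assumptions made for illustration only.

```c
#include <dirent.h>
#include <stdio.h>

/* Copy up to max_names entry names found under sysfs_dir into names[].
 * Returns the number of names stored, or -1 if the directory cannot be
 * opened (for example, because no RDMA drivers are loaded). */
int list_rdma_devices(const char *sysfs_dir, char names[][64], int max_names)
{
    DIR *dir = opendir(sysfs_dir);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;

    while (count < max_names && (entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;   /* skip ".", ".." and hidden entries */
        /* snprintf truncates over-long names instead of overflowing */
        snprintf(names[count], 64, "%s", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_rdma_devices("/sys/class/infiniband", names, 8) would then fill
names[] with whatever device names the kernel has registered.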

You will likely have one IP address per port, rather than per HCA.  You probably
want to distinguish between locally assigned IP addresses (those given to ipoib
devices - ib0, etc.) and multicast IP addresses, and verify that your
multicast routing tables direct traffic out of ipoib IP addresses, rather than
Ethernet IP addresses.  IB multicast groups use the same local routing as the
true IP multicast groups.
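
A minimal sketch of that distinction, assuming IPv4 and assuming ipoib
interfaces follow the usual ib0, ib1, ... naming.  Both helper names are made
up for this example, and the interface-name check is only a heuristic.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if text is an IPv4 multicast address (224.0.0.0/4),
 * 0 if it is any other valid IPv4 address, -1 if it does not parse. */
int is_ipv4_multicast(const char *text)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    /* IN_MULTICAST expects the address in host byte order. */
    return IN_MULTICAST(ntohl(addr.s_addr)) ? 1 : 0;
}

/* Heuristic (assumption): ipoib interfaces are named "ib" plus a digit. */
int looks_like_ipoib(const char *ifname)
{
    return strncmp(ifname, "ib", 2) == 0 &&
           ifname[2] >= '0' && ifname[2] <= '9';
}
```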

>This is important for a number of reasons - one, so that I can pass on
>the available IP addresses to MPI peers out of band.  It's also
>important to know which HCA's are available in the system, and to be
>able to select which HCA to use when connecting to a peer.  This allows
>us to implement things like load balancing and failover.

HCA / port selection can be controlled by selecting a specific IP address, and
you can configure your multicast routing tables to direct traffic out of any
desired port.  You should have the same control over using a specific HCA /
port; only the type of address used to identify the port changes.

I might be able to make things a little easier by adding some sort of call that
identifies all RDMA IP addresses in the system.  You could test for this today
by calling rdma_bind_addr() on all IP addresses assigned to the system.  This
doesn't really help with multicast addresses though, since you don't bind to
them...
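
A rough sketch of that test, under some assumptions: it walks the local IPv4
addresses with getifaddrs() and hands each one to a probe callback.  The
rdma_bind_addr() probe is only compiled when USE_RDMACM is defined (a macro
invented for this example) and uses the librdmacm signatures as I understand
them in current releases, which may not match every tree.  All function names
other than the librdmacm and libc calls are hypothetical.

```c
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

#ifdef USE_RDMACM
#include <rdma/rdma_cma.h>
#endif

/* A probe returns nonzero if the address should be reported. */
typedef int (*addr_probe_fn)(const struct sockaddr *addr);

/* Stand-in probe that accepts every address (no RDMA library needed). */
int accept_all_probe(const struct sockaddr *addr)
{
    (void)addr;
    return 1;
}

#ifdef USE_RDMACM
/* Probe that tries to bind a throwaway rdma_cm_id to the address. */
int rdma_bind_probe(const struct sockaddr *addr)
{
    struct rdma_event_channel *channel = rdma_create_event_channel();
    struct rdma_cm_id *id;
    int usable = 0;

    if (channel == NULL)
        return 0;
    if (rdma_create_id(channel, &id, NULL, RDMA_PS_TCP) == 0) {
        /* Treat the address as RDMA capable only if the bind succeeds
         * and the id ends up associated with a device. */
        if (rdma_bind_addr(id, (struct sockaddr *)addr) == 0 &&
            id->verbs != NULL)
            usable = 1;
        rdma_destroy_id(id);
    }
    rdma_destroy_event_channel(channel);
    return usable;
}
#endif

/* Store up to max_out local IPv4 addresses accepted by probe in out[].
 * Returns the number stored, or -1 if the address list is unavailable. */
int find_local_addrs(addr_probe_fn probe, struct sockaddr_in *out, int max_out)
{
    struct ifaddrs *list, *ifa;
    int count = 0;

    if (getifaddrs(&list) != 0)
        return -1;
    for (ifa = list; ifa != NULL && count < max_out; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;   /* skip interfaces without an IPv4 address */
        if (probe(ifa->ifa_addr))
            out[count++] = *(struct sockaddr_in *)ifa->ifa_addr;
    }
    freeifaddrs(list);
    return count;
}
```

Built with -DUSE_RDMACM and linked against librdmacm, passing rdma_bind_probe
instead of accept_all_probe should narrow the list to addresses that bind to
an RDMA device.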

I'm not clear on what you mean about passing available IP addresses to MPI
peers, or why it's done out of band.  Are you talking about IP addresses of the
local ipoib devices?  Multicast IP addresses?  By out of band, do you mean over
a socket, as opposed to an IB connection?

>Matt Leininger suggested looking at the IB CM as an alternative, as it
>gives more low-level control.  Am I missing something, or does the IB CM
>not handle multicast like the RDMA CM?

IB multicast groups require SA interaction, and are not associated with the IB
CM.  What control do you feel that the RDMA CM is missing?

- Sean



