[openib-general] IPoIB broadcast MC group membership

Fabian Tillier ftillier at silverstorm.com
Wed Feb 22 09:53:34 PST 2006


On 2/22/06, Greg Lindahl <lindahl at pathscale.com> wrote:
>
> On Tue, Feb 21, 2006 at 11:40:53PM -0800, Fabian Tillier wrote:
>
> > You'd have to make the group 1X.  Note that the group being 1X doesn't
> > limit unicast traffic to 1X rates, since the rate for unicast traffic
> > would be set based on the rate reported in the path records for the
> > various endpoints.
> >
> > So 4X SDR and 4X DDR nodes would have to set their inter-packet delay
> > for the broadcast group to end up with a 1X packet injection rate.
>
> So, basically, MVAPICH doesn't have code that does either the group
> creation properly when there is a mixture of HCA bandwidths, or limit
> the packet injection rate. And IPoIB could violate this rule depending
> on how user programs use it, e.g. if I did a lot of broadcasting, I
> could easily exceed 1X's bandwidth.
>
> So this is more than just a "fix OpenSM" issue. It's more of a "fix
> the spec" issue, if I'm understanding it correctly.

No, the spec is fine.  This is a "fix the SW" issue.  If OpenSM
rejected join requests from nodes for which the MC group is
unrealizable (that is, some setting of the requestor, such as the
rate, conflicts with the existing group), such nodes would not be
able to join the broadcast group and thus would not have IPoIB
connectivity.
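
To make the "unrealizable" check concrete, here is a rough sketch of
the kind of test an SA could apply at join time.  This is not OpenSM
code; the struct and function names are made up for illustration, and
the rate units (multiples of 2.5 Gb/s) and the inclusion of MTU in the
check are simplifying assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the fields that matter here.
 * Rates are in multiples of 2.5 Gb/s: 1X SDR = 1, 4X SDR = 4, 4X DDR = 8. */
struct mc_group {
    unsigned rate;  /* rate every member must be able to sustain */
    unsigned mtu;   /* MTU in bytes every member must support */
};

struct join_req {
    unsigned port_rate;  /* link rate of the requesting port */
    unsigned port_mtu;   /* largest MTU the requesting port supports */
};

/* Returns true if the requester can realize the group's parameters.
 * An SA following this policy would reject the join otherwise. */
static bool join_is_realizable(const struct mc_group *grp,
                               const struct join_req *req)
{
    assert(grp != NULL && req != NULL);
    return req->port_rate >= grp->rate && req->port_mtu >= grp->mtu;
}
```

With a 4X group, a 1X port fails this check and is refused, while 4X
SDR and 4X DDR ports pass.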

When the SA responds to the MC join request, the response includes the
rate.  The recipient of the response should create an address vector
for the MC group that takes the rate into account, which causes the
hardware to honor the injection rate so as not to flood the group.  I
haven't looked at MVAPICH, so I can't tell you if what it does is
correct.  IPoIB does seem to do the right thing, though.
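
As a rough sketch of what "takes the rate into account" means, the
code below copies the rate from a join response into an address vector
and derives an inter-packet delay from it.  The structs and functions
are hypothetical stand-ins, not a real verbs API (in libibverbs the
corresponding field would be static_rate in struct ibv_ah_attr).  The
delay model is a simplifying assumption: after each packet the port
idles for ipd packet times, so the effective rate is
link_rate / (ipd + 1).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins.  Rates are in multiples of 2.5 Gb/s:
 * 1X SDR = 1, 4X SDR = 4, 4X DDR = 8. */
struct mc_join_resp {
    uint8_t  mgid[16];  /* multicast GID of the group */
    uint16_t mlid;      /* multicast LID assigned by the SM */
    unsigned rate;      /* group rate reported by the SA */
};

struct mc_addr_vec {
    uint8_t  dgid[16];
    uint16_t dlid;
    unsigned rate;      /* rate the hardware should not exceed */
    unsigned ipd;       /* idle packet times after each packet sent */
};

/* Smallest delay that keeps link_rate / (ipd + 1) <= target_rate. */
static unsigned ipd_for_rate(unsigned link_rate, unsigned target_rate)
{
    assert(target_rate > 0);
    if (link_rate <= target_rate)
        return 0;  /* link is no faster than the group: no throttling */
    /* Round the ratio up so the effective rate never exceeds the target. */
    return (link_rate + target_rate - 1) / target_rate - 1;
}

/* Fill an address vector for the group from the SA's join response. */
static void build_mc_av(const struct mc_join_resp *resp,
                        unsigned link_rate, struct mc_addr_vec *av)
{
    memcpy(av->dgid, resp->mgid, sizeof av->dgid);
    av->dlid = resp->mlid;
    av->rate = resp->rate;
    av->ipd  = ipd_for_rate(link_rate, resp->rate);
}
```

Under that model, a 4X SDR port sending to a 1X group ends up with an
ipd of 3 and a 4X DDR port with an ipd of 7.  Unicast address vectors
are built separately from path records, so they are not affected.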

- Fab


