[ofa-general] Multicast Performance
Hal Rosenstock
hrosenstock at xsigo.com
Thu May 29 07:49:13 PDT 2008
Hi Marcel,
On Thu, 2008-05-29 at 16:32 +0200, Marcel Heinz wrote:
> Hi,
>
> Hal Rosenstock wrote:
> > On Thu, 2008-05-29 at 15:35 +0200, Marcel Heinz wrote:
> >>Hal Rosenstock wrote:
> >>>On Thu, 2008-05-29 at 11:19 +0200, Marcel Heinz wrote:
> >>>>Especially if I take into account
> >>>>that with my own benchmark, I can get ~950MB/s when I start another
> >>>>receiver on the same host as the sender. Note that both of the
> >>>>receivers, the local and the remote one, are seeing all packets at that
> >>>>rate, so the HCAs and the switch must be able to handle multicast
> >>>>packets with this throughput.
> >>>
> >>>
> >>>Perhaps this is a static rate issue.
> >>>
> >>>What SM is being used ?
> >>
> >>It's OpenSM 3.1.7. I had also made some tests with OpenSM 3.2.1, but
> >>this didn't change anything.
> >
> >
> > Can you validate either the PathRecord or MCMemberRecord returned or the
> > static rate applied to the multicast QP in the various scenarios ? If it
> > is the same, this is not the problem but if it's different then we're on
> > to something here.
> >
>
> This is what happened:
>
> 1. The server on host B is started and creates the MC group, OpenSM
> returns:
>
> | May 29 15:54:34 699610 [B6D71B90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abdd
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x21
> | ProxyJoin...............0x0
>
> 2. The client on host A is started and joins the group as
> SendOnlyNonMember, OpenSM returns:
>
> | May 29 15:54:45 381972 [B5D6FB90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x4
> | ProxyJoin...............0x0
>
> Now I have 255MB/s between host A and B.
>
> 3. I start another server on host A; it joins the group and
> OpenSM returns:
>
> | May 29 15:54:56 129971 [B6570B90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x25
> | ProxyJoin...............0x0
>
> Now, all 3 instances measure 950MB/s throughput.
>
> The returned MCMember Records are absolutely identical except
> for the PortGid and the membership state.
Rate 0x86 is exactly 20 Gbps (selector 2 = "exactly", rate value 6 = 20 Gbps).
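
For reference, here is a minimal sketch of how the rate and mtu bytes in the dumps above can be decoded. It assumes the usual MCMemberRecord layout, with a 2-bit selector in the top bits and a 6-bit encoded value in the low bits. The function names and lookup tables are illustrative only and are not part of OpenSM or any verbs library.

```python
# Illustrative decoder for the selector/value bytes in an MCMemberRecord dump.
# Assumption: top 2 bits are the selector, low 6 bits are the encoded value.

SELECTORS = {0: "greater than", 1: "less than", 2: "exactly", 3: "largest available"}

# Encoded rate values (Gbps) and MTU values (bytes) as used in SA records.
RATES_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}


def split_field(raw):
    """Split one byte into (selector, value)."""
    return raw >> 6, raw & 0x3F


def decode_rate(raw):
    """Describe a rate byte such as 0x86 in words."""
    selector, value = split_field(raw)
    rate = RATES_GBPS.get(value)
    if rate is None:
        return f"{SELECTORS[selector]} unknown rate code {value}"
    return f"{SELECTORS[selector]} {rate:g} Gbps"


def decode_mtu(raw):
    """Describe an mtu byte such as 0x84 in words."""
    selector, value = split_field(raw)
    mtu = MTU_BYTES.get(value)
    if mtu is None:
        return f"{SELECTORS[selector]} unknown MTU code {value}"
    return f"{SELECTORS[selector]} {mtu} bytes"


if __name__ == "__main__":
    print("rate 0x86:", decode_rate(0x86))
    print("mtu  0x84:", decode_mtu(0x84))
```

Since all three records carry the same rate and mtu bytes, the decoded values are the same in each scenario.
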
> How can I find out the static rate applied to the multicast QP?
Given the above, I don't see this as a likely suspect, but you should be
able to query the AH used for sending and check the static_rate field in
its ah_attr.
-- Hal
> Regards,
> Marcel