[ofa-general] Multicast Performance

Hal Rosenstock hrosenstock at xsigo.com
Thu May 29 07:49:13 PDT 2008


Hi Marcel,

On Thu, 2008-05-29 at 16:32 +0200, Marcel Heinz wrote:
> Hi,
> 
> Hal Rosenstock wrote:
> > On Thu, 2008-05-29 at 15:35 +0200, Marcel Heinz wrote:
> >>Hal Rosenstock wrote:
> >>>On Thu, 2008-05-29 at 11:19 +0200, Marcel Heinz wrote:
> >>>>Especially if I take into account
> >>>>that with my own benchmark, I can get ~950MB/s when I start another
> >>>>receiver on the same host as the sender. Note that both of the
> >>>>receivers, the local and the remote one, are seeing all packets at that
> >>>>rate, so the HCAs and the switch must be able to handle multicast
> >>>>packets with this throughput.
> >>>
> >>>
> >>>Perhaps this is a static rate issue.
> >>>
> >>>What SM is being used ?
> >>
> >>It's OpenSM 3.1.7. I also ran some tests with OpenSM 3.2.1, but
> >>that didn't change anything.
> > 
> > 
> > Can you validate either the PathRecord or MCMemberRecord returned or the
> > static rate applied to the multicast QP in the various scenarios ? If it
> > is the same, this is not the problem but if it's different then we're on
> > to something here.
> > 
> 
> This is what happened:
> 
> 1. The server on host B is started and creates the MC group, OpenSM
>    returns:
> 
> | May 29 15:54:34 699610 [B6D71B90] 0x08 -> MCMember Record dump:
> | 				MGID....................0xff12000000000000 : 0x00010002deadbeef
> | 				PortGid.................0xfe80000000000000 : 0x0002c9020025abdd
> | 				qkey....................0xABCD
> | 				mlid....................0xC000
> | 				mtu.....................0x84
> | 				TClass..................0x0
> | 				pkey....................0x7FFF
> | 				rate....................0x86
> | 				pkt_life................0x80
> | 				SLFlowLabelHopLimit.....0x0
> | 				ScopeState..............0x21
> | 				ProxyJoin...............0x0
> 
> 2. The client on host A is started and joins the group as
>    SendOnlyNonMember, OpenSM returns:
> 
> | May 29 15:54:45 381972 [B5D6FB90] 0x08 -> MCMember Record dump:
> | 				MGID....................0xff12000000000000 : 0x00010002deadbeef
> | 				PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | 				qkey....................0xABCD
> | 				mlid....................0xC000
> | 				mtu.....................0x84
> | 				TClass..................0x0
> | 				pkey....................0x7FFF
> | 				rate....................0x86
> | 				pkt_life................0x80
> | 				SLFlowLabelHopLimit.....0x0
> | 				ScopeState..............0x4
> | 				ProxyJoin...............0x0
> 
> Now I have 255MB/s between host A and B.
> 
> 3. I start another server on host A, it joins the group and
>    OpenSM returns:
> 
> | May 29 15:54:56 129971 [B6570B90] 0x08 -> MCMember Record dump:
> | 				MGID....................0xff12000000000000 : 0x00010002deadbeef
> | 				PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | 				qkey....................0xABCD
> | 				mlid....................0xC000
> | 				mtu.....................0x84
> | 				TClass..................0x0
> | 				pkey....................0x7FFF
> | 				rate....................0x86
> | 				pkt_life................0x80
> | 				SLFlowLabelHopLimit.....0x0
> | 				ScopeState..............0x25
> | 				ProxyJoin...............0x0
> 
> Now, all 3 instances measure 950MB/s throughput.
> 
> The returned MCMember Records are absolutely identical except
> for the PortGid and the membership state.

Rate 0x86 is selector 2 (exactly) with rate value 6, i.e. exactly 20 Gbps.
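
For anyone wanting to decode the other fields in the dumps the same way, here is a minimal sketch in C. It assumes the usual layout of the 8-bit rate, mtu and pkt_life fields (top 2 bits are the selector, low 6 bits are the encoded value) and the standard IB rate/MTU encodings. The helper names are made up for illustration and are not part of OpenSM or libibverbs.

```c
#include <stdint.h>

/* Top 2 bits of the field: selector. Low 6 bits: encoded value. */
static unsigned field_selector(uint8_t f) { return f >> 6; }
static unsigned field_value(uint8_t f)    { return f & 0x3f; }

/* Selector meaning as assumed here for MCMemberRecord fields. */
static const char *selector_name(unsigned sel)
{
    switch (sel) {
    case 0:  return "greater than";
    case 1:  return "less than";
    case 2:  return "exactly";
    default: return "reserved";
    }
}

/* Encoded rate value -> signalling rate in Gbps, 0.0 if unknown. */
static double rate_gbps(unsigned v)
{
    switch (v) {
    case 2:  return 2.5;
    case 3:  return 10.0;
    case 4:  return 30.0;
    case 5:  return 5.0;
    case 6:  return 20.0;
    case 7:  return 40.0;
    case 8:  return 60.0;
    case 9:  return 80.0;
    case 10: return 120.0;
    default: return 0.0;
    }
}

/* Encoded MTU value -> bytes, 0 if unknown. */
static int mtu_bytes(unsigned v)
{
    switch (v) {
    case 1:  return 256;
    case 2:  return 512;
    case 3:  return 1024;
    case 4:  return 2048;
    case 5:  return 4096;
    default: return 0;
    }
}
```

With the values from the dumps, rate 0x86 gives selector 2 and rate_gbps(6) = 20.0, and mtu 0x84 gives selector 2 and mtu_bytes(4) = 2048.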

> How can I find out the static rate applied to the multicast QP?

Given the above, I don't see this as a likely suspect but you should be
able to query the ah used for sending and look in the ah_attr for
static_rate.
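
Once you have the static_rate value, a rough sketch like the following can help relate it to measured throughput. It assumes static_rate uses the same rate encoding as the SA records (without the selector bits), that 0 means the port's maximum rate (IBV_RATE_MAX in libibverbs), and that the link uses 8b/10b coding as on SDR/DDR. rate_to_mult() here is a local stand-in written for illustration; it only mirrors the idea of expressing a rate as a multiple of 2.5 Gbps.

```c
#include <stdint.h>

/* Encoded rate -> multiple of 2.5 Gbps, -1 for 0 (max rate) or unknown. */
static int rate_to_mult(uint8_t static_rate)
{
    switch (static_rate) {
    case 2:  return 1;   /* 2.5 Gbps */
    case 5:  return 2;   /* 5 Gbps   */
    case 3:  return 4;   /* 10 Gbps  */
    case 6:  return 8;   /* 20 Gbps  */
    case 4:  return 12;  /* 30 Gbps  */
    case 7:  return 16;  /* 40 Gbps  */
    case 8:  return 24;  /* 60 Gbps  */
    case 9:  return 32;  /* 80 Gbps  */
    case 10: return 48;  /* 120 Gbps */
    default: return -1;
    }
}

/*
 * Upper bound on data throughput in MB/s (10^6 bytes) for a given
 * static_rate, assuming 8b/10b coding: 2.5 Gbit/s signalling carries
 * 2.0 Gbit/s of data, i.e. 250 MB/s per multiple. Packet headers are
 * ignored, so real payload rates are somewhat lower.
 * Returns -1 if the rate is 0 (no static limit) or unknown.
 */
static int data_ceiling_mbytes(uint8_t static_rate)
{
    int mult = rate_to_mult(static_rate);

    if (mult < 0)
        return -1;
    return mult * 250;
}
```

For example, data_ceiling_mbytes(6) is 2000 and data_ceiling_mbytes(2) is 250.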

-- Hal

> Regards,
> Marcel



