[ofa-general] Multicast Performance

Marcel Heinz marcel.heinz at informatik.tu-chemnitz.de
Thu May 29 07:32:53 PDT 2008


Hi,

Hal Rosenstock wrote:
> On Thu, 2008-05-29 at 15:35 +0200, Marcel Heinz wrote:
>>Hal Rosenstock wrote:
>>>On Thu, 2008-05-29 at 11:19 +0200, Marcel Heinz wrote:
>>>>Especially if I take into account
>>>>that with my own benchmark, I can get ~950MB/s when I start another
>>>>receiver on the same host as the sender. Note that both of the
>>>>receivers, the local and the remote one, are seeing all packets at that
>>>>rate, so the HCAs and the switch must be able to handle multicast
>>>>packets with this throughput.
>>>
>>>
>>>Perhaps this is a static rate issue.
>>>
>>>What SM is being used ?
>>
>>It's OpenSM 3.1.7. I had also made some tests with OpenSM 3.2.1, but
>>this didn't change anything.
> 
> 
> Can you validate either the PathRecord or MCMemberRecord returned or the
> static rate applied to the multicast QP in the various scenarios ? If it
> is the same, this is not the problem but if it's different then we're on
> to something here.
> 

This is what happened:

1. The server on host B is started and creates the MC group, OpenSM
   returns:

| May 29 15:54:34 699610 [B6D71B90] 0x08 -> MCMember Record dump:
| 				MGID....................0xff12000000000000 : 0x00010002deadbeef
| 				PortGid.................0xfe80000000000000 : 0x0002c9020025abdd
| 				qkey....................0xABCD
| 				mlid....................0xC000
| 				mtu.....................0x84
| 				TClass..................0x0
| 				pkey....................0x7FFF
| 				rate....................0x86
| 				pkt_life................0x80
| 				SLFlowLabelHopLimit.....0x0
| 				ScopeState..............0x21
| 				ProxyJoin...............0x0

2. The client on host A is started and joins the group as
   SendOnlyNonMember, OpenSM returns:

| May 29 15:54:45 381972 [B5D6FB90] 0x08 -> MCMember Record dump:
| 				MGID....................0xff12000000000000 : 0x00010002deadbeef
| 				PortGid.................0xfe80000000000000 : 0x0002c9020025abed
| 				qkey....................0xABCD
| 				mlid....................0xC000
| 				mtu.....................0x84
| 				TClass..................0x0
| 				pkey....................0x7FFF
| 				rate....................0x86
| 				pkt_life................0x80
| 				SLFlowLabelHopLimit.....0x0
| 				ScopeState..............0x4
| 				ProxyJoin...............0x0

Now I have 255MB/s between host A and B.

3. I start another server on host A, it joins the group and
   OpenSM returns:

| May 29 15:54:56 129971 [B6570B90] 0x08 -> MCMember Record dump:
| 				MGID....................0xff12000000000000 : 0x00010002deadbeef
| 				PortGid.................0xfe80000000000000 : 0x0002c9020025abed
| 				qkey....................0xABCD
| 				mlid....................0xC000
| 				mtu.....................0x84
| 				TClass..................0x0
| 				pkey....................0x7FFF
| 				rate....................0x86
| 				pkt_life................0x80
| 				SLFlowLabelHopLimit.....0x0
| 				ScopeState..............0x25
| 				ProxyJoin...............0x0

Now, all 3 instances measure 950MB/s throughput.

The returned MCMember Records are identical except for the PortGid
and the membership state (ScopeState). In particular, mtu and rate
are 0x84 and 0x86 in all three cases. How can I find out the static
rate applied to the multicast QP?
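
For reference, the packed fields in the three dumps above can be
decoded with a few lines of Python. This is only a minimal sketch: the
lookup tables are my reading of the MCMemberRecord layout in the
InfiniBand Architecture Specification (top two bits of mtu, rate and
pkt_life are a selector; ScopeState is a 4-bit scope plus a 4-bit join
state bitmask), and the function names are made up for illustration.
They are not part of OpenSM or any library.

```python
# Assumed encodings, taken from the IBA spec's MCMemberRecord definition.
SELECTOR = {0: "greater than", 1: "less than", 2: "exactly", 3: "largest available"}
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}
JOIN_BITS = {0x1: "FullMember", 0x2: "NonMember", 0x4: "SendOnlyNonMember"}


def split_selector(byte):
    """Split a packed byte into (selector name, 6-bit value)."""
    return SELECTOR[byte >> 6], byte & 0x3F


def decode_scope_state(byte):
    """Split ScopeState into (scope, list of join state names)."""
    scope = byte >> 4
    states = [name for bit, name in JOIN_BITS.items() if byte & bit]
    return scope, states


def decode_record(mtu, rate, scope_state):
    """Decode the mtu, rate and ScopeState bytes of one record dump."""
    mtu_sel, mtu_val = split_selector(mtu)
    rate_sel, rate_val = split_selector(rate)
    scope, states = decode_scope_state(scope_state)
    return {
        "mtu": (mtu_sel, MTU_BYTES[mtu_val]),
        "rate_gbps": (rate_sel, RATE_GBPS[rate_val]),
        "scope": scope,
        "join_state": states,
    }


if __name__ == "__main__":
    # ScopeState values from steps 1, 2 and 3; mtu and rate are the same in all.
    for scope_state in (0x21, 0x04, 0x25):
        print(hex(scope_state), decode_record(0x84, 0x86, scope_state))
```

If these tables are right, all three records ask for exactly 2048 byte
MTU and exactly 20 Gb/s, and only the join state differs: FullMember
in step 1, SendOnlyNonMember in step 2, and both bits set in step 3.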

Regards,
Marcel


