[libfabric-users] fi_sockets remote CQ data ordering

Ezra Kissel ezkissel at indiana.edu
Sat Feb 27 07:21:23 PST 2016


So far so good: I can no longer reproduce the out-of-order event issue.
Thanks for the quick fix.

- ezra

On 2/26/2016 6:51 PM, Jose, Jithin wrote:
> Hi Ezra,
>
> I have the patch to fix this in pr/sockets. The benchmark seems to be running fine on the ARM cluster.
> Can you verify as well?
>
> Thanks,
> - Jithin
>
> -----Original Message-----
> From: Ezra Kissel <ezkissel at indiana.edu>
> Date: Tuesday, February 23, 2016 at 5:50 PM
> To: Jithin Jose <jithin.jose at intel.com>, "libfabric-users at lists.openfabrics.org" <libfabric-users at lists.openfabrics.org>
> Subject: Re: [libfabric-users] fi_sockets remote CQ data ordering
>
>> On 2/23/2016 4:02 PM, Jose, Jithin wrote:
>>> Hi Ezra,
>>>
>>> Does "multiple senders" mean multiple EPs or multiple threads?
>>>
>>> If TX operations are posted to the same EP, then the completions should be in order. It might be a bug if the completions are out-of-order. It would be great if we could get a reproducer for this.
>>>
>>
>> https://github.com/disprosium8/fi_thread_test/blob/master/fi_rma_thread_rcq.c#L124
>>
>> I modified my previous rcq test to check the expected iteration encoded
>> in the immediate data.  The test is currently set to start with 2
>> sending nodes, 4k writes.
>>
>> Ranks  Senders  Bytes       Sync PUT
>> 2      2        4096        fi_rma_thread_rcq: Expecting iter 1809, got 1810 from RCQ
>>
>>
>> So far I have only been able to reproduce this on 2-core ARMv7 nodes,
>> specifically these boards: https://www.parallella.org/board/
>>
>> I'm queuing up a bunch of runs on x86_64 systems to see if it ever fails
>> there, but so far that identical test has been completing without
>> complications. I've tried constraining the core count for each process,
>> etc.  Any chance it's a 32-bit issue?  I unfortunately don't have any i386
>> nodes handy at the moment.
>>
>> I'm testing against
>> https://github.com/jithinjosepkl/libfabric/tree/pr/sockets
>>
>> - ezra
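
The ordering check described in the quoted message (comparing the iteration encoded in the immediate data against the expected one) could be sketched roughly as below. This is not the code from fi_rma_thread_rcq.c: the bit layout (sender rank in the upper 32 bits, iteration in the lower 32), the MAX_SENDERS bound, and all function names are assumptions made for illustration. In a real libfabric program the 64-bit value would presumably come from the `data` field of a `struct fi_cq_data_entry` read from the receiver's CQ.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_SENDERS 64 /* hypothetical upper bound on sending ranks */

/* Hypothetical layout: sender rank in the upper 32 bits, iteration in the lower 32. */
static inline uint64_t imm_encode(uint32_t sender, uint32_t iter)
{
    return ((uint64_t)sender << 32) | (uint64_t)iter;
}

static inline uint32_t imm_sender(uint64_t imm)
{
    return (uint32_t)(imm >> 32);
}

static inline uint32_t imm_iter(uint64_t imm)
{
    return (uint32_t)(imm & 0xffffffffu);
}

/* Tracks the next iteration expected from each sender. */
struct order_checker {
    uint32_t expected[MAX_SENDERS];
};

static void order_checker_init(struct order_checker *c)
{
    memset(c, 0, sizeof(*c));
}

/*
 * Feed one remote-CQ immediate value to the checker.
 * Returns 0 if it carries the expected iteration for its sender,
 * -1 if the sender is out of range or the iteration is out of order.
 * On a mismatch the expected counter is left unchanged.
 */
static int order_checker_observe(struct order_checker *c, uint64_t imm)
{
    uint32_t sender = imm_sender(imm);
    uint32_t iter = imm_iter(imm);

    if (sender >= MAX_SENDERS)
        return -1;

    if (iter != c->expected[sender]) {
        fprintf(stderr, "Expecting iter %u, got %u from RCQ (sender %u)\n",
                (unsigned)c->expected[sender], (unsigned)iter, (unsigned)sender);
        return -1;
    }

    c->expected[sender]++;
    return 0;
}
```

A receiver would call order_checker_init() once and then order_checker_observe() for every remote completion it reads. Keeping one counter per sender means interleaving between different senders is not flagged, only reordering within a single sender's stream.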


