[libfabric-users] sockets provider, number of outstanding reads

Jose, Jithin jithin.jose at intel.com
Tue Feb 9 22:09:21 PST 2016


Hi Ezra,

Thanks for the reproducer; it helped in identifying the issue.
There was an issue in the provider that prevented RMA-ack control messages from being handled when it was flooded with RMA ops.

This pull-request should fix the issue:
https://github.com/ofiwg/libfabric/pull/1734


I tried the benchmark with this fix, and it worked fine for me. Can you verify as well?

Thanks,
- Jithin

-----Original Message-----
From: Ezra Kissel <ezkissel at indiana.edu>
Date: Tuesday, February 9, 2016 at 2:20 PM
To: Jithin Jose <jithin.jose at intel.com>, "Hefty, Sean" <sean.hefty at intel.com>, "libfabric-users at lists.openfabrics.org" <libfabric-users at lists.openfabrics.org>
Subject: Re: [libfabric-users] sockets provider, number of outstanding reads

>I was able to extract enough code from my larger library to reproduce 
>the issue I'm facing, and it's hopefully concise enough for others to 
>debug.
>
>https://github.com/disprosium8/fi_thread_test
>
>You'll need MPI and a couple nodes to test on.
>
>There's no apparent problem when there's a single sender (see the outer
>for loop in the test), but introducing multiple senders, all reading the
>same buffer from each peer, seems to trigger some issue in the provider.
>I'll see if I can narrow down what's actually causing it to spin.
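>
>In rough outline, each sender thread does something like the following
>(a sketch only; the sender_ctx struct and the names are illustrative,
>the actual code is in the repo above):
>
>    #include <pthread.h>
>    #include <rdma/fabric.h>
>    #include <rdma/fi_rma.h>
>    #include <rdma/fi_errno.h>
>
>    /* Illustrative per-thread state; each sender has its own endpoint
>     * and registered local buffer, but all of them target the same
>     * remote buffer (raddr/rkey) on each peer. */
>    struct sender_ctx {
>        struct fid_ep *ep;
>        void *buf, *desc;
>        size_t len;
>        fi_addr_t peer;
>        uint64_t raddr, rkey;
>        int num_reads;
>    };
>
>    static void *sender_thread(void *arg)
>    {
>        struct sender_ctx *ctx = arg;
>        ssize_t ret;
>        int i;
>
>        /* Every thread reads the same remote region at offset 0. */
>        for (i = 0; i < ctx->num_reads; i++) {
>            do {
>                ret = fi_read(ctx->ep, ctx->buf, ctx->len, ctx->desc,
>                              ctx->peer, ctx->raddr, ctx->rkey, NULL);
>            } while (ret == -FI_EAGAIN);
>            if (ret)
>                break;
>        }
>        return NULL;
>    }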
>
>- ezra
>
>On 2/9/2016 1:04 PM, Jose, Jithin wrote:
>> There is no specific limit on the number of outstanding reads in the sockets provider.
>> A reproducer will definitely help here.
>>
>> - Jithin
>>
>> -----Original Message-----
>> From: "Hefty, Sean" <sean.hefty at intel.com>
>> Date: Tuesday, February 9, 2016 at 9:54 AM
>> To: Ezra Kissel <ezkissel at indiana.edu>, "libfabric-users at lists.openfabrics.org" <libfabric-users at lists.openfabrics.org>, Jithin Jose <jithin.jose at intel.com>
>> Subject: RE: [libfabric-users] sockets provider, number of outstanding reads
>>
>>> Copying Jithin, who maintains the sockets provider, in case he's not subscribed to this list.  This sounds like it may be exposing a bug in the provider.  Is the source code of the test available somewhere?
>>>
>>>
>>>> I recently ran into this behavior using the sockets provider in a
>>>> threaded environment, using FI_RMA.  My example attempts to post N
>>>> fi_reads to get a registered source buffer into a registered
>>>> destination buffer.  This example is a simple benchmark, so I'm not
>>>> worried about the integrity of the destination buffer.  Let's say
>>>> N=256: I'll do 256 fi_reads for each message size up to a max of 1MB,
>>>> waiting for N completions after each iteration.  The base pointers
>>>> for all N reads remain the same.
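>>>>
>>>> The per-size loop is roughly the following (a sketch; ep, cq, the
>>>> registered buffers, and the remote address/key are set up elsewhere,
>>>> and the names here are illustrative):
>>>>
>>>>     #include <rdma/fabric.h>
>>>>     #include <rdma/fi_rma.h>
>>>>     #include <rdma/fi_eq.h>
>>>>     #include <rdma/fi_errno.h>
>>>>
>>>>     #define N 256
>>>>
>>>>     static int read_one_size(struct fid_ep *ep, struct fid_cq *cq,
>>>>                              void *buf, void *desc, fi_addr_t peer,
>>>>                              uint64_t raddr, uint64_t rkey,
>>>>                              size_t msg_size)
>>>>     {
>>>>         struct fi_cq_entry comp;
>>>>         ssize_t ret;
>>>>         int i, done = 0;
>>>>
>>>>         /* Post N reads, all from/to the same base pointers. */
>>>>         for (i = 0; i < N; i++) {
>>>>             do {
>>>>                 ret = fi_read(ep, buf, msg_size, desc, peer,
>>>>                               raddr, rkey, NULL);
>>>>             } while (ret == -FI_EAGAIN);
>>>>             if (ret)
>>>>                 return (int) ret;
>>>>         }
>>>>
>>>>         /* Reap all N local completions before the next size. */
>>>>         while (done < N) {
>>>>             ret = fi_cq_read(cq, &comp, 1);
>>>>             if (ret == 1)
>>>>                 done++;
>>>>             else if (ret != -FI_EAGAIN)
>>>>                 return (int) ret;
>>>>         }
>>>>         return 0;
>>>>     }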
>>>>
>>>> This works great up to around 256K size reads.  After that point I will
>>>> no longer get local completions and the example hangs.  There are two
>>>> things I can do to get it to work reliably:
>>>>
>>>> 1) Simply increase the destination buffer size to msg_size*N; issuing
>>>> reads at the same base offset then works.  Incrementing the
>>>> destination offset by msg_size for each of the N reads also works in
>>>> this case, as expected.
>>>>
>>>> 2) Limit the posting rate of reads for larger sizes.  I check to make
>>>> sure I don't exceed fi_tx_size_left(), but I have to throttle the rate
>>>> significantly to ensure progress; it was stable on my system at around
>>>> 32MB "in-flight".
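>>>>
>>>> The throttled variant looks roughly like this (again a sketch, reusing
>>>> the illustrative names from above; MAX_INFLIGHT_BYTES is just the
>>>> ~32MB value that happened to be stable here, not a libfabric
>>>> constant):
>>>>
>>>>     #define MAX_INFLIGHT_BYTES (32UL * 1024 * 1024)
>>>>
>>>>     size_t inflight = 0;
>>>>     int posted = 0;
>>>>     struct fi_cq_entry comp;
>>>>
>>>>     while (posted < N) {
>>>>         /* Drain completions first so in-flight bytes can shrink. */
>>>>         while (fi_cq_read(cq, &comp, 1) == 1)
>>>>             inflight -= msg_size;
>>>>
>>>>         /* Throttle: respect both the queue depth and a byte cap. */
>>>>         if (fi_tx_size_left(ep) < 1 ||
>>>>             inflight + msg_size > MAX_INFLIGHT_BYTES)
>>>>             continue;
>>>>
>>>>         if (fi_read(ep, buf, msg_size, desc, peer,
>>>>                     raddr, rkey, NULL) == 0) {
>>>>             inflight += msg_size;
>>>>             posted++;
>>>>         }
>>>>     }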
>>>>
>>>> I don't see this behavior with the psm provider, or with fi_writes.
>>>> The question is: is there some read limit I need to observe that I'm
>>>> missing, or is there some issue with lots of outstanding socket reads?
>>>

