[libfabric-users] CQ permission denied (-EACCES)

Jose, Jithin jithin.jose at intel.com
Thu Feb 11 09:15:59 PST 2016


-----Original Message-----
From: Ezra Kissel <ezkissel at indiana.edu>
Date: Thursday, February 11, 2016 at 8:04 AM
To: Jithin Jose <jithin.jose at intel.com>
Cc: "Hefty, Sean" <sean.hefty at intel.com>
Subject: Re: [libfabric-users] CQ permission denied (-EACCES)

>On 2/11/2016 3:11 AM, Jose, Jithin wrote:
>>>
>>> https://github.com/disprosium8/fi_thread_test/blob/master/fi_rma_thread_rcq.c
>>>
>>> I did notice that when the above test attempts a TX op targeting a
>>> remote address beyond the allocated buffer, the test simply hangs
>>> instead of any error being generated.  You can recreate this by reducing
>>> NRBUFS and commenting out the assert on line 205.  If I blatantly
>>> increase the remote address offset at the start, I get the expected
>>> permission denied CQ error, but somewhere along the line the mr bounds
>>> checking seems to fail, or else that bad CQ event never gets popped.
>>
>> I think I found the issue here - there was a race while trying to read and discard the incoming RMA packet, which had incorrect access permissions.
>>
>> The fix is in the same branch: https://github.com/jithinjosepkl/libfabric/tree/pr/sockets.
>> Now, the error entries are popping up at the TX side. However, the test exits with -1 in this case.
>>
>> Can you give it a try when you get a chance?
>>
>
>That fixes it for me as well.

Good to know that it fixes the issues. I will get these into the mainline.
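
For anyone following the thread, the kind of bounds check that should reject an RMA write targeting an address beyond the registered buffer can be sketched roughly as below. This is only an illustration and not the sockets provider code: `struct mr_desc` and `mr_check_write` are hypothetical names, and a real provider tracks more state (keys, access flags, offsets). It only shows why an out-of-range target should come back as -EACCES instead of hanging.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for a registered memory region.
 * This is NOT a libfabric type; it exists only for this sketch. */
struct mr_desc {
    uint64_t base;     /* first valid remote address */
    uint64_t len;      /* size of the region in bytes */
    int      writable; /* nonzero if remote writes are permitted */
};

/* Return 0 if [addr, addr + len) lies entirely inside the region and
 * remote writes are permitted; otherwise return -EACCES.
 * The comparisons are ordered so that addr + len is never computed,
 * which avoids unsigned wrap-around for very large addresses. */
static int mr_check_write(const struct mr_desc *mr, uint64_t addr, uint64_t len)
{
    if (!mr->writable)
        return -EACCES;
    if (addr < mr->base)
        return -EACCES;

    uint64_t off = addr - mr->base;
    if (off > mr->len || len > mr->len - off)
        return -EACCES;

    return 0;
}
```

With a 4096-byte region, a write that starts inside the buffer but runs past its end is rejected in the same way as one that starts entirely outside it, which corresponds to the case the test hits once NRBUFS is reduced.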


>
>> Btw, thanks for these tests, which are helping to fix these corner-case issues. :)
>
>Glad to help.  Need to do similar for the other backends yet but would 
>like to incorporate these tests into a general framework so they're not 
>as ad hoc.

Sure, that will help.

>

