[openib-general] IBV_WC_LOC_PROT_ERR on RDMA read into huge paged buffers acquired with shmget
Dotan Barak
dotanb at dev.mellanox.co.il
Tue Nov 21 23:34:33 PST 2006
Hi Bub.
Bub Thomas wrote:
> Our application uses huge-page buffers (allocated with shmget) in order
> to get the highest transfer speeds.
>
> The buffers (>= 16 MByte) each span multiple 2 MByte pages.
>
> In cases where my read buffer is smaller than a certain size, let’s say
> 128 MByte, I get IBV_WC_LOC_PROT_ERR in the completion queue.
>
> This happens for any RDMA transmission size.
>
> The only visible difference in user space between the buffers that
> fail and the ones that work seems to be their address returned from
> the shmat call.
>
> The ones that work (>= 128 MByte) are in the range around 0x75E00000;
> the ones that fail (< 128 MByte) are in the range around 0xFC800000.
>
> I’m using OFED 1.1 here.
>
> Any help/idea welcome.
>
> Thomas
>
> P.S.: On Mellanox gen1 I already discovered a bug where only the first
> page of the buffer was filled correctly. This was fixed in a patch for
> IBGD-1.8.2
>
> P.P.S.: To register my buffers I’m using:
>
> accessFlags = (ibv_access_flags)(IBV_ACCESS_LOCAL_WRITE |
>                                  IBV_ACCESS_REMOTE_WRITE |
>                                  IBV_ACCESS_REMOTE_READ |
>                                  IBV_ACCESS_REMOTE_ATOMIC |
>                                  IBV_ACCESS_MW_BIND);
>
> mr = ibv_reg_mr(_pd, (void*)(uintptr_t)bufferPtr, size, accessFlags);
>
Are you using a 32-bit or a 64-bit machine?
Which Linux distro are you using?
I saw a similar issue in the past with addresses like those (where the MSB of
the address is set).
In IB we use 64-bit addresses, so maybe you are extending the address of the
buffer through a signed variable, which sign-extends it.
For example, it should be:
sg.addr = (unsigned long)buf
and not
sg.addr = (long)buf
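To illustrate the suspected failure, here is a minimal sketch in C. The helper names addr_via_signed and addr_via_unsigned are hypothetical, and the sketch assumes a 32-bit user-space pointer value and the usual two's-complement conversion behavior. It is not taken from your code.

```c
#include <stdint.h>

/* Hypothetical helper: widens a 32-bit address through a SIGNED type.
 * If the MSB is set, the value is sign-extended, so the upper 32 bits
 * of the 64-bit result become all ones. */
uint64_t addr_via_signed(uint32_t va)
{
    return (uint64_t)(int32_t)va;
}

/* Hypothetical helper: widens a 32-bit address through an UNSIGNED type.
 * The value is zero-extended, so the upper 32 bits stay zero. */
uint64_t addr_via_unsigned(uint32_t va)
{
    return (uint64_t)va;
}
```

With va = 0xFC800000 the signed path gives 0xFFFFFFFFFC800000, while the unsigned path gives 0x00000000FC800000. With va = 0x75E00000 both give the same value, which would match the pattern of working and failing buffers. Casting through uintptr_t (sg.addr = (uintptr_t)buf) should avoid the problem on both 32-bit and 64-bit builds.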
thanks
Dotan