[ewg] Mellanox target workaround in SRP

David Dillow dillowda at ornl.gov
Mon Jan 10 11:02:43 PST 2011


On Mon, 2011-01-10 at 10:21 -0800, Vu Pham wrote:
> David Dillow wrote:
> > On Fri, 2011-01-07 at 20:05 -0800, Roland Dreier wrote:
> >> looking at the patch, I would guess that the corruption occurred when
> >> the target got an IO request that started at a non-page-aligned address
> >> but that spanned more than one page.
[snip]
> > Here's hoping someone from Mellanox can shed some light.
> 
> 
> I think that the patch is specific to the SRP initiator using Mellanox
> FMR. It tried to avoid indirect descriptors with a Mellanox FMR having
> first_byte_offset != 0.
> Since the low-level implementation of mlx4/mthca_map_phys_fmr() did
> not create and set up an MPT for an FMR with first_byte_offset != 0,
> the corruption can happen with any target.

Thanks for taking a look, Vu -- but I'm not sure that is the problem,
either. The SRP FMR mapping code is careful to mask the SG address with
the FMR page mask, so we should never ask the HCA to map a page with a
first_byte_offset != 0. Instead, we tell the target to request an IO
virtual address appropriately offset into the first page of the FMR.
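
To make the arithmetic concrete, here is a rough sketch in plain C. It is
not the actual ib_srp code: FMR_PAGE_SIZE, struct fmr_sketch, and
fmr_sketch_map() are made-up names for illustration, and the 4 KB FMR
page size is an assumption (the real value comes from the device).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed FMR page size for illustration only. */
#define FMR_PAGE_SIZE 4096ULL
#define FMR_PAGE_MASK (~(FMR_PAGE_SIZE - 1))

/* Hypothetical result type: what the HCA sees vs. what the target sees. */
struct fmr_sketch {
	uint64_t hca_page;   /* page-aligned address handed to the HCA */
	uint64_t target_va;  /* IO virtual address advertised to the target */
};

/*
 * Hypothetical helper: split an SG DMA address into the page-aligned
 * part used for the FMR mapping and the in-page offset that is folded
 * into the address given to the target.
 */
static struct fmr_sketch fmr_sketch_map(uint64_t sg_dma_addr, uint64_t fmr_iova)
{
	struct fmr_sketch m;

	/* The FMR itself starts on a page boundary in this sketch. */
	assert((fmr_iova & ~FMR_PAGE_MASK) == 0);

	/* Masking drops the low bits, so the HCA never sees an offset. */
	m.hca_page = sg_dma_addr & FMR_PAGE_MASK;

	/* The offset travels in the virtual address instead. */
	m.target_va = fmr_iova + (sg_dma_addr & ~FMR_PAGE_MASK);

	return m;
}
```

With an SG address of 0x12345678 and an FMR IO virtual address of
0x100000, the HCA would be asked to map the page at 0x12345000, and the
target would be told to start at 0x100678.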

Or perhaps I misunderstood you, and it's the non-zero first byte offset
in the RDMA command on the wire that is the issue, and not the FMR setup
in the initiator? And it only affects FMR-mapped memory, not the
kernel's MR?

Thanks again,
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



