[ewg] Mellanox target workaround in SRP
Vu Pham
vuhuong at mellanox.com
Mon Jan 10 10:21:07 PST 2011
David Dillow wrote:
> On Fri, 2011-01-07 at 20:05 -0800, Roland Dreier wrote:
>> > I'm sure this was tested and shown to fix the problem; I'm just confused
>> > as to what the problem really was and if this is still relevant. Can
>> > someone please enlighten me?
>>
>> At this point I'm afraid it's all lost in the mists of time,
>
> Yep, that's my fear. And since it is a corruption bug, I've got to tread
> lightly in this area. :/
>
I don't recall discussing or reviewing this patch with Michael Tsirkin when he submitted it.
>> looking at the patch, I would guess that the corruption occurred when
>> the target got an IO request that started at a non-page-aligned address
>> but that spanned more than one page.
>
> That's my thought as well, but then I'm not sure this really solved
> their problem. It may be more likely to occur in the FMR case, but the
> initiator enables clustering, so blk_rq_map_sg() could generate the same
> kinds of requests for both direct and indirect descriptors, even without
> FMR. This looks to have been true since the initiator was added to the
> kernel, though it is possible I'm misreading the code.
>
>> I don't know if the target was ever fixed, or whether that target code
>> has any relevance today.
>
> Here's hoping someone from Mellanox can shed some light.
I think that the patch is specific to the SRP initiator using Mellanox FMR. It tried to avoid indirect descriptors with a Mellanox FMR having first_byte_offset != 0,
since the low-level implementation of mlx4/mthca_map_phys_fmr() did not create and set up the MPT for an FMR with first_byte_offset != 0. The corruption can happen with any target.
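To make the first_byte_offset case concrete, here is a minimal user-space sketch in plain C. It is not the kernel or driver code; struct fmr_layout, fmr_layout_for() and fmr_needs_offset() are hypothetical names made up for illustration. It only shows how a buffer that starts at a non-page-aligned address splits into a page-aligned base, a first-byte offset, and a page count:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one page-based mapping; not a kernel struct. */
struct fmr_layout {
    uint64_t first_page;        /* page-aligned address of the first page */
    uint64_t first_byte_offset; /* where the data starts inside that page */
    size_t   npages;            /* pages needed to cover the whole buffer */
};

/* Split [addr, addr + len) into page-aligned pieces.
 * page_size must be a power of two and len must be non-zero. */
static struct fmr_layout fmr_layout_for(uint64_t addr, uint64_t len,
                                        uint64_t page_size)
{
    struct fmr_layout l;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(len > 0);

    l.first_page = addr & ~(page_size - 1);
    l.first_byte_offset = addr & (page_size - 1);
    /* Round up so that a partial last page is still counted. */
    l.npages = (size_t)((l.first_byte_offset + len + page_size - 1) / page_size);
    return l;
}

/* Non-zero when the mapping would need first_byte_offset != 0,
 * i.e. the case the workaround tried to avoid. */
static int fmr_needs_offset(const struct fmr_layout *l)
{
    return l->first_byte_offset != 0;
}
```

For example, a 4096-byte buffer at address 0x10200 with 4 KB pages gives a first-byte offset of 0x200 and spans two pages, which is the kind of request discussed above; the same buffer at 0x20000 has offset 0 and would not be affected.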
-vu