[ewg] Mellanox target workaround in SRP

David Dillow dillowda at ornl.gov
Fri Jan 7 14:35:17 PST 2011


I have a question regarding the workaround introduced in commit 559ce8f1 of
the mainline tree:

    IB/srp: Work around data corruption bug on Mellanox targets
    
    Data corruption has been seen with Mellanox SRP targets when FMRs
    create a memory region with I/O virtual address != 0.  Add a
    workaround that disables FMR merging for Mellanox targets (OUI 0002c9).

I don't see how this can make a difference to the target -- it sees an
address and length, and there should be no visible difference to it when
it gets an FMR versus a direct-mapped region of the same space, right?
And how is it different from getting a direct or indirect descriptor
with a similar offset?

I could see there being a bug on the initiator HCA not liking such FMR
mappings, but then it should be keyed off of the vendor of our HCA and
not the target.
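
For illustration, a vendor check keyed on an OUI could look roughly like
the sketch below. This is a minimal standalone example, not the code from
the commit: it assumes the GUID is held as 8 bytes in network byte order
with the OUI in the first three bytes, and guid_has_mellanox_oui() is a
hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from ib_srp): returns true if the first
 * three bytes of an 8-byte GUID, stored in network byte order, match the
 * Mellanox OUI 00:02:c9 named in the commit message.
 */
static bool guid_has_mellanox_oui(const uint8_t guid[8])
{
    static const uint8_t mellanox_oui[3] = { 0x00, 0x02, 0xc9 };

    return memcmp(guid, mellanox_oui, sizeof mellanox_oui) == 0;
}
```

The same comparison works on any GUID, so whether it is fed the target's
identifier or the local HCA's is purely a choice made by the caller.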

I'm sure this was tested and shown to fix the problem; I'm just confused
as to what the problem really was and whether the workaround is still
relevant. Can someone please enlighten me?
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office




