[ewg] Re: [ofa-general] iser/lustre memfree issues
    Or Gerlitz 
    ogerlitz at voltaire.com
       
    Wed Apr 11 03:39:57 PDT 2007
    
    
  
Roland Dreier wrote:
>  > 472 Data corruption with Lustre+OFED when using FMR on memfree HCAs
>  > 
>  > We also see it with iser, basically only on SCSI --read--, which from
>  > the IB perspective is an RDMA write from the target to the initiator.
>  > 
>  > The environment where we see it is Sinai (25204), hw_ver=A0, fw_ver=1.2.0.
>  > 
>  > Ishai did not manage to reproduce it with SRP, but the fact that it
>  > reproduces with two independent ULPs makes it a blocker, I think.
> 
> We definitely need more info here.  Why are you confident that the two
> problems are the same bug?
> 
> Have you tested with mem-free Arbel, and does the problem occur there
> too?  Or have you only tested Sinai?  Does the problem go away if you
> remove the MTHCA_FLAG_SINAI_OPT flag from the mthca_hca_table[] entry
> in mthca_main.c?
Hi Roland,
We don't have a mem-free Arbel here. However, your suggestion to remove the 
MTHCA_FLAG_SINAI_OPT flag from the mthca_hca_table[] entry in 
mthca_main.c seems to provide a workaround (and hopefully a direction 
for solving the problem): the test has been running for two hours now 
without reproducing the corruption. I will leave it running overnight 
and let you know.
Do you have any idea why the code breaks with MTHCA_FLAG_SINAI_OPT?
Thanks again,
Or.
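
For anyone following along, the workaround being tested amounts to roughly
the following. This is only a minimal, self-contained sketch and not the
driver source: the bit values, the trimmed-down struct hca_entry, the two
entry names and the helper functions are all hypothetical, and the exact
flag set of the stock Sinai entry is assumed. Only MTHCA_FLAG_SINAI_OPT and
the fact that it is dropped from the Sinai entry come from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only; the real values live in
 * the driver's headers and are not reproduced here. */
enum {
    MTHCA_FLAG_MEMFREE   = 1 << 0,
    MTHCA_FLAG_PCIE      = 1 << 1,
    MTHCA_FLAG_SINAI_OPT = 1 << 2
};

/* Hypothetical, trimmed-down stand-in for one mthca_hca_table[] entry. */
struct hca_entry {
    uint32_t flags;
};

/* Assumed stock Sinai entry: mem-free, with the Sinai optimization enabled. */
static const struct hca_entry sinai_stock = {
    .flags = MTHCA_FLAG_MEMFREE | MTHCA_FLAG_PCIE | MTHCA_FLAG_SINAI_OPT
};

/* Workaround under test: the same entry with MTHCA_FLAG_SINAI_OPT dropped. */
static const struct hca_entry sinai_workaround = {
    .flags = MTHCA_FLAG_MEMFREE | MTHCA_FLAG_PCIE
};

/* Returns 1 if the entry would take the Sinai-optimized path, else 0. */
static int uses_sinai_opt(const struct hca_entry *e)
{
    return (e->flags & MTHCA_FLAG_SINAI_OPT) != 0;
}

/* Sanity check: only the optimization bit differs between the two entries. */
static void check_workaround(void)
{
    assert(uses_sinai_opt(&sinai_stock));
    assert(!uses_sinai_opt(&sinai_workaround));
    /* The HCA is still treated as mem-free either way. */
    assert(sinai_workaround.flags & MTHCA_FLAG_MEMFREE);
}
```

The point of the sketch is that the HCA stays in mem-free mode; only the
Sinai-specific optimization is switched off, which should help narrow down
where the corruption comes from.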
    
    