[ofa-general] iSER data corruption issues
Tom Tucker
tom at opengridcomputing.com
Thu Oct 4 10:44:57 PDT 2007
On Thu, 2007-10-04 at 12:14 -0400, Pete Wyckoff wrote:
> rdreier at cisco.com wrote on Wed, 03 Oct 2007 15:01 -0700:
> > > Machines are Opteron, Fedora 7 up-to-date with its OpenFabrics libs,
> > > kernel 2.6.23-rc6 on target. Either 2.6.23-rc6 or 2.6.22 or
> > > 2.6.18-rhel5 on initiator. For some reason, it is much easier to
> > > reproduce with the rhel5 kernel.
> >
> > There was a bug in mthca that caused data corruption with FMRs on
> > Sinai (1-port PCIe) HCAs. It was fixed in commit 608d8268 ("IB/mthca:
> > Fix data corruption after FMR unmap on Sinai") which went in shortly
> > before 2.6.21 was released. I don't know if the RHEL5 2.6.18 kernel
> > has this fix or not -- but if you still see the problem on 2.6.22 and
> > later kernels then this isn't the fix anyway.
>
> This is definitely it. Same test setup runs for an hour with this
> patch, but fails in tens of seconds without it. Thanks for pointing
> it out.
>
> This rhel5 kernel is 2.6.18-8.1.6. Perhaps there are newer ones
> about that have this critical patch included. I'm going to add a
> Big Fat Warning on the iSER distribution about pre-2.6.21 kernels.
> This rhel5 kernel also crashes if the iSER connection drops in a
> certain easy-to-reproduce way, another reason to avoid it.
>
> Regarding the "larger" test I talked about that fails even on modern
> kernels, I'm still not able to reproduce that on my setup. I ran it
> literally all night with a hacked target that calculated the return
> buffer rather than accessing the disk. For now I'm calling that a
> separate bug and will investigate it further.
>
> Thanks to Tom and Tom for helping debug this.
>
Thanks to Roland who actually knew what it was ... ;-)
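
For the warning about pre-2.6.21 kernels, something along these lines
could flag a suspect initiator. It is only a rough sketch: the helper
names (parse_kernel_release, predates_fmr_fix) are made up for
illustration, and a bare version comparison is a heuristic at best,
since a distro kernel such as the rhel5 2.6.18 may or may not carry a
backport of the mthca fix.

```python
import platform
import re

# Per the thread: the mthca FMR fix went in shortly before 2.6.21 was released.
FIXED_IN = (2, 6, 21)


def parse_kernel_release(release):
    """Return the leading numeric part of a release string as a 3-tuple of ints.

    Hypothetical helper: "2.6.18-8.1.6" -> (2, 6, 18), "2.6.23-rc6" -> (2, 6, 23).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) if part else 0 for part in match.groups())


def predates_fmr_fix(release):
    """True if the upstream version number is older than 2.6.21.

    Heuristic only: it cannot tell whether a vendor kernel has the
    fix backported.
    """
    return parse_kernel_release(release) < FIXED_IN


if __name__ == "__main__":
    release = platform.release()
    try:
        if predates_fmr_fix(release):
            print("WARNING: kernel %s predates 2.6.21; FMR data corruption "
                  "is possible on Sinai HCAs unless the fix was backported"
                  % release)
        else:
            print("kernel %s is 2.6.21 or later" % release)
    except ValueError as err:
        print("could not check kernel version: %s" % err)
```

Running it on the initiator prints a warning for something like
2.6.18-8.1.6 and nothing alarming for 2.6.22 or 2.6.23-rc6.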
> -- Pete