[ofa-general] Re: When is the next planned release of libmlx4?

Doug Ledford dledford at redhat.com
Tue Jun 16 06:25:43 PDT 2009


On Fri, 2009-06-12 at 21:22 -0700, Roland Dreier wrote:
>  > Valgrind replaces the libc memcpy call with a simple version that
>  > copies a byte at a time (in order).  If libmlx4 is not built with
>  > --with-valgrind, valgrind considers each write an invalid write and
>  > spends a very long time after each write updating its error database.
>  > We experimented with replacing the Valgrind error database update
>  > with a configurable spin loop and found that if we put a delay of
>  > around 100,000 cycles between writes in the 'byte memcpy' when
>  > writing to the blueflame page, a sent message gets
>  > lost/misplaced in a simple testcase with two MPI_barriers back to
>  > back (resulting in a hang because not all processes exit the first
>  > barrier).  Our theory is that the card sees 'byte' writes to the
>  > blueflame page and, due to the long delay, uses the information
>  > before it is all written out (and thus gets wrong info).
> 
> That makes sense.  The HW documentation says that blueflame writes must
> be done in aligned chunks of at least 4 bytes, so it's not surprising
> that byte writes confuse the HW in some cases.
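
As a minimal sketch of the constraint quoted above (not libmlx4's actual
code; copy_words32 is a hypothetical name, and the choice of 32-bit stores
is an assumption based only on the "at least 4 bytes" wording), a copy
routine that never issues byte-sized stores could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from libmlx4: copy `bytes` bytes to `dst`
 * using only aligned 32-bit stores, so the destination never sees a
 * byte-sized write.  `dst` is volatile so the compiler emits one store per
 * word instead of substituting a memcpy call.
 */
static void copy_words32(volatile uint32_t *dst, const uint32_t *src,
                         size_t bytes)
{
    /* Both the destination address and the length must be 4-byte multiples. */
    assert(((uintptr_t)dst % sizeof(uint32_t)) == 0);
    assert(bytes % sizeof(uint32_t) == 0);

    for (size_t i = 0; i < bytes / sizeof(uint32_t); i++)
        dst[i] = src[i];
}
```

A caller would pass the mapped page as dst and the prepared descriptor as
src.  Real hardware would presumably also need the appropriate write
barriers around the copy, which this sketch leaves out.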

Hi Roland.  I'm removing the XRC patch from our kernel like we discussed
(for the other people on the list, it's because the XRC API is likely to
change by the time integration into the official packages is done, and
we don't want to ship one API for the support now and then a different
API later).  This also means that I need to respin libibverbs and all
driver packages.  However, my deadline is *very* tight for getting this
done.  Any chance I could talk you into rushing the release?  Today
would be best, but tomorrow would work too.

-- 
Doug Ledford <dledford at redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
