[ofa-general] Re: When is the next planned release of libmlx4?

Roland Dreier rdreier at cisco.com
Fri Jun 12 21:24:42 PDT 2009


 > I'm not sure what the chip's expectation is for the actual bus
 > transfers in this area, but I think you are right to be concerned
 > about atomicity, even when transferring based on longs.

The chip docs seem to suggest that we're OK as long as we do 4-byte
writes aligned to 4 bytes.
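
As a rough sketch of what "4-byte writes aligned to 4 bytes" could look
like in C (copy_dwords is a hypothetical helper made up for this
example, not a libmlx4 function, and the destination here is assumed to
be an already-mapped I/O page):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `bytes` bytes to a mapped I/O region using
 * only 4-byte stores to 4-byte-aligned addresses. */
static void copy_dwords(volatile uint32_t *dst, const uint32_t *src,
                        size_t bytes)
{
    assert(((uintptr_t)dst & 3) == 0);   /* destination is 4-byte aligned */
    assert((bytes & 3) == 0);            /* whole number of dwords */

    /* The volatile qualifier keeps the compiler from eliding or
     * combining the individual 32-bit stores. */
    for (size_t i = 0; i < bytes / 4; i++)
        dst[i] = src[i];
}
```

Whether the bus actually sees these as separate 4-byte transactions
depends on how the page is mapped, so treat this as an illustration of
the store pattern only.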

 > It is worth looking at using SSE instructions to burst transfer the
 > entire message in one atomic go.

I'm not aware of any SSE instructions that work on chunks bigger than 16
bytes at a time.
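
To illustrate the 16-byte limit: a minimal sketch, assuming SSE2 and a
64-byte message (the size is picked only for the example, and
copy64_sse2 is a hypothetical name).  Even with SSE the copy is four
separate 16-byte stores, not one atomic burst:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy 64 bytes in 16-byte chunks. */
static void copy64_sse2(void *dst, const void *src)
{
    /* _mm_store_si128 requires a 16-byte-aligned destination. */
    assert(((uintptr_t)dst & 15) == 0);
#if defined(__SSE2__)
    const __m128i *s = (const __m128i *)src;
    __m128i *d = (__m128i *)dst;

    /* Four independent 16-byte stores; nothing ties them together. */
    for (int i = 0; i < 4; i++)
        _mm_store_si128(d + i, _mm_loadu_si128(s + i));
#else
    /* Fallback for builds without SSE2, so the sketch still compiles. */
    memcpy(dst, src, 64);
#endif
}
```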

In fact the latest mlx4 kernel driver maps the blueflame page to
userspace with write-combining enabled, and this improves performance
quite a bit.  The HCA doesn't care in what order the CPU drains the WC
buffer (according to the docs, at least).
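
For what it's worth, a userspace copy into a write-combining mapping
might look roughly like the sketch below.  wc_flush and wc_copy are
hypothetical names for this example; the only assumption about the
hardware is that a store fence (SFENCE on x86) pushes out whatever is
still sitting in the WC buffers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: make earlier write-combined stores drain. */
static void wc_flush(void)
{
#if defined(__SSE__)
    _mm_sfence();            /* store fence on x86 */
#else
    __sync_synchronize();    /* full barrier as a generic fallback */
#endif
}

/* Hypothetical helper: fill a WC-mapped region, then flush once. */
static void wc_copy(volatile uint32_t *dst, const uint32_t *src,
                    size_t bytes)
{
    assert((bytes & 3) == 0);

    /* The CPU may combine these stores and drain them in any order. */
    for (size_t i = 0; i < bytes / 4; i++)
        dst[i] = src[i];

    wc_flush();
}
```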

 - R.


