[openib-general] mwrite64 - need for uar object in access layer
roland at topspin.com
Tue Sep 21 07:55:58 PDT 2004
Troy> How does just using the floating point unit compare to the
Troy> SSE codepath? In a past life I had to get a flash driver for
Troy> a 32 bit PPC board working that *had* to have 64 bit access
Troy> to flash.
It's a pretty huge loss, because saving/restoring the FPU state
requires writing/reading something like 170 bytes. With SSE we can
just save the 8 bytes of XMM register that we actually use. Even so,
I'm not convinced SSE is a win over just using a lock, because saving
CR0 is so expensive.
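
To make the "just using a lock" alternative concrete, here is a rough
sketch of what I have in mind, written as plain userspace C11 rather
than real driver code. The names reg_lock and write64_locked are made
up for illustration, and the word order (low half first) is an
assumption about the register layout, not something taken from any
particular device.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock guarding one 64-bit device register. */
static atomic_flag reg_lock = ATOMIC_FLAG_INIT;

/*
 * Write a 64-bit value as two 32-bit stores while holding a spinlock,
 * so that no other caller of this function can interleave its halves.
 * Assumption: reg[0] takes the low 32 bits and reg[1] the high 32 bits.
 */
static void write64_locked(volatile uint32_t *reg, uint64_t val)
{
    while (atomic_flag_test_and_set_explicit(&reg_lock, memory_order_acquire))
        ; /* spin until the lock is free */

    reg[0] = (uint32_t)val;
    reg[1] = (uint32_t)(val >> 32);

    atomic_flag_clear_explicit(&reg_lock, memory_order_release);
}
```

Note that the lock only keeps two writers from interleaving their
halves; the target still sees two separate 32-bit stores, unlike the
single 64-bit store that the SSE path produces.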
As I said, I'd be curious to see benchmarks of other approaches. I
think there's definitely room for improvement if someone is interested
in working on this.