[openib-general] Re: Re: Userspace testing results (many kernels, many svn trees)
Roland Dreier
rdreier at cisco.com
Mon Jan 23 21:52:46 PST 2006
Michael> Could the high/low bits be swapped? What happens if you
Michael> change cycles_t from long long to long? Could you try
Michael> running the clock_test utility?
What seems to be happening is that mftb is giving the low 32 bits of
the timebase (as expected on ppc32). Since your get_cycles() is
returning a long long, those 32 bits get put in the most significant
32 bits of the return value, and the low 32 bits are garbage (ppc is
big endian).
If I compile clock_test for ppc32, I see that get_cycles() compiles to:
1000064c <get_cycles>:
1000064c: 7c 6c 42 e6 mftb r3
10000650: 4e 80 00 20 blr
For comparison, a function like
unsigned long long blah(void) { return 0x100000002ull; }
compiles to
00000000 <blah>:
0: 38 60 00 01 li r3,1
4: 38 80 00 02 li r4,2
8: 4e 80 00 20 blr
In other words the convention on ppc32 is that unsigned long long
return values have the high 32 bits in r3 and the low 32 bits in r4.
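To make the register convention concrete, here is a small sketch in plain C that models the r3/r4 pair as two 32-bit words. The name combine_r3_r4 is made up for illustration and is not part of clock_test or perftest.

```c
#include <stdint.h>

/* Hypothetical model of a ppc32 64-bit return value:
 * r3 carries the high 32 bits, r4 the low 32 bits. */
static uint64_t combine_r3_r4(uint32_t r3, uint32_t r4)
{
    return ((uint64_t) r3 << 32) | r4;
}
```

With r3 = 1 and r4 = 2 this gives 0x100000002, matching blah() above. A lone "mftb r3" therefore lands the low word of the timebase in the upper half of the result, and the lower half is whatever was left in r4.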
I think you want to use something like
    typedef unsigned long long cycles_t;

    static inline cycles_t get_cycles(void)
    {
        unsigned long low, hi, hi2;

        /* Re-read the upper half until it is stable, so that a carry
           from the low word between reads cannot corrupt the result. */
        do {
            asm volatile ("mftbu %0" : "=r" (hi));
            asm volatile ("mftb %0" : "=r" (low));
            asm volatile ("mftbu %0" : "=r" (hi2));
        } while (hi != hi2);

        return ((unsigned long long) hi << 32) | low;
    }
for ppc32. However, this is not quite enough to make things work on
all powerpc systems, because the timebase does not necessarily run at
the same speed as the CPU. For example, on an IBM JS20 blade,
clock_test prints
1 sec = 6536.8 usec
1 sec = 6537.05 usec
(as both a 32-bit and a 64-bit executable) because, as /proc/cpuinfo shows:
processor : 0
cpu : PPC970FX, altivec supported
clock : 2194.624509MHz
revision : 3.0
processor : 1
cpu : PPC970FX, altivec supported
clock : 2194.624509MHz
revision : 3.0
timebase : 14318000
machine : CHRP IBM,8842-P2C
the timebase runs at about 14.3 MHz, or roughly 153 times slower than
the CPU clock.
I'm not sure how you want to fix this in perftest.
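One possible direction, as a rough sketch only: take the timebase frequency from the "timebase" line of /proc/cpuinfo and scale tick counts by it instead of by the CPU clock. The helper names parse_timebase_hz and ticks_to_usec are hypothetical and are not taken from perftest. The parser works on a text buffer so that the caller decides how to read the file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan /proc/cpuinfo-style text line by line for
 * "timebase : N" and return N in Hz, or 0 if no such line is found. */
static unsigned long long parse_timebase_hz(const char *cpuinfo_text)
{
    const char *p = cpuinfo_text;

    while (p && *p) {
        unsigned long long hz;

        /* A space in the format matches any run of blanks or tabs. */
        if (sscanf(p, "timebase : %llu", &hz) == 1)
            return hz;

        p = strchr(p, '\n');    /* advance to the next line, if any */
        if (p)
            p++;
    }
    return 0;
}

/* Hypothetical helper: convert a timebase tick count to microseconds. */
static double ticks_to_usec(unsigned long long ticks,
                            unsigned long long timebase_hz)
{
    return (double) ticks * 1e6 / (double) timebase_hz;
}
```

On the JS20 above, parse_timebase_hz() would return 14318000, and ticks_to_usec() would then map 14318000 ticks to one second rather than to about 6.5 msec. This assumes the cpuinfo "timebase" field is present and accurate, which I have not checked on other platforms.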
- R.