[ofa-general] ib_rdma_bw - bandwidth calculation
Viral Mehta
viral.mehta at einfochips.com
Fri May 8 06:45:06 PDT 2009
Hi,
While running the ib_rdma_bw commands below on a 32-bit platform, I am getting unexpectedly low throughput.
Server: ib_rdma_bw -p 5019 -s 1048576 -t 500 -n 5000 -b -c
Client: ib_rdma_bw -p 5019 -s 1048576 -t 500 -n 5000 -b -c 100.168.54.49
(If the iteration count is changed to 500, I get the expected throughput.)
Looking at the code, I found that ib_rdma_bw.c in the perftest package
has the following code:
>{
> double cycles_to_units;
> unsigned long tsize; /* Transferred size, in megabytes */
> ....
> ....
> cycles_to_units = get_cpu_mhz(0) * 1000000;
>
> printf("%d: Bandwidth average: %g MB/sec\n", pid,
> tsize * iters * cycles_to_units /
> (tcompleted[iters - 1] - tposted[0])
>/ 0x100000);
>}
>
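Here, tsize is an "unsigned long", which is 4 bytes on 32-bit
platforms and 8 bytes on 64-bit platforms.
When I run the test with a 1M data size and 5000 iterations as
above, the product (tsize * iters) is 1048576 * 5000 = 5242880000,
which exceeds the 32-bit "unsigned long" limit of 4294967295. It wraps
around to 947912704, so the reported throughput is only about 18% of
the real value. With 500 iterations the product is 524288000, which
still fits, and that is why that run reports the expected throughput.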
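The wraparound can be reproduced on any machine with a small sketch like the one below. It is only an illustration: uint32_t stands in for a 32-bit "unsigned long", and the helper names total_bytes_32 and total_bytes_64 are made up for this example and do not exist in perftest.

```c
#include <stdint.h>

/* Hypothetical helper: the product as a 32-bit "unsigned long" would
 * compute it. Unsigned arithmetic wraps modulo 2^32. */
uint32_t total_bytes_32(uint32_t tsize, uint32_t iters)
{
    return tsize * iters;
}

/* Hypothetical helper: the same product widened to 64 bits before the
 * multiplication, so it cannot wrap for these inputs. */
uint64_t total_bytes_64(uint32_t tsize, uint32_t iters)
{
    return (uint64_t)tsize * iters;
}
```

For tsize = 1048576 and iters = 5000, the 32-bit version yields 947912704 while the 64-bit version yields 5242880000.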
The correct fix belongs in the ib_rdma_bw application: either reorder
the calculation from (tsize * iters * cycles_to_units) to
(cycles_to_units * tsize * iters), so that every multiplication is
carried out in double, or change tsize to double.
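As a rough sketch of why the operand order matters, the two functions below compute the same expression in both orders. The function names and the uint32_t parameters (standing in for a 32-bit "unsigned long") are assumptions made for this example, not code from perftest.

```c
#include <stdint.h>

/* Original order: tsize * iters is evaluated first in 32-bit unsigned
 * arithmetic and may wrap before it is converted to double. */
double bw_original_order(uint32_t tsize, uint32_t iters,
                         double cycles_to_units, double elapsed_cycles)
{
    return tsize * iters * cycles_to_units / elapsed_cycles / 0x100000;
}

/* Proposed order: cycles_to_units * tsize is already a double, so the
 * multiplication by iters is done in double as well. */
double bw_reordered(uint32_t tsize, uint32_t iters,
                    double cycles_to_units, double elapsed_cycles)
{
    return cycles_to_units * tsize * iters / elapsed_cycles / 0x100000;
}
```

With tsize = 1048576, iters = 5000, a 1000 MHz clock (cycles_to_units = 1e9) and 5e9 elapsed cycles, the reordered version gives 1000 MB/sec, while the original order gives 180.8 MB/sec.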
Should I go ahead and submit a patch?
Viral Mehta, Embedded Software Engineer, www.einfochips.com
P.S. -
However, I do understand that a double has limits as well (above 2^53 it can no longer represent every integer exactly), so a test with a much larger data size and many more iterations could still be affected.
A better way to calculate bandwidth would be to compute it after every fixed number of iterations (say, 100).
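One possible shape for such a per-window calculation is sketched below. It is a guess at an approach, not existing perftest code: the function name window_bandwidth is hypothetical, the timestamps are assumed to be 64-bit cycle counts, and each window is measured from the completion of the previous window (or from the first post for the first window).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bandwidth in MB/sec for iterations first..last
 * (inclusive). The byte count of one window stays small, so it is
 * represented exactly in a double. */
double window_bandwidth(const uint64_t *tposted, const uint64_t *tcompleted,
                        size_t first, size_t last,
                        uint32_t tsize, double cycles_to_units)
{
    /* Start of the window: first post, or end of the previous window. */
    uint64_t start = (first == 0) ? tposted[0] : tcompleted[first - 1];
    double bytes = (double)tsize * (double)(last - first + 1);

    return bytes * cycles_to_units
           / (double)(tcompleted[last] - start) / 0x100000;
}
```

A caller would loop over the run in steps of, say, 100 iterations and report or average the per-window values.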