<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=us-ascii" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.6001.18876"></HEAD>
<BODY>
<DIV><SPAN class=904515806-23052010>
<DIV><SPAN class=803471211-20052010><FONT face=Arial><FONT size=2>The function
used for time measurement, QueryPerformanceCounter(), does not always return
correct results.</FONT></FONT></SPAN></DIV>
<DIV><SPAN class=803471211-20052010><FONT face=Arial><FONT size=2>According to
Microsoft<SPAN class=904515806-23052010> (<A
href="http://msdn.microsoft.com/en-us/library/ms644904%28VS.85%29.aspx">http://msdn.microsoft.com/en-us/library/ms644904%28VS.85%29.aspx</A>):</SPAN></FONT></FONT></SPAN></DIV>
<DIV><SPAN class=803471211-20052010><FONT size=2 face=Arial>1. It can return
0</FONT></SPAN></DIV>
<DIV><SPAN class=803471211-20052010><FONT size=2 face=Arial>2. By default, the
thread affinity is not set, so the call can be serviced by any of the CPUs, and
different CPUs can return different results.</FONT></SPAN></DIV>
<DIV><SPAN class=803471211-20052010><FONT size=2 face=Arial>3. As a consequence,
the measured time difference between post and completion can be arbitrarily
close to 0.</FONT></SPAN></DIV>
<DIV><SPAN class=803471211-20052010><FONT face=Arial><FONT size=2>4. This does
not affect the average BW, but it may affect the peak BW on some
computers.</FONT></FONT></SPAN></DIV></SPAN></DIV>
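<DIV><FONT size=2 face=Arial>To illustrate points 3 and 4, here is a minimal
sketch of the peak search with a lower bound on the per-iteration delta. It is
not the code from the patch below: peak_delta and min_delta are hypothetical
names, the bound is passed in by the caller instead of being computed as tsize /
MAX_AVAILABLE_BW, and skipping windows where the completion timestamp is smaller
than the post timestamp is my own assumption about how to treat counters read
on different CPUs.</FONT></DIV>
<PRE>
```c
/* Illustrative sketch only; names are hypothetical, not taken from perftests. */
typedef unsigned long long cycles_t;

/*
 * Return the smallest average per-iteration delta over all windows [i, j],
 * ignoring any average that is not strictly greater than min_delta.
 * Returns 0 when no window passes the filter.
 */
cycles_t peak_delta(const cycles_t *tposted, const cycles_t *tcompleted,
                    int iters, cycles_t min_delta)
{
    cycles_t best = 0;
    int found = 0;
    int i, j;

    for (i = 0; i < iters; ++i) {
        for (j = i; j < iters; ++j) {
            cycles_t t;

            /* Assumption: a completion stamp smaller than the post stamp is
             * a counter glitch, so skip it to avoid unsigned wrap-around. */
            if (tcompleted[j] < tposted[i])
                continue;

            t = (tcompleted[j] - tposted[i]) / (cycles_t)(j - i + 1);

            /* Reject deltas at or below the floor; keep the smallest of
             * the remaining ones. */
            if (t > min_delta) {
                if (!found || t < best) {
                    best = t;
                    found = 1;
                }
            }
        }
    }
    return best;
}
```
</PRE>
<DIV><FONT size=2 face=Arial>For example, with tposted = {0, 10, 20} and
tcompleted = {10, 10, 30}, the window i = j = 1 has a delta of 0. With min_delta
= 0 that window is skipped and the function returns 5; without the bound the
minimum would be 0 and the peak BW computation would divide by
it.</FONT></DIV>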
<DIV><FONT size=2 face=Arial><SPAN
class=904515806-23052010></SPAN></FONT> </DIV>
<DIV><FONT size=2 face=Arial><SPAN class=904515806-23052010>Signed-off-by:
Alexander Naslednikov (xalex at mellanox.co.il)</SPAN></FONT></DIV>
<DIV><FONT size=2 face=Arial>Index:
D:/windows/MLNX_VPI_trunk/tools/perftests/user/send_bw/send_bw.c<BR>===================================================================<BR>---
D:/windows/MLNX_VPI_trunk/tools/perftests/user/send_bw/send_bw.c (revision
5893)<BR>+++
D:/windows/MLNX_VPI_trunk/tools/perftests/user/send_bw/send_bw.c (revision
5894)<BR>@@ -571,13 +571,17
@@<BR> cycles_t t;<BR> <BR> <BR>+ tsize
= duplex ? 2 : 1;<BR>+ tsize = tsize * size;<BR>+<BR> opt_delta
= tcompleted[opt_posted] - tposted[opt_completed];<BR>-<BR>+#define
MAX_AVAILABLE_BW 40000000<BR> /* Find the peak bandwidth
*/<BR> for (i = 0; i < (int)iters; ++i)<BR> for
(j = i; j < (int)iters; ++j) {<BR> t = (tcompleted[j]
- tposted[i]) / (j - i + 1);<BR>- if (t < opt_delta)
{<BR>+ if (t < opt_delta && t > (tsize /
MAX_AVAILABLE_BW)) {<BR>+ // Avoid the situation when
opt_delta is infinitely close to
0<BR> opt_delta =
t;<BR> opt_posted =
i;<BR> opt_completed = j;<BR>@@ -586,8 +590,6
@@<BR> <BR> cycles_to_units =
get_cpu_mhz();<BR> <BR>- tsize = duplex ? 2 : 1;<BR>- tsize =
tsize *
size;<BR> printf("%7d
%d
%7.2f
%7.2f \n",<BR> size,iters,tsize
* cycles_to_units / opt_delta / 0x100000,<BR> (uint64_t)tsize *
iters * cycles_to_units /(tcompleted[iters - 1] - tposted[0]) /
0x100000);<BR>Index:
D:/windows/MLNX_VPI_trunk/tools/perftests/user/write_bw/write_bw.c<BR>===================================================================<BR>---
D:/windows/MLNX_VPI_trunk/tools/perftests/user/write_bw/write_bw.c (revision
5893)<BR>+++
D:/windows/MLNX_VPI_trunk/tools/perftests/user/write_bw/write_bw.c (revision
5894)<BR>@@ -501,13 +501,17
@@<BR> cycles_t t;<BR> <BR> <BR>+ tsize
= duplex ? 2 : 1;<BR>+ tsize = tsize * size;<BR>+<BR> opt_delta
= tcompleted[opt_posted] - tposted[opt_completed];<BR>-<BR>+#define
MAX_AVAILABLE_BW 40000000<BR> /* Find the peak bandwidth
*/<BR> for (i = 0; i < iters * user_param->numofqps;
++i)<BR> for (j = i; j < iters * user_param->numofqps;
++j) {<BR> t = (tcompleted[j] - tposted[i]) / (j - i +
1);<BR>- if (t < opt_delta) {<BR>+ if (t
< opt_delta && t > (tsize / MAX_AVAILABLE_BW))
{<BR>+ // Avoid the situation when opt_delta is
infinitely close to 0<BR> opt_delta =
t;<BR> opt_posted =
i;<BR> opt_completed = j;<BR>@@ -517,8 +521,6
@@<BR> <BR> cycles_to_units =
get_cpu_mhz();<BR> <BR>- tsize = duplex ? 2 : 1;<BR>- tsize =
tsize *
size;<BR> printf("%7d
%d
%7.2f
%7.2f\n",<BR> size,iters,tsize * cycles_to_units / opt_delta /
0x100000,<BR> (uint64_t)tsize * iters * user_param->numofqps
* cycles_to_units /(tcompleted[(iters* user_param->numofqps) - 1] -
tposted[0]) / 0x100000);<BR>Index:
D:/windows/MLNX_VPI_trunk/tools/perftests/user/read_bw/read_bw.c<BR>===================================================================<BR>---
D:/windows/MLNX_VPI_trunk/tools/perftests/user/read_bw/read_bw.c (revision
5893)<BR>+++
D:/windows/MLNX_VPI_trunk/tools/perftests/user/read_bw/read_bw.c (revision
5894)<BR>@@ -467,13 +467,17
@@<BR> cycles_t t;<BR> <BR> <BR>+ tsize
= duplex ? 2 : 1;<BR>+ tsize = tsize * size;<BR>+<BR> opt_delta
= tcompleted[opt_posted] - tposted[opt_completed];<BR>-<BR>+#define
MAX_AVAILABLE_BW 40000000<BR> /* Find the peak bandwidth
*/<BR> for (i = 0; i < iters; ++i)<BR> for (j =
i; j < iters; ++j) {<BR> t = (tcompleted[j] -
tposted[i]) / (j - i + 1);<BR>- if (t < opt_delta)
{<BR>+ if (t < opt_delta && t > (tsize /
MAX_AVAILABLE_BW)) {<BR>+ // Avoid the situation when
opt_delta is infinitely close to
0<BR> opt_delta =
t;<BR> opt_posted =
i;<BR> opt_completed = j;<BR>@@ -482,8 +486,6
@@<BR> <BR> cycles_to_units = get_cpu_mhz()
;<BR> <BR>- tsize = duplex ? 2 : 1;<BR>- tsize = tsize *
size;<BR> printf("%7d
%d
%7.2f
%7.2f\n",<BR> size,iters,tsize *
cycles_to_units / opt_delta /
0x100000,<BR> (uint64_t)tsize *
iters * cycles_to_units /(tcompleted[iters - 1] - tposted[0]) /
0x100000);<BR></FONT></DIV></BODY></HTML>