[ofa-general] Question on IB RDMA read timing.
Rick Jones
rick.jones2 at hp.com
Wed Oct 17 09:17:30 PDT 2007
Gleb Natapov wrote:
> On Wed, Oct 17, 2007 at 09:44:04AM +0200, Dotan Barak wrote:
>
>>Hi.
>>
>>Bharath Ramesh wrote:
>>
>>>I wrote a simple test program to measure the actual time it takes for an
>>>RDMA read over IB. I find a huge difference in the numbers returned by
>>>timing. I was wondering if someone could help me find what I might be
>>>doing wrong in the way I am measuring the time.
>>>
>>>The steps I use for timing are as follows.
>>>
>>>1) Create the send WR for RDMA Read.
>>>2) call gettimeofday ()
>>>3) ibv_post_send () the WR
>>>4) Loop around ibv_poll_cq () till I get the completion event.
>>>5) call gettimeofday ();
>>>
>>>The difference in time should give me the time it takes to perform an RDMA
>>>read over IB. I consistently get around 35 microseconds, which seems
>>>really large considering the latency of IB. I am measuring the time for
>>>transferring 4K bytes of data. If anyone wants, I can send the code that
>>>I have written. I am not subscribed to the list, so please cc me in the
>>>reply.
>>>
>>
>>I'm not familiar with the implementation of gettimeofday, but I believe
>>that this function does a context switch
>>(and/or spends some time in the function filling the struct that you
>>supply to it)
>>
>
> Here:
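> struct timeval tv_s, tv_e;
> gettimeofday(&tv_s, NULL);
> gettimeofday(&tv_e, NULL);
> /* include tv_sec so a rollover of tv_usec cannot give a negative result */
> printf("%ld\n", (long)(tv_e.tv_sec - tv_s.tv_sec) * 1000000L +
>        (long)(tv_e.tv_usec - tv_s.tv_usec));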
> Compile and run it. The overhead of two calls to gettimeofday is at most
> 1 microsecond.
Unless there is contention with other gettimeofday() calls on the system. On
SMP systems, for example, there are locks involved in making sure that the time
returned by successive calls to gettimeofday() does not go backwards and the
like, and on some systems, with enough callers to gettimeofday(), one can run
into lock contention. So, while 99 times out of ten gettimeofday() may be
"cheap", it really isn't a good idea to ass-u-me it will always be cheap.
And besides, the most efficient call is the one which is never made, so the
suggestion to perform N operations between the calls is probably still a good
one. Even for measuring the overhead of gettimeofday() :)
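As a minimal sketch of that suggestion (the helper names elapsed_usec(),
average_usec() and call_gettimeofday() are made up for illustration, they are
not from anyone's test program), one pair of gettimeofday() calls can bracket
N operations so that the timer overhead is spread across all of them:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() samples.
 * Both tv_sec and tv_usec are used, so a rollover of the
 * microsecond field does not produce a negative result. */
static long long elapsed_usec(const struct timeval *start,
                              const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Hypothetical helper: run op(arg) n times between a single pair of
 * gettimeofday() calls and return the average cost per call in usec. */
static double average_usec(void (*op)(void *), void *arg, long n)
{
    struct timeval start, end;
    long i;

    assert(n > 0);
    gettimeofday(&start, NULL);
    for (i = 0; i < n; i++)
        op(arg);
    gettimeofday(&end, NULL);

    return (double)elapsed_usec(&start, &end) / (double)n;
}

/* Example operation: gettimeofday() itself, to estimate its own overhead. */
static void call_gettimeofday(void *unused)
{
    struct timeval tv;

    (void)unused;
    gettimeofday(&tv, NULL);
}
```

Calling average_usec(call_gettimeofday, NULL, 1000000) gives a rough per-call
figure on an otherwise idle system. For the RDMA read case, op would post the
WR and poll for its completion, which yields an average rather than a
per-operation distribution.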
Also, while it may not be so much the case these days, certainly in the past
there were gettimeofday() implementations with rather coarse granularity.
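One rough way to check the granularity on a given system is to spin until the
value returned by gettimeofday() changes and record the size of the step. The
sketch below does that; the helper names are hypothetical, and the result is
only an upper bound on the true granularity since it includes the cost of the
calls themselves:

```c
#include <stddef.h>
#include <sys/time.h>

/* Convert a struct timeval to a single microsecond count. */
static long long tv_to_usec(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000000LL + (long long)tv->tv_usec;
}

/* Hypothetical helper: spin until gettimeofday() reports a new value,
 * `tries` times over, and return the smallest positive step seen in
 * microseconds. Returns -1 if no positive step was observed. */
static long long smallest_step_usec(int tries)
{
    long long best = -1;
    int i;

    for (i = 0; i < tries; i++) {
        struct timeval a, b;
        long long step;

        gettimeofday(&a, NULL);
        do {
            gettimeofday(&b, NULL);
        } while (tv_to_usec(&b) == tv_to_usec(&a));

        step = tv_to_usec(&b) - tv_to_usec(&a);
        /* Ignore backward steps caused by clock adjustments. */
        if (step > 0 && (best < 0 || step < best))
            best = step;
    }
    return best;
}
```

If smallest_step_usec(100) comes back as something like 10000 rather than 1,
then a single 35 microsecond operation cannot be timed meaningfully with one
pair of calls.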
Now, some CPUs offer interval timer/registers/whatever - for example the ITC on
Itanium or CR16 on PA-RISC, I'm sure there are other examples - which can be
used for measuring very short things. Under some OSes - HP-UX and Solaris are
two with which I am familiar - there is a "gethrtime()" interface which uses
those without the user having to deal with inline assembly. That should have
lower overhead than gettimeofday() although even then it would probably be best,
if one is indeed going for the average, to use those to measure the time to
perform N operations.
If one does use gethrtime(), it should only be for measuring short things, and
those "timestamps" should not be interspersed with those from gettimeofday().
The two are really separate "timespaces" if you will. Gethrtime() does not get
tick adjustment like gettimeofday() does/can.
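On systems without gethrtime(), POSIX clock_gettime() with CLOCK_MONOTONIC
plays a broadly similar role. To be clear, that is a swapped-in interface, not
gethrtime() itself, and the function name monotonic_nsec() below is made up for
illustration. A minimal sketch, assuming a platform that provides
CLOCK_MONOTONIC:

```c
/* Ask for the POSIX declarations if nothing else has done so already. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <time.h>

/* Hypothetical stand-in for gethrtime(): nanoseconds since an arbitrary,
 * unspecified origin. Only differences between two return values are
 * meaningful. Returns -1 if the clock cannot be read. */
static long long monotonic_nsec(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + (long long)ts.tv_nsec;
}
```

Usage follows the same pattern as before: take one sample, perform the
operation(s), take a second sample, and subtract. As with gethrtime(), these
values live in their own timespace and should not be mixed with or subtracted
from gettimeofday() results.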
rick jones
FWIW, netperf uses gettimeofday() to measure the overall runtime of a netperf
test, and gethrtime() (when available) to measure the individual times for
"transactions" such as the exchange of a request/response, or time spent in
send() or recv() or whatnot.
> --
> Gleb.