[ofa-general] ***SPAM*** OSU mpi latency > 100msec and > 1msec
Jeff Squyres
jsquyres at cisco.com
Fri Aug 8 07:28:34 PDT 2008
These types of effects can be caused by congestion in the network,
location of the processes in the network, and/or other sources of
jitter on your hosts (e.g., other processes interrupting and running,
etc.).
Even the 13us looks pretty high for iWARP or IB; perhaps that's caused
by the outliers in your data set. FWIW: we normally get in the 1-2us
latency range for IB, assuming you have top-of-the-line servers, HCAs,
and exactly one switch hop between the two servers.
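
For reference, the kind of per-transaction bookkeeping described in the
quoted message below can be sketched roughly as follows. This is only an
illustrative sketch, not the attached osu_latency_profile.c; the type and
function names (lat_stats, lat_stats_add, and so on) are hypothetical, and
the 100us and 1ms thresholds are taken from the quoted output. As in that
output, the over-100us count here includes the samples that are also over
1ms.

```c
#include <stddef.h>
#include <float.h>

/* Running summary of per-iteration latencies, in microseconds.
 * All names here are hypothetical, chosen for illustration only. */
typedef struct {
    size_t count;       /* number of samples recorded            */
    size_t over_100us;  /* samples above 100 us (includes > 1ms) */
    size_t over_1ms;    /* samples above 1000 us                 */
    double sum_us;      /* sum of all samples, for the mean      */
    double min_us;      /* smallest sample seen                  */
    double max_us;      /* largest sample seen                   */
} lat_stats;

static void lat_stats_init(lat_stats *s)
{
    s->count = 0;
    s->over_100us = 0;
    s->over_1ms = 0;
    s->sum_us = 0.0;
    s->min_us = DBL_MAX;
    s->max_us = 0.0;
}

/* Record one measured latency, given in microseconds. */
static void lat_stats_add(lat_stats *s, double us)
{
    s->count++;
    s->sum_us += us;
    if (us < s->min_us) s->min_us = us;
    if (us > s->max_us) s->max_us = us;
    if (us > 100.0)  s->over_100us++;
    if (us > 1000.0) s->over_1ms++;
}

/* Mean latency in microseconds; 0.0 if nothing was recorded. */
static double lat_stats_mean(const lat_stats *s)
{
    return s->count ? s->sum_us / (double)s->count : 0.0;
}
```

In an MPI ping-pong loop, one would presumably call MPI_Wtime() before and
after each send/receive pair and pass (t_end - t_start) * 1e6 to
lat_stats_add(). Reporting min and max next to the mean makes it easier to
see whether a few outliers are pulling the average up.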
On Aug 8, 2008, at 3:14 AM, Rajouri Jammu wrote:
> Hi,
> I modified/instrumented the OSU MPI latency benchmark to measure time
> taken by each transaction in order to get min and max latencies, in
> addition to the average that's reported currently.
> I noticed that some of the transactions, albeit few, took > 100usec
> and > 1msec.
>
> Does anybody have any ideas about what could be causing such large
> round trip times (>1msec) for a few transactions while the average
> looks pretty good ( 10usec ranges) ?
> Is it network or system issues?
>
> Here is a snapshot of the output and attached is the modified code.
>
> i'm using OFED 1.3, CentOS 5 and openmpi-1.2.5.
>
> Any insights or ideas would be very helpful. Thanks in advance.
>
> Below shows # of transactions over 100usec and 1msec.
> Iteration count was set to 60000 for each test.
> Latency is round trip time.
>
> --------------------------------------------------------------------------
> # OSU MPI Latency Test v3.0
> # Size   Latency (us)
> 0        13.30   over_100usec: 13   over_1msec: 2    i 601000
> 1        13.51   over_100usec: 12   over_1msec: 0    i 601000
> 2        13.51   over_100usec: 13   over_1msec: 1    i 601000
> 4        13.81   over_100usec: 42   over_1msec: 33   i 601000
> 8        13.92   over_100usec: 36   over_1msec: 25   i 601000
> 16       13.90   over_100usec: 10   over_1msec: 0    i 601000
> 32       14.14   over_100usec: 54   over_1msec: 44   i 601000
> 64       14.32   over_100usec: 11   over_1msec: 1    i 601000
> 128      15.38   over_100usec: 10   over_1msec: 0    i 601000
> 256      15.94   over_100usec: 14   over_1msec: 2    i 601000
> 512      16.74   over_100usec: 77   over_1msec: 65   i 601000
> 1024     21.07   over_100usec: 17   over_1msec: 0    i 601000
> 2048     24.05   over_100usec: 17   over_1msec: 1    i 601000
> 4096     29.99   over_100usec: 37   over_1msec: 5    i 601000
> 8192     41.71   over_100usec: 39   over_1msec: 0    i 601000
>
> <osu_latency_profile.c>
--
Jeff Squyres
Cisco Systems