[ofa-general] OSU mpi latency > 100msec and > 1msec

Jeff Squyres jsquyres at cisco.com
Fri Aug 8 07:28:34 PDT 2008


These types of effects can be caused by congestion in the network,
the location of the processes in the network, and/or other sources of
jitter on your hosts (e.g., other processes interrupting and running).
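
One rough way to check for host-side jitter, independent of the network, is to spin on a monotonic clock and look at the gaps between consecutive readings. The following is a minimal Python sketch of that idea, not part of the OSU benchmark; the name `probe_host_jitter` and its default parameters are assumptions chosen for illustration. The interpreter adds noise of its own, so treat the numbers as indicative only.

```python
import time


def probe_host_jitter(samples=200_000, threshold_us=100.0):
    """Spin on a monotonic clock and report unusually long gaps.

    Returns (max_gap_us, gaps_over_threshold). A long gap between two
    back-to-back clock reads suggests this process was interrupted or
    descheduled, i.e. jitter that comes from the host, not the network.
    """
    gaps_over_threshold = []
    max_gap_us = 0.0
    prev = time.perf_counter_ns()
    for _ in range(samples):
        now = time.perf_counter_ns()
        gap_us = (now - prev) / 1000.0  # nanoseconds -> microseconds
        if gap_us > max_gap_us:
            max_gap_us = gap_us
        if gap_us > threshold_us:
            gaps_over_threshold.append(gap_us)
        prev = now
    return max_gap_us, gaps_over_threshold


if __name__ == "__main__":
    worst, long_gaps = probe_host_jitter()
    print(f"worst gap: {worst:.1f} us, gaps over threshold: {len(long_gaps)}")
```

If this reports gaps in the hundreds of microseconds on an otherwise idle node, the host itself is a plausible source of some of the outliers.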

Even the 13us looks pretty high for iWARP or IB; perhaps that's caused
by the outliers in your data set.  FWIW: we normally get latencies in
the 1-2us range for IB, assuming you have top-of-the-line servers, HCAs,
and exactly one switch hop between the two servers.
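
To see how much a handful of outliers can pull up an average, it helps to compare the mean against the median and a high percentile of the per-iteration timings. Here is a minimal Python sketch for doing that offline; the function name `summarize_latencies`, the dictionary keys, and the simple p99 index rule are assumptions for illustration, and the data in the usage example is synthetic, not taken from the run quoted below.

```python
import statistics


def summarize_latencies(latencies_us, thresholds_us=(100.0, 1000.0)):
    """Summarize per-iteration latencies given in microseconds.

    Reports min, max, mean, median, an approximate 99th percentile,
    and how many samples exceed each threshold.
    """
    ordered = sorted(latencies_us)
    n = len(ordered)
    if n == 0:
        raise ValueError("no latency samples given")
    summary = {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        # Simple index-based percentile; adequate for large sample counts.
        "p99": ordered[min(n - 1, int(0.99 * n))],
    }
    for t in thresholds_us:
        summary[f"over_{t:g}us"] = sum(1 for x in ordered if x > t)
    return summary


if __name__ == "__main__":
    # Synthetic example: 60000 samples, 65 of them hypothetical 10 ms stalls.
    data = [2.0] * 59935 + [10000.0] * 65
    for key, value in summarize_latencies(data).items():
        print(f"{key}: {value}")
```

With this made-up data set the mean comes out near 12.8us while the median stays at 2us, so a few very long transactions are enough to dominate the average. Whether that is what happened in the run below depends on how large the outliers actually were, which the counters alone do not show.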


On Aug 8, 2008, at 3:14 AM, Rajouri Jammu wrote:

> Hi,
> I modified/instrumented the OSU MPI latency benchmark to measure the
> time taken by each transaction in order to get min and max latencies,
> in addition to the average that's currently reported.
> I noticed that some of the transactions, albeit few, took > 100usec,
> and some took > 1msec.
>
> Does anybody have any ideas about what could be causing such large
> round trip times (> 1msec) for a few transactions while the average
> looks pretty good (in the 10usec range)?
> Is it a network issue or a system issue?
>
> Here is a snapshot of the output and attached is the modified code.
>
> I'm using OFED 1.3, CentOS 5, and openmpi-1.2.5.
>
> Any insights or ideas would be very helpful. Thanks in advance.
>
> The output below shows the number of transactions over 100usec and over 1msec.
> Iteration count was set to 60000 for each test.
> Latency is round trip time.
>
> --------------------------------------------------------------------------
> # OSU MPI Latency Test v3.0
> # Size            Latency (us)
> 0                        13.30   over_100usec: 13 over_1msec: 2 i 601000
> 1                        13.51   over_100usec: 12 over_1msec: 0 i 601000
> 2                        13.51   over_100usec: 13 over_1msec: 1 i 601000
> 4                        13.81   over_100usec: 42 over_1msec: 33 i 601000
> 8                        13.92   over_100usec: 36 over_1msec: 25 i 601000
> 16                       13.90   over_100usec: 10 over_1msec: 0 i 601000
> 32                       14.14   over_100usec: 54 over_1msec: 44 i 601000
> 64                       14.32   over_100usec: 11 over_1msec: 1 i 601000
> 128                      15.38   over_100usec: 10 over_1msec: 0 i 601000
> 256                      15.94   over_100usec: 14 over_1msec: 2 i 601000
> 512                      16.74   over_100usec: 77 over_1msec: 65 i 601000
> 1024                     21.07   over_100usec: 17 over_1msec: 0 i 601000
> 2048                     24.05   over_100usec: 17 over_1msec: 1 i 601000
> 4096                     29.99   over_100usec: 37 over_1msec: 5 i 601000
> 8192                     41.71   over_100usec: 39 over_1msec: 0 i 601000
>
>
>
>
> <osu_latency_profile.c>


-- 
Jeff Squyres
Cisco Systems



