[ewg] Announcing the release of MVAPICH 0.9.9

Fri Apr 27 21:18:21 PDT 2007

The MVAPICH team is pleased to announce the availability of MVAPICH
0.9.9 with the following NEW features:

- Message coalescing support, which allows per-queue-pair send queues
  to be reduced and so lowers memory requirements on large-scale
  clusters. This design also significantly increases the small-message
  messaging rate.

  The reduction in memory requirements on large-scale clusters is
  shown here:

  http://mvapich.cse.ohio-state.edu/performance/mvapich/mem_0_9_9.shtml

- Designs for avoiding hot-spots in networks of large-scale clusters

  - Multi-pathing support leveraging the LMC mechanism
  - Multi-port/Multi-HCA support for enabling user processes to bind to 
    different IB ports for balanced communication performance
    on multi-core platforms

- Multi-core optimized scalable shared memory design

- Memory Hook support provided by integration with the ptmalloc2 library.
  This provides safe release of memory to the Operating System and
  is expected to benefit the memory usage of applications that 
  frequently use malloc and free operations.

- Optimized, high-performance shared memory aware collective
  operations for multi-core platforms

  Performance benefits for sample collective operations using the 
  optimized schemes can be found here:

  http://mvapich.cse.ohio-state.edu/performance/collective.shtml  

- Shared-memory-only channel. This interface is useful for running
  MPI jobs on multi-processor systems without using any
  high-performance network, for example multi-core servers,
  desktops, and laptops, and clusters with serial nodes.
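
As a rough illustration of the message coalescing idea in the first
item above, the following Python sketch packs several small messages
into one buffer so that they occupy a single send-queue entry instead
of one entry each. It is not MVAPICH code: the class name, the 4-byte
length prefix, and the buffer size are all assumptions made for the
example.

```python
import struct


class Coalescer:
    """Toy model: pack small messages into one buffer per send-queue entry."""

    def __init__(self, buffer_size=8192):  # buffer_size is an assumed value
        self.buffer_size = buffer_size
        self.pending = bytearray()
        self.posted = []  # each element stands for one send-queue entry

    def send(self, payload: bytes):
        # Hypothetical framing: 4-byte little-endian length, then the payload.
        record = struct.pack("<I", len(payload)) + payload
        if len(record) > self.buffer_size:
            raise ValueError("payload does not fit in one coalescing buffer")
        # Post the current buffer first if this record would overflow it.
        if len(self.pending) + len(record) > self.buffer_size:
            self.flush()
        self.pending += record

    def flush(self):
        # Post whatever has been accumulated as a single entry.
        if self.pending:
            self.posted.append(bytes(self.pending))
            self.pending = bytearray()


def unpack(buffer: bytes):
    """Receiver side: split a coalesced buffer back into its messages."""
    messages, offset = [], 0
    while offset < len(buffer):
        (length,) = struct.unpack_from("<I", buffer, offset)
        offset += 4
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages
```

In this model, fewer posted entries per batch of small messages is what
would let each send queue be made shorter.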

A new "Multiple-pair Bandwidth and Message Rate" test is also
available as part of the OSU Benchmarks.
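
To make the two quantities in that test's name concrete, here is a
minimal Python sketch of how an aggregate message rate and bandwidth
could be derived from a timed run over several process pairs. It is
not the benchmark's own code: the function name, its parameters, and
the choice of 1 MB = 10^6 bytes are assumptions for illustration.

```python
def multi_pair_stats(num_pairs, messages_per_pair, message_size, elapsed_seconds):
    """Aggregate message rate and bandwidth across all sending pairs.

    All arguments are hypothetical inputs: the number of sender/receiver
    pairs, messages sent by each pair, bytes per message, and the
    wall-clock time of the whole run in seconds.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_messages = num_pairs * messages_per_pair
    total_bytes = total_messages * message_size
    return {
        "message_rate": total_messages / elapsed_seconds,       # messages/sec
        "bandwidth_mb_s": total_bytes / elapsed_seconds / 1e6,  # MB/sec
    }


if __name__ == "__main__":
    stats = multi_pair_stats(num_pairs=4, messages_per_pair=1000,
                             message_size=1024, elapsed_seconds=2.0)
    print(stats)
```

For small messages the message rate is usually the more informative
figure, while for large messages the bandwidth figure dominates.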

A redesigned MVAPICH web site (http://mvapich.cse.ohio-state.edu/)
is now available, offering easier navigation for MVAPICH/MVAPICH2 users.

More details on all features and supported platforms can be obtained
by visiting the following URL:

http://mvapich.cse.ohio-state.edu/overview/mvapich/features.shtml

MVAPICH 0.9.9 continues to deliver excellent performance. Sample
performance numbers include:

  - OpenFabrics/Gen2 on EM64T quad-core with PCIe and ConnectX-DDR:
    (These numbers are preliminary and optimizations are on-going.)
        - 1.39 microsec one-way latency (1 byte)
        - 1419 MB/sec unidirectional bandwidth 
        - 2769 MB/sec bidirectional bandwidth  

    More detailed performance numbers for MVAPICH 0.9.9 on ConnectX 
    are available from the following URL:

    http://mvapich.cse.ohio-state.edu/performance/mvapich/em64t/MVAPICH-em64t-gen2-ConnectX.shtml

  - OpenFabrics/Gen2 on EM64T with PCI-Ex and IBA-DDR:
        - 2.93 microsec one-way latency (4 bytes)
        - 1405 MB/sec unidirectional bandwidth
        - 2702 MB/sec bidirectional bandwidth

Performance numbers for all other platforms, system configurations and
operations can be viewed by visiting the `Performance' section of the
project's web page.

To download the MVAPICH 0.9.9 package and access the anonymous SVN,
please visit the following URL:

http://mvapich.cse.ohio-state.edu/

All feedback, including bug reports, hints for performance tuning,
patches, and enhancements, is welcome. Please post it to the
mvapich-discuss mailing list.

Thanks, 

The MVAPICH Team

======================================================================
The MVAPICH/MVAPICH2 project is currently supported with funding from
U.S. National Science Foundation, U.S. DOE Office of Science,
Mellanox, Intel, Cisco Systems, QLogic, Sun Microsystems and Linux
Networx; and with equipment support from Advanced Clustering, AMD,
Apple, Appro, Chelsio, Dell, Fujitsu, Fulcrum, IBM, Intel, Mellanox,
Microway, NetEffect, QLogic and Sun Microsystems. Other technology
partners include Etnus.
======================================================================