[openib-general] Announcing the release of MVAPICH 0.9.8 with on-demand connection management, fault-tolerance and advanced multi-rail scheduling support

Dhabaleswar Panda panda at cse.ohio-state.edu
Sun Jul 30 20:05:40 PDT 2006


The MVAPICH team is pleased to announce the availability of MVAPICH
0.9.8 with the following new features:

  - On-demand connection management using native InfiniBand 
    Unreliable Datagram (UD) support. This feature enables InfiniBand
    connections to be set up dynamically and keeps memory usage
    `near constant' as the number of processes increases. 

    This feature together with the Shared Receive Queue (SRQ) feature 
    (available since MVAPICH 0.9.7) enhances the scalability
    of MVAPICH on multi-thousand node clusters. 

    Application performance and memory scalability results with
    on-demand connection management and SRQ support can be seen at 
    the following URL:

    http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html 

  - Support for fault tolerance: memory-to-memory reliable data
    transfer (detection of I/O bus errors with a 32-bit CRC and
    retransmission in case of error). This mode enables MVAPICH to
    deliver messages reliably in the presence of I/O bus errors.

  - Multi-rail communication support with flexible scheduling policies
    (a brief sketch of the even-striping idea follows this feature list):
      - Separate control of small and large message scheduling
      - Three different scheduling policies for small messages: 
            - Using First Subchannel, Round Robin and Process Binding 
      - Six different scheduling policies for large messages: 
            - Round Robin, Weighted striping, Even striping, 
              Stripe Blocking, Adaptive Striping and Process Binding

  - Shared library support for Solaris 

  - Integrated and easy-to-use build script which automatically
    detects the system architecture and InfiniBand adapter type and
    optimizes MVAPICH for the particular installation
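
As a rough illustration of what the `Even striping' policy above refers
to, the following C sketch divides one large message into nearly equal
chunks and hands one chunk to each rail. This is not MVAPICH source
code; post_send_on_rail() is a hypothetical placeholder for a per-rail
send operation, and NUM_RAILS is an assumed rail count.

    #include <stdio.h>
    #include <stddef.h>

    #define NUM_RAILS 2  /* assumed number of rails (HCAs/ports) */

    /* Hypothetical placeholder: post one chunk on one rail.
     * Here it only reports what would be sent. */
    static void post_send_on_rail(int rail, size_t offset, size_t len)
    {
        printf("rail %d: chunk at offset %zu, %zu bytes\n",
               rail, offset, len);
    }

    /* Even striping: split a large message into NUM_RAILS nearly
     * equal chunks so they can be transferred concurrently. */
    static void send_large_even_striping(size_t msg_len)
    {
        size_t chunk = msg_len / NUM_RAILS;
        size_t offset = 0;
        int rail;

        for (rail = 0; rail < NUM_RAILS; rail++) {
            /* The last rail picks up any remainder so the whole
             * buffer is covered. */
            size_t this_len =
                (rail == NUM_RAILS - 1) ? (msg_len - offset) : chunk;
            post_send_on_rail(rail, offset, this_len);
            offset += this_len;
        }
    }

    int main(void)
    {
        send_large_even_striping(1 << 20);  /* stripe a 1 MB message */
        return 0;
    }

The other large-message policies differ mainly in how the chunk sizes
and rail order are chosen (for example, weighted or adaptive striping).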
 
More details on all features and supported platforms can be obtained
by visiting the following URL:

http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich_features.html

MVAPICH 0.9.8 continues to deliver excellent performance. Sample
performance numbers are listed below (a minimal ping-pong sketch
showing how such figures are typically measured follows the list):

  - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR:
        - 2.93 microsec one-way latency (4 bytes)
        - 1471 MB/sec unidirectional bandwidth
        - 2678 MB/sec bidirectional bandwidth

  - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR (dual-rail):
        - 2534 MB/sec unidirectional bandwidth
        - 3003 MB/sec bidirectional bandwidth

  - OpenIB/Gen2 on Opteron with PCI-Ex and IBA-DDR:
        - 2.65 microsec one-way latency (4 bytes)
        - 1399 MB/sec unidirectional bandwidth
        - 2253 MB/sec bidirectional bandwidth

  - Solaris uDAPL/IBTL on Opteron with PCI-Ex and IBA-SDR:
        - 3.86 microsec one-way latency (4 bytes)
        - 981 MB/sec unidirectional bandwidth
        - 1856 MB/sec bidirectional bandwidth

  - OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-SDR:
        - 3.80 microsec one-way latency (4 bytes)
        - 963 MB/sec unidirectional bandwidth
        - 1851 MB/sec bidirectional bandwidth

  - OpenIB/Gen2 uDAPL on Opteron with PCI-Ex and IBA-DDR:
        - 2.81 microsec one-way latency (4 bytes)
        - 1411 MB/sec unidirectional bandwidth
        - 2252 MB/sec bidirectional bandwidth
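
For reference, figures such as the one-way latencies above are
typically obtained with a simple MPI ping-pong microbenchmark. The
following minimal sketch (not the exact benchmark code used for the
numbers in this announcement) measures 4-byte one-way latency between
two ranks:

    #include <mpi.h>
    #include <stdio.h>

    #define ITERS    1000
    #define MSG_SIZE 4   /* bytes, matching the 4-byte latency test */

    int main(int argc, char **argv)
    {
        char buf[MSG_SIZE] = {0};
        int rank, i;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0) {
            /* One round trip per iteration; one-way latency is half
             * the average round-trip time. */
            printf("%d-byte one-way latency: %.2f microsec\n", MSG_SIZE,
                   (t1 - t0) * 1e6 / (2.0 * ITERS));
        }

        MPI_Finalize();
        return 0;
    }

Build with the MPI compiler wrapper (e.g. mpicc) and run with two
processes on two nodes. Bandwidth tests follow the same pattern but
stream many large messages in one direction before a final
acknowledgement.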

Performance numbers for all other platforms, system configurations and
operations can be viewed in the `Performance' section of the
project's web page.

To download the MVAPICH 0.9.8 package and access the anonymous SVN,
please visit the following URL:

http://nowlab.cse.ohio-state.edu/projects/mpi-iba/

A stripped-down version of this release is also available in the
OpenIB SVN.

All feedback, including bug reports, performance tuning hints,
patches and enhancements, is welcome. Please post it to the
mvapich-discuss mailing list.

Thanks, 

MVAPICH Team at OSU/NBCL 

======================================================================
The MVAPICH/MVAPICH2 project is currently supported with funding from
the U.S. National Science Foundation, U.S. DOE Office of Science,
Mellanox, Intel, Cisco Systems, Sun Microsystems and Linux Networx,
and with equipment support from Advanced Clustering, AMD, Apple,
Appro, Dell, IBM, Intel, Mellanox, Microway, PathScale, SilverStorm
and Sun Microsystems. Other technology partners include Etnus.
======================================================================




