[ewg] Announcing the release of MVAPICH 1.0
Dhabaleswar Panda
panda at cse.ohio-state.edu
Thu Feb 28 21:17:48 PST 2008
The MVAPICH team is pleased to announce the availability of MVAPICH
1.0 with the following NEW features:
- New scalable and robust job startup
    - Enhanced and robust mpirun_rsh framework to provide scalable
      launching on multi-thousand-core clusters
    - Running time of the `MPI Hello World' program is around 4 sec
      on 1K cores and around 80 sec on 32K cores (a minimal example
      program is sketched after this feature list)
    - Available for OpenFabrics/Gen2, OpenFabrics/Gen2-UD and
      QLogic InfiniPath devices
    - Performance graph at:
      http://mvapich.cse.ohio-state.edu/performance/startup.shtml
- Enhanced support for SLURM
    - Available for OpenFabrics/Gen2, OpenFabrics/Gen2-UD and
      QLogic InfiniPath devices
- New OpenFabrics Gen2 Unreliable Datagram (UD)-based design
  for large-scale InfiniBand clusters (multi-thousand cores)
    - Delivers performance and scalability with a constant
      memory footprint for communication contexts
        - Only 40 MB per process even with 16K processes connected to
          each other
        - Performance graph at:
          http://mvapich.cse.ohio-state.edu/performance/mvapich/ud_memory.shtml
    - Zero-copy protocol for large data transfers
    - Shared-memory communication between cores within a node
    - Multi-core-optimized collectives
      (MPI_Bcast, MPI_Barrier, MPI_Reduce and MPI_Allreduce);
      a brief usage sketch appears below
    - Enhanced MPI_Allgather collective
- New features for the OpenFabrics Gen2-IB interface
    - Enhanced coalescing support with varying degrees of coalescing
    - Support for ConnectX adapters
    - Support for asynchronous progress at both sender and receiver
      to overlap computation and communication
    - Multi-core-optimized collectives (MPI_Bcast)
    - Tuned collectives (MPI_Allgather, MPI_Bcast)
      based on network adapter characteristics
        - Performance graph at:
          http://mvapich.cse.ohio-state.edu/performance/collective.shtml
    - Network-level fault tolerance with Automatic Path Migration (APM)
      to handle intermittent network failures over InfiniBand
- New support for QLogic InfiniPath adapters
    - High-performance point-to-point communication
    - Optimized collectives (MPI_Bcast and MPI_Barrier) using k-nomial
      algorithms that exploit multi-core architectures
- Optimized and high-performance ADIO driver for Lustre
    - This MPI-IO support is a contribution from the Future Technologies
      Group, Oak Ridge National Laboratory
      (http://ft.ornl.gov/doku/doku.php?id=ft:pio:start)
    - Performance graph at:
      http://mvapich.cse.ohio-state.edu/performance/mvapich/romio.shtml
- Flexible user-defined processor affinity for better resource utilization
  on multi-core systems
    - Flexible binding of processes to cores
    - Allows memory-intensive applications to run on a subset of cores
      per chip for better performance
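
For reference, a minimal `MPI Hello World' program of the kind timed in
the startup results above might look like the sketch below (in C). This
is an illustrative sketch only, not the exact program used for those
measurements.

    /* hello.c -- minimal MPI "Hello World" (illustrative sketch only) */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);               /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* rank of this process  */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of ranks */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();                       /* shut down the runtime */
        return 0;
    }

After compiling with mpicc (e.g. `mpicc -o hello hello.c'), a job of
this kind is typically launched with mpirun_rsh along the lines of
`mpirun_rsh -np 4 n0 n1 n2 n3 ./hello'; the host names here are
placeholders, and the user guide documents the exact launcher options
supported in this release.
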
More details on all features and supported platforms can be obtained
by visiting the following URL:
http://mvapich.cse.ohio-state.edu/overview/mvapich/features.shtml
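
As a point of reference for the collective optimizations listed above,
the sketch below (in C) exercises three of the collectives named in this
announcement (MPI_Bcast, MPI_Allreduce and MPI_Barrier). It is generic
MPI code, not MVAPICH-specific, and is illustrative only.

    /* collectives.c -- generic usage of MPI_Bcast, MPI_Allreduce and
     * MPI_Barrier (illustrative sketch only) */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0, sum = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0)
            value = 42;                     /* root supplies the data */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        MPI_Allreduce(&value, &sum, 1, MPI_INT, MPI_SUM,
                      MPI_COMM_WORLD);      /* global sum on all ranks */

        MPI_Barrier(MPI_COMM_WORLD);        /* synchronize all ranks */
        if (rank == 0)
            printf("value = %d, sum over all ranks = %d\n", value, sum);

        MPI_Finalize();
        return 0;
    }
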
MVAPICH 1.0 continues to deliver excellent performance. Sample
performance numbers include:
- With OpenFabrics/Gen2 on EM64T quad-core with PCIe and ConnectX-DDR:
    - 1.51 microsec one-way latency (4 bytes)
    - 1404 MB/sec unidirectional bandwidth
    - 2713 MB/sec bidirectional bandwidth
- With PSM on Opteron with HyperTransport and QLogic-SDR:
    - 1.25 microsec one-way latency (4 bytes)
    - 953 MB/sec unidirectional bandwidth
    - 1891 MB/sec bidirectional bandwidth
Performance numbers for all other platforms, system configurations and
operations can be viewed by visiting the `Performance' section of the
project's web page.
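
For context on how point-to-point numbers such as those above are
commonly obtained, the sketch below (in C) shows a minimal two-rank
ping-pong: one-way latency is usually reported as half the averaged
round-trip time of a small message. This is an illustrative sketch only,
not the benchmark used to produce the numbers in this announcement.

    /* pingpong.c -- minimal two-rank ping-pong latency sketch
     * (illustrative only); run with exactly 2 ranks */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS    10000
    #define MSG_SIZE 4                      /* 4-byte message, as above */

    int main(int argc, char **argv)
    {
        char buf[MSG_SIZE] = {0};
        int rank, i;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0) {                /* rank 0 sends, then waits */
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {         /* rank 1 echoes it back */
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)                      /* one-way latency = RTT / 2 */
            printf("one-way latency: %.2f usec\n",
                   (t1 - t0) * 1e6 / (2.0 * ITERS));

        MPI_Finalize();
        return 0;
    }
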
To download MVAPICH 1.0 and the associated user guide, or to access
the anonymous SVN repository, please visit the following URL:
http://mvapich.cse.ohio-state.edu
All feedback, including bug reports and performance-tuning hints, is
welcome. Please post it to the mvapich-discuss mailing list.
Thanks,
The MVAPICH Team
======================================================================
The MVAPICH/MVAPICH2 project is currently supported with funding from
the U.S. National Science Foundation, U.S. DOE Office of Science,
Mellanox, Intel, Cisco Systems, QLogic, Sun Microsystems and Linux
Networx; and with equipment support from Advanced Clustering, AMD,
Appro, Chelsio, Dell, Fujitsu, Fulcrum, IBM, Intel, Mellanox,
Microway, NetEffect, QLogic and Sun Microsystems. Other technology
partners include Etnus.
======================================================================