Release Notes: This release adds high-performance MPI communication from NVIDIA GPU device memory (to/from other devices and host memory) with IPC, collectives ,and datatype support; CPU binding granularity at socket and numanode level; checkpoint-restart and run-through stabilization with Nemesis; suspend/resume; and enhanced integration with SLURM and PBS. Network Fault Resiliency (NFR) has been added.
Release Notes: New features include improved support for fault tolerance, support for the ARMCI API, and non-collective group creation functionality. There are numerous bugfixes.
Release Notes: Enhancements include a Nemesis-based interface, process manager support, rail binding with processes for multirail configurations, message coalescing, large data transfers, dynamic process migration, fast process-level fault-tolerance with checkpoint-restart, network-level fault-tolerance with Automatic Path Migration, RDMA CM support, iWARP support, optimized collectives, multi-pathing, RDMA read- and write-based designs, polling and blocking-based communication progress, multi-core optimized and scalable shared memory support, and LiMIC2-based kernel-level shared memory support.
Release Notes: The Shared-Memory-Nemesis interface was added, providing native shared memory support on multi-core platforms where communication is required only within a node. Support for 3D torus topology with appropriate SL settings was added. Quality of Service (QoS) support with multiple InfiniBand SL was added. GPU acceleration support was added. Fast Checkpoint-Restart support with an aggregation scheme was added. Fault tolerance support was improved. Dynamic detection of multiple InfiniBand adapters was implemented, and they are used by default in multi-rail configurations. Multithreading was enhanced. There were other enhancements and bugfixes.
Release Notes: A few new features have also been added, including complete support for the FTB MPI events, improvements to RMA operations, and the ability to modify collective algorithm selection thresholds using environment variables. Many important bugs were fixed.
Release Notes: This release adds support for QLogic InfiniPath, eXtended Reliable Connections (XRC), hierarchical SSH, dynamic process management, scalable checkpoint-restart, checkpoint-restart with intra-node shared memory, kernel-level single-copy intra-node communication, and a K-nomial tree-based solution together with shared memory-based broadcast for scalable MPI_Bcast operations.