Release Notes: This release adds bugfixes relative to the release candidate.
Release Notes: This release fixes a data validation issue in GPU transfers, tunes the CUDA block size to 256K for better performance, enhances error checking for CUDA library calls, and fixes an mpirun_rsh issue when launching applications on Linux kernels.
Release Notes: New features include improved support for fault tolerance, support for the ARMCI API, and non-collective group creation functionality. There are numerous bugfixes.
Release Notes: Enhancements include a Nemesis-based interface, process manager support, rail binding to processes in multi-rail configurations, message coalescing, large data transfers, dynamic process migration, fast process-level fault tolerance with checkpoint-restart, network-level fault tolerance with Automatic Path Migration, RDMA CM support, iWARP support, optimized collectives, multi-pathing, RDMA read- and write-based designs, polling- and blocking-based communication progress, multi-core optimized and scalable shared memory support, and LiMIC2-based kernel-level shared memory support.
Release Notes: This release adds the Shared-Memory-Nemesis interface, which provides native shared memory support on multi-core platforms where communication is required only within a node. It also adds support for 3D torus topology with appropriate service level (SL) settings, Quality of Service (QoS) support with multiple InfiniBand SLs, GPU acceleration support, and fast Checkpoint-Restart support with an aggregation scheme. Fault tolerance support was improved. Multiple InfiniBand adapters are now detected dynamically and used by default in multi-rail configurations. Multithreading was enhanced. There are other enhancements and bugfixes.