Makeflow is a workflow engine for executing large complex applications on clusters, clouds, and grids. It can be used to drive several different distributed computing systems, including Condor, SGE, and the included Work Queue system. It does not require a distributed filesystem, so you can use it to harness whatever collection of machines you have available. It is typically used for scaling up data-intensive scientific applications to hundreds or thousands of cores.
Parrot and Chirp are user-level tools that make it easy to rapidly deploy wide area filesystems. Parrot is the client component: it transparently attaches to unmodified applications, and redirects their system calls to various remote servers. A variety of controls can be applied to modify the namespace and resources available to the application. Chirp is the server component: it allows an ordinary user to easily export and share storage across the wide area with a single command. A rich access control system allows users to mix and match multiple authentication types. Parrot and Chirp are most useful in the context of large scale distributed systems such as clusters, clouds, and grids where one may have limited permissions to install software.
Hados stores files in a cluster of servers. Its goal is to handle high availability by storing copies of the same file on several nodes. It provides RESTFUL APIs to easily store, check, or retrieve files. Using the cluster APIs, you can retrieve files from whichever node hosts them. To avoid any single point of failure, it is possible to apply a request to any node of the cluster; there is no master node.
The Shared Scientific Toolbox is a library that facilitates development of efficient, modular, and robust scientific/distributed computing applications in Java. It features multidimensional arrays with extensive linear algebra and FFT support, an asynchronous, scalable networking layer, and advanced class loading, message passing, and statistics packages.
Wisecracker is a high performance distributed cryptanalysis framework that leverages GPUs and multiple CPUs. It allows security researchers to write their own cryptanalysis tools that can distribute brute-force cryptanalysis work across multiple systems with multiple multi-core processors and GPUs. Security researchers can also use the sample tools provided out-of-the-box. The differentiating aspect of Wisecracker is that it uses OpenCL and MPI together to distribute the work across multiple systems, each having multiple CPUs and/or GPUs.
dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP), or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited for the data parallel (SIMD) paradigm where a computation is evaluated with different (large) datasets independently (similar to Hadoop, MapReduce, Parallel Python). dispy features include automatic distribution of dependencies (files, Python functions, classes, modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, sharing of computation resources if desired, and more.
POP-C++ is a comprehensive object-oriented system for developing applications in large distributed computing infrastructures such as Grid, P2P or Clouds. It consists of a programming suite (language, compiler) and a run-time system for running POP-C++ applications. The POP-C++ language is a minimal extension of C++ that implements the parallel object model with the integration of resource requirements into distributed objects. This extension is as close as possible to standard C++ so that programmers can easily learn POP-C++ and so that existing C++ libraries can be parallelized using POP-C++ without too much effort. The POP-C++ run-time is an object-oriented open design that aims at integrating different distributed computing tool kits into an infrastructure for executing requirement-driven object-oriented applications. It uses objects to serve objects: the system provides services for executing remote objects.
Dapper, or "Distributed and Parallel Program Execution Runtime", is a tool for taming the complexities of developing for large-scale cloud and grid computing, enabling the user to create distributed computations from the essentials: the code that will execute, along with a dataflow graph description. It supports rich execution semantics, carefree deployment, a robust control protocol, modification of the dataflow graph at runtime, and an intuitive user interface.