chunkd is a simple storage service based on the GET, PUT, and DELETE operations. It is meant to be used in building larger, replicated, distributed storage systems. Clients connect via TCP and remotely manage storage. Typical applications include distributed filesystems and other distributed storage applications.
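The three operations can be illustrated with a minimal in-memory sketch. It does not reproduce chunkd's TCP protocol or its API; the `ChunkStore` class and its method names are hypothetical and only show the GET/PUT/DELETE semantics a client would rely on.

```python
class ChunkStore:
    """Toy in-memory key/value store exposing GET, PUT, and DELETE (illustrative only)."""

    def __init__(self):
        self._chunks = {}

    def put(self, key: str, data: bytes) -> None:
        # Store or overwrite the chunk under the given key.
        self._chunks[key] = bytes(data)

    def get(self, key: str):
        # Return the stored bytes, or None if the key is unknown.
        return self._chunks.get(key)

    def delete(self, key: str) -> bool:
        # Remove the chunk and report whether anything was removed.
        return self._chunks.pop(key, None) is not None


store = ChunkStore()
store.put("block-1", b"hello")
fetched = store.get("block-1")
removed = store.delete("block-1")
missing = store.get("block-1")
```

A replicated system built on such a service would typically issue the same PUT to several storage nodes and GET from whichever one responds.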
StarCluster is a utility for creating traditional computing clusters used in research labs or for general distributed computing applications on Amazon's Elastic Compute Cloud (EC2). It uses a simple configuration file provided by the user to request cloud resources from Amazon and to automatically configure them with a queuing system, an NFS shared /home directory, passwordless SSH, OpenMPI, and ~140 GB of scratch disk space. It consists of a Python library and a simple command line interface to the library. For end-users, the command line interface provides simple, intuitive options for getting started with distributed computing on EC2 (e.g. starting/stopping clusters, managing AMIs, etc.). For developers, the library wraps the EC2 API to provide a simplified interface for launching/terminating nodes, executing commands on the nodes, copying files to/from the nodes, etc.
Livespaces is an operating system for building advanced meeting spaces. It provides a distributed software infrastructure built on the Elvin messaging service (the Livespace Bus) for coordinating software and devices across any number of computers in a meeting space, and user-facing applications for controlling a smart meeting room and collaborating with other participants. It also supports federation with remote Livespaces to facilitate collaboration between distributed teams.
The Ex-Crawler Project is divided into three subprojects. The main part is the Ex-Crawler daemon server, a highly configurable and flexible Web crawler written in Java. It comes with its own socket server, with which you can manage the server, users, distributed grid/volunteer computing, and much more. Crawled information is stored in a database (currently MySQL, PostgreSQL, and MSSQL are supported). The second part is a graphical (Java Swing) distributed grid/volunteer computing client, including user computer state detection, based on the JADIF Project. The third part is the Web search engine, which is written in PHP. It comes with a Content Management System, user language detection and multi-language support, and templates using Smarty, including an application framework that is partly forked from Joomla 1.5, so that Joomla components can be adapted quickly.
Makeflow is a workflow engine for executing large, complex applications on clusters, clouds, and grids. It can be used to drive several different distributed computing systems, including Condor, SGE, and the included Work Queue system. It does not require a distributed filesystem, so you can use it to harness whatever collection of machines you have available. It is typically used for scaling up data-intensive scientific applications to hundreds or thousands of cores.
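The general idea behind a workflow engine, running each task only after the tasks it depends on have finished, can be sketched as follows. This is not Makeflow's syntax or implementation; the `rules` table and the `run_workflow` function are hypothetical, and everything runs locally in one process rather than on a cluster.

```python
from graphlib import TopologicalSorter

# Hypothetical rules: each output name maps to (input names, function producing it).
rules = {
    "raw": ([], lambda: [3, 1, 2]),
    "sorted": (["raw"], lambda raw: sorted(raw)),
    "total": (["raw"], lambda raw: sum(raw)),
    "report": (["sorted", "total"], lambda s, t: f"{s} sums to {t}"),
}


def run_workflow(rules):
    # Build a graph that maps each target to the set of targets it depends on.
    graph = {target: set(inputs) for target, (inputs, _) in rules.items()}
    results = {}
    # static_order() yields every target after all of its dependencies.
    for target in TopologicalSorter(graph).static_order():
        inputs, action = rules[target]
        results[target] = action(*(results[name] for name in inputs))
    return results


results = run_workflow(rules)
```

In this sketch "sorted" and "total" depend only on "raw", so a real engine could dispatch them to different machines at the same time.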
POP-C++ is a comprehensive object-oriented system for developing applications in large distributed computing infrastructures such as Grid, P2P, or Clouds. It consists of a programming suite (language, compiler) and a run-time system for running POP-C++ applications. The POP-C++ language is a minimal extension of C++ that implements the parallel object model with the integration of resource requirements into distributed objects. This extension is as close as possible to standard C++ so that programmers can easily learn POP-C++ and so that existing C++ libraries can be parallelized using POP-C++ without too much effort. The POP-C++ run-time is an object-oriented open design that aims at integrating different distributed computing toolkits into an infrastructure for executing requirement-driven object-oriented applications. It uses objects to serve objects: the system provides services for executing remote objects.
dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP), or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited for the data parallel (SIMD) paradigm where a computation is evaluated with different (large) datasets independently (similar to Hadoop, MapReduce, Parallel Python). dispy features include automatic distribution of dependencies (files, Python functions, classes, modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, sharing of computation resources if desired, and more.
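The data-parallel pattern described here, one computation evaluated independently on several datasets, can be sketched with the Python standard library. This example deliberately uses `concurrent.futures` as a local stand-in and does not use dispy's API; the `compute` function and the sample `datasets` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(dataset):
    # The same computation is applied to each dataset independently.
    return sum(x * x for x in dataset)


# Hypothetical inputs; each one could be handed to a different worker.
datasets = [[1, 2, 3], [4, 5], [6]]

with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as the inputs.
    results = list(pool.map(compute, datasets))
```

Because no dataset depends on another, the work can be split across processors or machines without coordination between the individual jobs.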