slkvm provides system tools for working with clustering and virtualization. It aims to depend on as few external tools as possible while supporting as many virtualization technologies as possible. It works in a cluster environment where heartbeat takes over running the virtual machines of nodes that have failed. It builds an "unheaded" cluster to avoid a single point of failure, and it can build a two-node cluster with everything redundant. It avoids compiling a new kernel or newer versions of applications, so you can still benefit from Debian security updates.
wsrep-enabled MySQL (previously MySQL/Galera cluster) can use wsrep replication providers, such as Galera, to form a cluster. The wsrep API is an abstract replication interface that supports global transaction IDs, true multi-master capability, conflict detection and resolution, and parallel applying. It is transparent to triggers, stored procedures, and functions because it replicates only final transaction results. Only the InnoDB storage engine and DDL commands are supported by this patch.
OpenSVC is a 'service' manager, as in clustered service manager. Services are described as collections of resources (IP, disk groups, filesystems, file synchronizations, and application launchers). Services can be started, stopped and queried for status, providing a consistent command set for wildly different service integrations. Services can be administered using a stand-alone free software stack deployed on the nodes (nodeware). Service configurations, status, and logs are pushed to a central database coupled to a Web front-end (collector).
Lingua Server is a fast multi-protocol application service server in Java. Its main features are very fast response times, a pluggable protocol and transport architecture, very easy application service development (MVC), built-in cluster support using multicast, auto-discovery, and distributed shared objects.
Makeflow is a workflow engine for executing large complex applications on clusters, clouds, and grids. It can be used to drive several different distributed computing systems, including Condor, SGE, and the included Work Queue system. It does not require a distributed filesystem, so you can use it to harness whatever collection of machines you have available. It is typically used for scaling up data-intensive scientific applications to hundreds or thousands of cores.
Gossimon is a gossip-based distributed monitoring system for a cluster of Linux nodes. Each node in the cluster periodically sends information about itself and others to a randomly selected node. This way, each node constantly receives information about cluster nodes. This information is maintained locally (and constantly updated) by each node and can be used by various clients for monitoring and resource allocation. The gossip protocols used by Gossimon are very robust to node failure, and the information quality is hardly degraded even when large parts of the cluster are taken down. The package contains: infod (the daemon responsible for collecting information and sending it to other nodes); mmon (a curses-based monitoring client displaying information about cluster nodes); and infod-client (a command-line client that retrieves cluster information in XML format).
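The gossip exchange described above can be illustrated with a small Python simulation. This is a minimal sketch of the general push-gossip idea, not Gossimon's actual protocol or wire format: the function names `merge` and `simulate`, the (timestamp, load) record layout, and the in-memory "cluster" are all assumptions made for illustration.

```python
import random


def merge(local, incoming):
    """Merge a peer's view into ours, keeping the freshest entry per node."""
    for node_id, (stamp, load) in incoming.items():
        if node_id not in local or stamp > local[node_id][0]:
            local[node_id] = (stamp, load)
    return local


def simulate(num_nodes=8, rounds=200, seed=0):
    """Run push gossip in memory and return each node's final view.

    Each view maps node id -> (timestamp, load). The load value is a
    made-up stand-in for whatever a real daemon would measure.
    """
    rng = random.Random(seed)
    views = [{} for _ in range(num_nodes)]
    for tick in range(1, rounds + 1):
        for me in range(num_nodes):
            # Refresh our own record, then push everything we know
            # (about ourselves and others) to one randomly chosen peer.
            views[me][me] = (tick, rng.random())
            peer = rng.choice([n for n in range(num_nodes) if n != me])
            merge(views[peer], views[me])
    return views


if __name__ == "__main__":
    for node_id, view in enumerate(simulate()):
        print(f"node {node_id} knows about nodes {sorted(view)}")
```

Because every node forwards what it has heard from others, information keeps spreading even if some nodes stop participating, which is the property the description attributes to gossip protocols.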
Libfairydust is a small wrapper library intended for use with GPU clusters that 'hijacks' CUDA and OpenCL calls. It can be used to 're-route' calls to a certain GPU, so a process requesting GPU#0 might end up running on GPU#4 without knowing (or caring) about it. This works completely transparently and does not need any sort of 'cooperation' from the application, changes to code, or relinking.
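The re-routing idea can be sketched in a few lines of Python. This is only a conceptual illustration under stated assumptions: it does not show how Libfairydust intercepts CUDA or OpenCL calls, and `DeviceRouter`, `make_routed`, and the stand-in `set_device` callable are hypothetical names, not part of the library.

```python
class DeviceRouter:
    """Maps the device index a process asks for to the one it really gets."""

    def __init__(self, remap):
        # remap: dict of requested index -> physical index (hypothetical policy).
        self._remap = dict(remap)

    def resolve(self, requested):
        # Indices without an explicit mapping pass through unchanged.
        return self._remap.get(requested, requested)


def make_routed(set_device, router):
    """Wrap a device-selection function so callers are re-routed transparently."""

    def routed_set_device(requested):
        return set_device(router.resolve(requested))

    return routed_set_device


if __name__ == "__main__":
    selected = []  # records which physical devices were actually selected
    # selected.append stands in for a real "select GPU" call.
    set_device = make_routed(selected.append, DeviceRouter({0: 4}))
    set_device(0)  # the application asks for GPU#0 ...
    print(selected)  # ... and the wrapped call selected GPU#4 instead
```

The caller keeps using the same function with the same argument, which mirrors the claim that the application needs no changes to its code.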