Heartbeat is a full-function high-availability system for Linux and other POSIX-like OSes. It monitors services and restarts them on errors. When managing a cluster (more than one machine), it also monitors the members of the cluster and begins recovery of lost services in less than a second. It runs over serial ports and UDP broadcast/multicast, as well as OpenAIS multicast, and is easily adapted to different interconnect media and protocols. When used in a cluster, it can operate using shared disks, data replication, or no data sharing. Versions starting with 2.0 are comparable to any commercial HA package, providing resource monitoring, larger clusters, and detailed dependency information.
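The member monitoring described above can be illustrated with a timeout-based failure detector. This is a minimal sketch, not Heartbeat's actual implementation: the class name `FailureDetector`, its methods, and the `deadtime` parameter as used here are hypothetical names chosen for illustration, and timestamps are passed in explicitly so the logic is easy to follow.

```python
from typing import Dict, List


class FailureDetector:
    """Hypothetical sketch: a node is considered dead when no heartbeat
    has been received from it for longer than `deadtime` seconds."""

    def __init__(self, deadtime: float) -> None:
        self.deadtime = deadtime
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time of the most recent heartbeat from this node.
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> List[str]:
        # Nodes whose last heartbeat is older than the allowed silence.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.deadtime
        )
```

A recovery step would then start the lost services on a surviving node for every entry returned by `dead_nodes`.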
mod_backhand is a load-balancing module for Apache. It provides per-request HTTP redirection within a heterogeneous Apache server cluster. Each request is run through a set of "candidacy functions" to determine which server is best suited to respond, and is then proxied to that server. Facilities are in place to allow you to write your own dynamically loadable decision-making algorithms. Everything about the request and the current availability of resources can be used in the decision-making process.
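The idea of running a request through a chain of candidacy functions can be sketched as follows. This is an illustration only, written in Python rather than as an Apache module: the `Server` fields and the functions `only_alive`, `lowest_load_first`, and `choose` are hypothetical names, not mod_backhand's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Server:
    name: str
    load: float   # hypothetical load figure reported by the server
    alive: bool


# A candidacy function takes the current candidate list and returns a
# filtered and/or reordered list.
Candidacy = Callable[[List[Server]], List[Server]]


def only_alive(servers: List[Server]) -> List[Server]:
    return [s for s in servers if s.alive]


def lowest_load_first(servers: List[Server]) -> List[Server]:
    return sorted(servers, key=lambda s: s.load)


def choose(servers: List[Server], functions: List[Candidacy]) -> Optional[Server]:
    """Apply each candidacy function in order; the first remaining
    candidate is the server the request would be proxied to."""
    candidates = list(servers)
    for fn in functions:
        candidates = fn(candidates)
    return candidates[0] if candidates else None
```

Because each function only narrows or reorders the list, custom decision logic can be added by appending another function to the chain.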
Wackamole is a tool that helps make a cluster highly available. It manages a set of virtual IP addresses that should be available to the outside world at all times, and ensures that exactly one machine within the cluster is listening on each virtual IP address it manages. If it discovers that particular machines within the cluster are not alive, it will almost immediately ensure that other machines acquire the virtual IP addresses the down machines were managing. At no time will more than one connected machine be responsible for any virtual IP.
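The invariant described above (each virtual IP is held by exactly one live machine) can be sketched with a small reassignment function. This is an assumption-laden illustration, not Wackamole's algorithm: the function name `reassign`, the dictionary layout, and the "fewest addresses first" policy are all hypothetical.

```python
from typing import Dict, Set


def reassign(owners: Dict[str, str], alive: Set[str]) -> Dict[str, str]:
    """Return a new VIP -> host mapping in which every VIP is owned by
    exactly one live host. VIPs held by dead hosts move to the live host
    that currently holds the fewest VIPs (hypothetical policy)."""
    if not alive:
        return {}  # nobody left to hold any address

    # Keep assignments whose owner is still alive.
    result = {vip: host for vip, host in owners.items() if host in alive}

    counts = {host: 0 for host in alive}
    for host in result.values():
        counts[host] += 1

    # Hand out orphaned VIPs in a deterministic order.
    for vip in sorted(owners):
        if vip not in result:
            target = min(sorted(alive), key=lambda h: counts[h])
            result[vip] = target
            counts[target] += 1
    return result
```

Since the result is a plain mapping from address to host, no address can end up with two owners, which mirrors the "at most one machine per virtual IP" guarantee.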
changedfiles is a framework for filesystem replication, security monitoring, and/or automatic file transformations: essentially any application where you would poll files or directories and either do something to them or send them somewhere else (or both). The difference is that the kernel tells you when they change instead of you having to poll. It is an easy real-time FTP push mirror to one or multiple sites. It is also a full-fledged MySQL client, so you can do real-time database operations (for example, batch imports). It consists of two parts: a kernel module (works with Linux kernel version 2.4) which reports to a device whenever a file on the filesystem changes, and a daemon which runs in user space and can be configured to perform almost any action when a change to a file matching one of the patterns it looks for is reported. The kernel module is SMP safe and has been tested on Intel, PowerPC, and Alpha.
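The daemon's role, matching a reported path against configured patterns and running the associated action, can be sketched as a small dispatcher. This is illustrative only: the `dispatch` function, the rule format, and the use of shell-style glob patterns are assumptions, not the changedfiles configuration syntax.

```python
import fnmatch
from typing import Callable, List, Tuple

# A rule pairs a shell-style pattern with an action to run on a match.
Rule = Tuple[str, Callable[[str], str]]


def dispatch(path: str, rules: List[Rule]) -> List[str]:
    """Run every action whose pattern matches the changed path and
    return the actions' results in rule order."""
    results = []
    for pattern, action in rules:
        # fnmatchcase is case-sensitive on every platform.
        if fnmatch.fnmatchcase(path, pattern):
            results.append(action(path))
    return results
```

In a real deployment the actions would push the file to a mirror or write to a database; here they just return a string so the matching logic is easy to check.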
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. It is based on a hierarchical design targeted at federations of clusters. Ganglia is currently in use on over 500 clusters around the world and has scaled to handle clusters with 2000 nodes.
openMosix is a set of extensions to the standard Linux kernel allowing you to build a cluster out of off-the-shelf PC hardware. openMosix scales perfectly up to thousands of nodes. You do not need to modify your applications to benefit from your cluster (unlike PVM, MPI, Linda, etc.). Processes in openMosix migrate transparently between nodes and the cluster will always auto-balance.
AutoNOC is a high-performance, production-integrated, peer-to-peer network operations management platform for Windows and Linux. It provides real-time historical analysis, root cause, fault detection, reporting, alerts and alarms, and no-nonsense correlation. It is an interoperable, vendor-independent solution with built-in support for Microsoft, Cisco, Linux, IBM, and other major technologies. Additionally, it offers many novel capabilities, including end-user personalization, easy scalability, compressed historical databases, infinite histories, event archiving (it works as a syslog server), and multi-language support.