Apache UIMA DUCC (Distributed UIMA Cluster Computing) is a cluster management system providing tooling, management, and scheduling facilities that automate the scale-out of applications written using the UIMA framework. Core UIMA provides a generalized framework for applications that process unstructured information such as human language, but does not provide a scale-out mechanism. UIMA-AS extends UIMA and provides a scale-out mechanism for distributing UIMA pipelines over a cluster of computing resources, but does not provide job or cluster management of the resources. DUCC extends UIMA-AS by defining a formal job model that closely maps to a standard UIMA pipeline. Around this job model DUCC provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.
Nuxis is an integrated solution for virtualization management. Its features include centralized management of nodes (physical machines) and virtual machines, management of virtual networks, storage management with LVM, ISO management, monitoring and statistics charts, backup and restore of appliance configurations, import from and export to other virtualization systems using the OVF format, access control, support for multiple operating systems (including Linux and Windows) on 32-bit and 64-bit architectures, paravirtualized hardware acceleration drivers, live migration, PXE boot, Web management, and more.
wsrep-enabled MySQL (previously MySQL/Galera cluster) can use wsrep replication providers, such as Galera, to form a cluster. The wsrep API is an abstract replication interface that supports global transaction IDs, true multi-master capability, conflict detection and resolution, and parallel applying. It is transparent to triggers, stored procedures, and functions because it replicates only final transaction results. Only the InnoDB storage engine and DDL commands are supported by this patch.
Makeflow is a workflow engine for executing large complex applications on clusters, clouds, and grids. It can be used to drive several different distributed computing systems, including Condor, SGE, and the included Work Queue system. It does not require a distributed filesystem, so you can use it to harness whatever collection of machines you have available. It is typically used for scaling up data-intensive scientific applications to hundreds or thousands of cores.
Hados stores files in a cluster of servers. Its goal is to provide high availability by storing copies of the same file on several nodes. It provides RESTful APIs to easily store, check, or retrieve files. Using the cluster APIs, you can retrieve files from whichever node hosts them. There is no master node: a request can be sent to any node of the cluster, which avoids a single point of failure.
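The masterless replication idea described for Hados can be illustrated with a small in-memory sketch. This is not Hados code and does not use its REST API; the `Node` and `Cluster` classes, the `copies` parameter, and the CRC-based placement rule are all hypothetical names and choices made for illustration only.

```python
import zlib


class Node:
    """Hypothetical storage node: an in-memory map of path -> bytes."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.up = True


class Cluster:
    """Hypothetical masterless cluster: every node is treated the same."""

    def __init__(self, nodes, copies=2):
        self.nodes = list(nodes)
        self.copies = copies

    def put(self, path, data):
        """Store data on up to `copies` live nodes; return their names."""
        live = [n for n in self.nodes if n.up]
        if not live:
            raise RuntimeError("no live nodes available")
        # Assumed placement rule: start at a hash of the path and take
        # consecutive live nodes, so no coordinator has to pick holders.
        start = zlib.crc32(path.encode("utf-8")) % len(live)
        count = min(self.copies, len(live))
        chosen = [live[(start + i) % len(live)] for i in range(count)]
        for node in chosen:
            node.files[path] = data
        return [node.name for node in chosen]

    def get(self, path):
        """Return the file from whichever live node holds a copy."""
        for node in self.nodes:
            if node.up and path in node.files:
                return node.files[path]
        raise KeyError(path)
```

In this sketch, a file written with `copies=2` stays readable after one of its two holders is marked down, because `get` simply asks every live node in turn.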
LavaFlow creates useful reports on the usage of high-performance computing clusters. It takes data from the batch scheduling system, monitoring, and other tooling, and creates reports which help administrators, managers, and end users better understand their cluster environment. The reports are modular, and new modules are easy to create using templates and Django's QuerySet API. LavaFlow uses human-readable RESTful URLs, making it easy to automate and share links to reports.
JASocket is a lock-free, scalable, and robust server framework with no single point of failure. Servers run on a cluster of nodes and interact with other servers using mobile agents, which reduces the number of messages and thus the overall system latency. Administration is handled via SSH.
JAConfig implements an eventually consistent distributed key/value database for managing a JASocket cluster. Also included are Quarum for tracking when a quorum of hosts is present, Ranker for determining which nodes are least loaded, ClusterManager for starting up other servers, and Kingmaker, which decides which node is to run ClusterManager. JAConfig is lock-free, actor-based, and has no single point of failure.
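The quorum tracking and load ranking roles mentioned above can be sketched in a few lines. This is not JAConfig code: the function names, the strict-majority definition of a quorum, and the rule that the least-loaded node is chosen to run the manager are all assumptions made for illustration.

```python
def has_quorum(present_hosts, all_hosts):
    """Assumed rule: a quorum is a strict majority of the configured hosts."""
    known = set(all_hosts)
    present = known & set(present_hosts)
    return len(present) > len(known) // 2


def rank_by_load(loads):
    """Return node names from least to most loaded; ties break by name."""
    return sorted(loads, key=lambda name: (loads[name], name))


def pick_manager_node(loads, all_hosts):
    """Hypothetical policy: pick the least-loaded present node, but only
    while a quorum is present. `loads` maps present node names to a load
    figure. Returns None when there is no quorum."""
    if not has_quorum(loads.keys(), all_hosts):
        return None
    return rank_by_load(loads)[0]
```

With three configured hosts, two present hosts form a quorum under this rule, while with four configured hosts two do not, which prevents both halves of an evenly split cluster from electing a manager at once.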