Hados stores files across a cluster of servers. Its goal is high availability, which it achieves by storing copies of the same file on several nodes. It provides RESTful APIs to store, check, or retrieve files. Through the cluster APIs, you can retrieve a file from whichever node hosts it. There is no master node: any request can be sent to any node of the cluster, which avoids a single point of failure.
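The masterless, replicated design described above can be illustrated with a small in-memory sketch. This is not Hados code and does not use its REST API: the `Node` and `Cluster` classes, the `replicas` parameter, and the placement rule (first live nodes) are all hypothetical, chosen only to show how a read can succeed through any live node while a replica holder is down.

```python
class Node:
    """Hypothetical storage node: holds a subset of the cluster's files."""

    def __init__(self, name):
        self.name = name
        self.files = {}    # path -> bytes
        self.alive = True


class Cluster:
    """Hypothetical masterless cluster: every node can accept every request."""

    def __init__(self, node_names, replicas=2):
        self.nodes = {name: Node(name) for name in node_names}
        self.replicas = replicas

    def node(self, name):
        return self.nodes[name]

    def put(self, path, data):
        """Store copies of the file on `replicas` live nodes; return their names."""
        live = [n for n in self.nodes.values() if n.alive]
        if len(live) < self.replicas:
            raise RuntimeError("not enough live nodes for the requested replica count")
        targets = live[: self.replicas]   # assumption: simplest possible placement
        for n in targets:
            n.files[path] = data
        return [n.name for n in targets]

    def get(self, path, via):
        """Read a file through node `via`, which need not host the file itself."""
        entry = self.nodes[via]
        if not entry.alive:
            raise ConnectionError(f"node {via} is down; retry through another node")
        # The entry node serves the file from whichever live node hosts a copy.
        for n in self.nodes.values():
            if n.alive and path in n.files:
                return n.files[path]
        raise FileNotFoundError(path)


if __name__ == "__main__":
    cluster = Cluster(["node-a", "node-b", "node-c"], replicas=2)
    holders = cluster.put("report.txt", b"hello")
    cluster.node(holders[0]).alive = False          # one replica holder fails
    survivor = next(n.name for n in cluster.nodes.values() if n.alive)
    print(cluster.get("report.txt", via=survivor))  # still readable
```

Because no node has a special role, a client that finds its entry node down can retry the same request through any other node.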
Apache UIMA DUCC (Distributed UIMA Cluster Computing) is a cluster management system that provides tooling, management, and scheduling facilities to automate the scale-out of applications written using the UIMA framework. Core UIMA provides a generalized framework for applications that process unstructured information, such as human language, but does not provide a scale-out mechanism. UIMA-AS extends UIMA with a scale-out mechanism for distributing UIMA pipelines over a cluster of computing resources, but does not provide job or cluster management of those resources. DUCC extends UIMA-AS by defining a formal job model that closely maps to a standard UIMA pipeline. Around this job model, DUCC provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.