ClodHopper is a Java library for high-performance clustering of numerical data. It contains clustering implementations such as K-Means, K-Means++, X-Means, G-Means, Fuzzy C-Means, Jarvis-Patrick, and various forms of hierarchical clustering. The implementations use the host system's concurrent processing ability to speed up clustering, and the data structures are kept lean to conserve memory. ClodHopper is also extensible: if you are developing a new clustering algorithm, extending a ClodHopper base class may save you a considerable amount of work.
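For readers unfamiliar with the K-Means family mentioned above, the following is a minimal sketch of plain K-Means (Lloyd's algorithm) in Python. It does not use ClodHopper's API, which is Java and is not shown in this description; the function names `nearest` and `kmeans` are hypothetical and chosen only for illustration.

```python
import math

def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm: assign points, recompute centers, stop when stable."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: group every point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]

centers, labels = kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])
print(centers, labels)
```

The assignment step treats each point independently, which is the part of the algorithm that a concurrent implementation can typically spread across processor cores.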
DEMUX Framework enables Java developers to build modular, cross-platform applications that run on desktop, Web, mobile, and embedded devices. It is based on OSGi and supports creating JavaFX desktop applications, mobile apps (Android, iOS, Windows), and Web applications.
Universal File Mover (UFM) manages the transfer of files. The user combines a series of Action commands into a UFM Workflow XML file, which defines the actions to be taken, their order, and how errors are to be handled. UFM then processes the Action commands as specified in that file. UFM currently contains 41 Action commands, which fall into five categories: WebSphere MQ Actions, Network Actions, File Actions, Control Actions, and Other Actions. It can transfer files in one of five ways: WebSphere MQ, FTP, SFTP, SCP, or HTTP.
uma::bson is a DOM-style C++ API for reading and writing BSON data. Unlike the MongoDB C++ API, which exposes a read-only interface plus a separate interface for creating a BSON representation, this API allows both reading and writing of existing data. It is designed primarily for serialising and deserialising BSON data to and from streams (files, socket connections, etc.).
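To show what serialising BSON to a stream involves at the byte level, here is a minimal Python sketch that encodes a flat document following the public BSON specification. It is not the uma::bson API (which is C++ and is not shown in this description); `encode_document` is a hypothetical helper that handles only string and 32-bit integer values.

```python
import io
import struct

def encode_document(fields):
    """Encode a dict of str -> (str | int32) as a BSON document (small subset)."""
    body = b""
    for name, value in fields.items():
        key = name.encode("utf-8") + b"\x00"  # element names are C strings
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # Type 0x02: int32 byte length (including the trailing NUL), then bytes.
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # Type 0x10: little-endian signed 32-bit integer.
            body += b"\x10" + key + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported type: {type(value).__name__}")
    # Document: int32 total size (counting itself and the final NUL), elements, NUL.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"

# Write the encoded document to a stream; a file or socket would work the same way.
stream = io.BytesIO()
stream.write(encode_document({"hello": "world"}))
print(stream.getvalue())
```

The leading length prefix is what lets a reader pull one complete document off a stream before parsing it.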
Apache OpenNLP is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.
TIXI is a fast and simple XML interface library for applications written in C, C++, Fortran, Java, and Python. Although simplified and somewhat restricted compared to a fully-fledged XML processing library, it can create documents, create and delete nodes, and add and remove element attributes. Routines for reading and writing text nodes and nodes holding integer and floating point numbers are included, along with routines that process aggregates of these simple types for the processing of geometric data, multidimensional arrays, or arrays of vectors.
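The operations listed above (creating a document, creating and deleting nodes, adding and removing attributes, and reading numbers or vectors from text nodes) can be illustrated with Python's standard `xml.etree.ElementTree` module. This is only a sketch of the kind of operations described, not TIXI's own interface; the element names and the `;` separator used for the vector are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Create a document with a root element.
root = ET.Element("plane")

# Create nodes holding a floating point number and a vector stored as text.
span = ET.SubElement(root, "wingspan")
span.text = repr(34.1)
point = ET.SubElement(root, "point")
point.text = ";".join(str(v) for v in (1.0, 2.5, -3.0))

# Add two attributes, then remove one of them.
span.set("unit", "m")
span.set("draft", "true")
del span.attrib["draft"]

# Read the vector back as floats, then delete its node.
vector = [float(v) for v in point.text.split(";")]
root.remove(point)

# Read the scalar back as a float and serialise the remaining document.
wingspan = float(root.find("wingspan").text)
xml_text = ET.tostring(root, encoding="unicode")
print(wingspan, vector, xml_text)
```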
The Exquisite `df' (xdf) is a souped-up version of df(1) rewritten from scratch and focused on flexibility of field selection and output format. It offers HTML and CSV outputs, besides the traditional text-based console output. It is fit for system administrators who are tired of post-processing df(1) output through shell or Perl scripts in order to avoid broken lines or to get a simple total/summary line.
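The idea of selectable fields, CSV output, and a summary line can be sketched in a few lines of Python. This is not xdf's implementation or command-line interface; `usage_rows` and `to_csv` are hypothetical helpers, and the four field names are assumptions made for the example.

```python
import csv
import io
import shutil

NAMES = ("path", "total", "used", "free")

def usage_rows(paths):
    """Collect (path, total, used, free) in bytes for each path."""
    rows = []
    for path in paths:
        u = shutil.disk_usage(path)
        rows.append((path, u.total, u.used, u.free))
    return rows

def to_csv(rows, fields=NAMES):
    """Render only the selected fields as CSV, ending with a summary line."""
    index = [NAMES.index(f) for f in fields]
    # Summary row: a label in the path column, sums in the numeric columns.
    summary = ["total" if i == 0 else sum(r[i] for r in rows) for i in range(len(NAMES))]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    for row in list(rows) + [summary]:
        writer.writerow([row[i] for i in index])
    return out.getvalue()

print(to_csv(usage_rows(["."]), fields=("path", "used", "free")))
```

Because each record is written as one CSV row, long device or mount names cannot wrap onto a second line the way they can in fixed-width console output. Note that summing over paths that share a file system would count it more than once.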