Erudite is an application for training and testing backpropagation neural networks using the ANNeML (Artificial Neural Network Markup Language) XML format. It trains and tests neural nets from CSV files and supports randomized training sets, an optional adaptive learning rate, sigmoid or hyperbolic tangent transfer functions, optional bias and weight adjustment locking, and more.
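To illustrate what the choice between sigmoid and hyperbolic tangent transfer functions means for a single neuron, here is a minimal sketch. It is not Erudite's code and does not use the ANNeML format; every function name is hypothetical and chosen for illustration only.

```python
import math


def sigmoid(x):
    """Logistic transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(y):
    """Derivative of the sigmoid, written in terms of its output y."""
    return y * (1.0 - y)


def tanh_derivative(y):
    """Derivative of tanh, written in terms of its output y."""
    return 1.0 - y * y


def neuron_output(inputs, weights, bias=0.0, transfer=sigmoid):
    """Weighted sum of the inputs plus an optional bias, passed through
    the chosen transfer function (hypothetical helper)."""
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    return transfer(net)


# The same neuron evaluated with either transfer function.
out_sigmoid = neuron_output([1.0, 2.0], [0.5, -0.25])
out_tanh = neuron_output([1.0, 2.0], [0.5, -0.25], transfer=math.tanh)
```

During backpropagation, the derivative functions scale the error signal at each neuron, which is why both transfer functions are paired here with a derivative expressed in terms of the neuron's output.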
Weed-FS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files, and to serve the files fast! Instead of supporting full POSIX file system semantics, it implements only a key-file mapping. Instead of managing all file metadata in a central master, it manages file volumes in the central master and lets volume servers manage files and the metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers' memories, allowing faster file access with just one disk read operation. It is modelled on Facebook's Haystack design paper. Only 40 bytes of disk storage are required for each file's metadata, and disk reads are O(1).
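The key-file mapping idea can be sketched as an append-only volume file with an in-memory index from key to offset and size. This is a toy illustration under stated assumptions, not Weed-FS's implementation or on-disk format: the `Volume` class, its methods, and the file name are hypothetical, and real concerns such as checksums, deletion, and index recovery are omitted.

```python
import os
import tempfile


class Volume:
    """Toy append-only volume: one large file plus an in-memory key index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset, size), held entirely in memory
        open(path, "ab").close()  # create the volume file if it is missing

    def put(self, key, data):
        # Append the blob to the end of the volume and record where it landed.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key):
        # The index lookup happens in memory; only this single seek and
        # read touches the disk.
        offset, size = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)


# Usage: store two blobs in one volume file and read them back by key.
volume = Volume(os.path.join(tempfile.mkdtemp(), "volume.dat"))
volume.put(1, b"hello")
volume.put(2, b"world!")
first = volume.get(1)
second = volume.get(2)
```

Because the location of every blob is already known from the in-memory index, a read does not need to consult any on-disk directory structure first, which is the property the description refers to as a single disk read per file.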
Overthere is a Java library to manipulate files and execute processes on remote hosts, i.e. do stuff "over there". It was built for and is used in the XebiaLabs deployment automation product Deployit as a way to perform tasks on remote hosts, e.g. copy configuration files, install EAR files, or restart Web servers. Another way of looking at it is to say that Overthere gives you java.io.File and java.lang.Process as they should have been: as interfaces, created by a factory and extensible through an SPI mechanism.
RunDeck is a command automation hub that helps you automate ad-hoc and routine procedures in data center or cloud environments. It allows you to run tasks on any number of nodes from a Web-based or command-line interface. It also includes other features that make it easy to scale up your scripting efforts, including access control, workflow building, scheduling, logging, and integration with external sources for node and context data.
Apache OpenNLP is a machine learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.
District Builder is a software application designed to give the public transparent, accessible, and easy-to-use mapping tools to draw the boundaries of their communities or to generate redistricting plans for their state and localities. The drawing of electoral districts is among the least transparent processes in democratic governance. All too often, redistricting authorities maintain power by obstructing public participation. The resulting districts embody the goals of politicians to the detriment of the representational interests of communities and the public at large. With District Builder, the public has the capacity to create and submit district plans for municipal, county, and state governments, support redistricting competitions, and keep the entire process open. In addition to legislative redistricting, District Builder's flexibility accommodates school districts, police districts, and many other redistricting needs.