Proper nouns is a PHP class that can extract proper nouns from texts. It takes a text string and detects which words may be proper nouns naming people or other entities. It uses heuristics such as the capitalization of the first letter of a word and the presence of a person's title preceding the nouns. The class may treat consecutive proper names as a single proper name. It assumes English by default but may be configured to work with other languages.
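The heuristics described above can be sketched roughly as follows. This is a minimal Python illustration, not the PHP class itself: the function name `extract_proper_nouns`, the `TITLES` set, and the rule of skipping sentence-initial capitalized words are all assumptions made for the example.

```python
import re

# Hypothetical title list; the actual class's list is not given in the text.
TITLES = {"Mr", "Mrs", "Ms", "Dr", "Prof"}
SENTENCE_END = {".", "!", "?"}

def extract_proper_nouns(text):
    """Return candidate proper nouns, merging consecutive capitalized words."""
    tokens = re.findall(r"[A-Za-z]+|[.!?]", text)
    names, current = [], []
    sentence_start = True   # a capital letter here is not evidence of a name
    after_title = False     # previous word was a title such as "Dr"

    def flush():
        if current:
            names.append(" ".join(current))
            current.clear()

    for tok in tokens:
        if tok in SENTENCE_END:
            if after_title:
                continue    # abbreviation period, as in "Dr."
            flush()
            sentence_start = True
            continue
        if tok in TITLES:
            flush()
            after_title = True
            sentence_start = False
            continue
        if tok[0].isupper() and (after_title or not sentence_start):
            current.append(tok)   # extend the current run of names
        else:
            flush()
        after_title = False
        sentence_start = False
    flush()
    return names

print(extract_proper_nouns(
    "Yesterday Dr. Alice Smith met Bob in New York. The weather was fine."))
# prints ['Alice Smith', 'Bob', 'New York']
```

In this sketch the title itself is used only as a cue and is not included in the returned name; a real implementation might choose otherwise.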
libGMUVision is a framework for realtime computer vision applications that is designed to allow easier access to Firewire IIDC digital video cameras. It provides a set of C++ bindings for libdc1394 and Qt widgets for building graphical applications that process video streams in real time. A small sample application (Koriander Lite, based on the Coriander project) is included.
The Spider is a complete object-oriented environment for machine learning in Matlab. Besides making base learning algorithms easy to use, it lets algorithms be plugged together and compared using, for example, model selection, statistical tests, and visual plots. This gives all the power of objects (reusability, the ability to plug together, sharing of code), but also all the power of Matlab for machine learning research.
Bayesian Spam Filter is a class that can be used to detect spam in text messages using Bayesian techniques. It analyzes the text in terms of n-grams in a way that is language independent. It can be trained to progressively distinguish what is spam and what is not spam by detecting patterns in training samples. Training data is stored in a MySQL database.
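To make the idea concrete, here is a minimal Python sketch of a naive Bayes classifier over character n-grams. It is not the class's actual code: the names `NGramBayesFilter`, `train`, and `spam_probability` are hypothetical, character trigrams and add-one smoothing are assumed choices, and counts are kept in memory rather than in a MySQL database.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (no word list needed)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

class NGramBayesFilter:
    # Hypothetical class; in-memory counters stand in for the database.
    def __init__(self, n=3):
        self.n = n
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(char_ngrams(text, self.n))
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        total_docs = sum(self.docs.values())
        log_score = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            # Smoothed class prior, then add-one smoothed n-gram likelihoods.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for gram in char_ngrams(text, self.n):
                score += math.log(
                    (self.counts[label][gram] + 1) / (total + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into a posterior probability of spam.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))

f = NGramBayesFilter()
f.train("buy cheap pills now", "spam")
f.train("cheap pills online", "spam")
f.train("meeting at noon tomorrow", "ham")
f.train("lunch tomorrow at noon", "ham")
print(f.spam_probability("cheap pills") > 0.5)   # prints True
```

Because the features are raw character sequences, no tokenizer or dictionary for a particular language is required, which is one way to read the language-independence claim.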
RL-Glue is intended to provide a foundation for building benchmarks for reinforcement learning (RL) agents. A secondary goal is to support RL competitions in which agents are compared in their performance on new problems that are only revealed at the time of a competition. The current version of the interface software supports direct function-call communication between agents and environments written in C. The interface software also provides a standard mechanism for communication between agents and environments written in different programming languages.
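The kind of agent/environment separation such an interface standardizes can be illustrated with a short Python sketch. The class and method names below (`CorridorEnvironment`, `AlwaysRightAgent`, `run_episode`, `start`, `step`, `end`) are hypothetical and are not claimed to match the RL-Glue API; the point is only that the agent and the environment communicate exclusively through a small, fixed set of calls.

```python
class CorridorEnvironment:
    """Toy environment: walk along a corridor to reach the right-hand end."""
    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def start(self):
        self.position = 0
        return self.position                      # first observation

    def step(self, action):
        # action is -1 (left) or +1 (right); position is clamped to the corridor
        self.position = max(0, min(self.length - 1, self.position + action))
        terminal = self.position == self.length - 1
        reward = 1.0 if terminal else -0.1
        return reward, self.position, terminal

class AlwaysRightAgent:
    """Trivial agent used only to exercise the interface."""
    def start(self, observation):
        return +1
    def step(self, reward, observation):
        return +1
    def end(self, reward):
        pass

def run_episode(agent, env, max_steps=100):
    """Glue loop: the only place where agent and environment meet."""
    observation = env.start()
    action = agent.start(observation)
    total, steps = 0.0, 0
    while steps < max_steps:
        reward, observation, terminal = env.step(action)
        total += reward
        steps += 1
        if terminal:
            agent.end(reward)
            break
        action = agent.step(reward, observation)
    return total, steps

total, steps = run_episode(AlwaysRightAgent(), CorridorEnvironment())
print(steps)   # prints 4
```

Since neither class refers to the other, any agent can be benchmarked against any environment that honors the same calls, including an environment the agent's author has never seen.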
PennAspect is a software implementation of the two-way aspect model, which is a latent class statistical mixture model for performing soft clustering of co-occurrence data observations. It acts on data such as document/word pairs (words occurring in documents) or movie/people pairs (people seeing certain movies) to produce their joint distribution estimate. The software is distributed as Java source and class files.
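As a rough illustration of what such a model estimates, the sketch below fits P(x, y) = sum over z of P(z) P(x|z) P(y|z) to co-occurrence counts with a plain EM loop. It is written in Python rather than Java and is not PennAspect's code: the function name `fit_aspect_model`, the random initialization, and the fixed iteration count are assumptions made for the example.

```python
import random

def fit_aspect_model(counts, n_classes, iterations=50, seed=0):
    """counts: {(x, y): n}. Returns the joint estimate {(x, y): probability}."""
    rng = random.Random(seed)
    xs = sorted({x for x, _ in counts})
    ys = sorted({y for _, y in counts})
    zs = range(n_classes)

    def random_distribution(keys):
        weights = {k: rng.random() + 0.1 for k in keys}
        s = sum(weights.values())
        return {k: w / s for k, w in weights.items()}

    p_z = {z: 1.0 / n_classes for z in zs}
    p_x_z = {z: random_distribution(xs) for z in zs}
    p_y_z = {z: random_distribution(ys) for z in zs}

    for _ in range(iterations):
        new_z = {z: 0.0 for z in zs}
        new_x = {z: {x: 0.0 for x in xs} for z in zs}
        new_y = {z: {y: 0.0 for y in ys} for z in zs}
        for (x, y), n in counts.items():
            # E-step: soft assignment of this pair to each latent class.
            joint = [p_z[z] * p_x_z[z][x] * p_y_z[z][y] for z in zs]
            norm = sum(joint)
            for z in zs:
                r = n * joint[z] / norm
                new_z[z] += r
                new_x[z][x] += r
                new_y[z][y] += r
        # M-step: re-estimate parameters from the expected counts.
        total = sum(new_z.values())
        for z in zs:
            if new_z[z] > 0:
                p_x_z[z] = {x: v / new_z[z] for x, v in new_x[z].items()}
                p_y_z[z] = {y: v / new_z[z] for y, v in new_y[z].items()}
            p_z[z] = new_z[z] / total

    return {(x, y): sum(p_z[z] * p_x_z[z][x] * p_y_z[z][y] for z in zs)
            for x in xs for y in ys}

counts = {("d1", "ball"): 3, ("d1", "goal"): 2, ("d2", "ball"): 1,
          ("d2", "goal"): 4, ("d3", "vote"): 3, ("d3", "law"): 2,
          ("d4", "vote"): 2, ("d4", "law"): 3}
joint = fit_aspect_model(counts, n_classes=2)
print(round(sum(joint.values()), 6))   # prints 1.0
```

The clustering is "soft" in the sense that each observed pair contributes fractionally to every latent class in the E-step instead of being assigned to exactly one.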
Amiba is a Gene Expression Programming (GEP) framework for Java. GEP is, like genetic algorithms, a branch of evolutionary computing. The framework separates the process of evolution from the process of interpretation of the chromosome, allowing the use of various schemes. For example, graphs may be used as terminals and graph operations as operators in the chromosome instead of the usual double precision numbers. It implements mutation, transposition, and recombination. Options and rates are easily configured through an XML file. A mechanism to load fitness cases in bulk is also provided.
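The separation between evolving a chromosome and interpreting it can be sketched as follows. This is a Python illustration under stated assumptions, not Amiba's Java API: the names `decode`, `evaluate`, and `mutate` are hypothetical, the chromosome is a plain string with a head and a terminal-only tail, and ordinary numbers are used as terminals for brevity.

```python
import operator
import random
from collections import deque

FUNCTIONS = "+-*"            # every function here takes two arguments
TERMINALS = "abc"
ARITY = {f: 2 for f in FUNCTIONS}
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def decode(chromosome):
    """Interpretation step: read symbols breadth-first into a tree."""
    root = [chromosome[0], []]
    queue = deque([root])
    pos = 1
    while queue:
        node = queue.popleft()
        for _ in range(ARITY.get(node[0], 0)):
            child = [chromosome[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root               # symbols past the last one used are non-coding

def evaluate(node, env):
    symbol, children = node
    if symbol in OPS:
        return OPS[symbol](*(evaluate(c, env) for c in children))
    return env[symbol]

def mutate(chromosome, head_len, rate, rng):
    """Evolution step: point mutation that keeps the tail terminal-only."""
    out = []
    for i, symbol in enumerate(chromosome):
        if rng.random() < rate:
            pool = FUNCTIONS + TERMINALS if i < head_len else TERMINALS
            symbol = rng.choice(pool)
        out.append(symbol)
    return "".join(out)

env = {"a": 2, "b": 3, "c": 4}
chromosome = "+*cabab"        # head of 3 symbols, tail of 4 terminals
print(evaluate(decode(chromosome), env))   # prints 10, i.e. (a * b) + c
mutant = mutate(chromosome, head_len=3, rate=0.5, rng=random.Random(1))
print(len(mutant) == len(chromosome))      # prints True
```

Because `mutate` never looks inside `decode` or `evaluate`, the interpreter could be swapped for one whose terminals are graphs and whose operators are graph operations without touching the evolutionary operators. Transposition and recombination are omitted from this sketch.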