scikits.learn is a Python module that integrates classic machine learning algorithms in the tightly-knit world of scientific Python packages. It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
Treba is a command-line tool for training, decoding, and calculating with weighted (probabilistic) finite state automata (WFSA/PFSA). Training algorithms include Baum-Welch (EM), Viterbi training, and Baum-Welch augmented with deterministic annealing. Treba is optimized for speed and numerical stability, and its training algorithms can run multi-threaded on hardware with multiple cores/CPUs. Forward, backward, and Viterbi decoding are supported. Automata for training and decoding are read from a text file, or can be generated with random or uniform transition probabilities in one of three topologies: ergodic (fully connected), Bakis (left-to-right), or deterministic. Observations used for training or decoding are read from text files compatible with the AT&T finite state tools and OpenFST.
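To make the difference between forward and Viterbi decoding concrete, here is a minimal Python sketch over a toy PFSA. It does not use Treba or its file format: the automaton, the dictionary layout, and the function names `forward_probability` and `viterbi` are all assumptions made for illustration. Forward decoding sums the probability of every path that emits the observation, while Viterbi decoding keeps only the single most probable path.

```python
from collections import defaultdict

# Hypothetical toy PFSA (illustrative only, not Treba's input format).
# TRANSITIONS[state] = list of (symbol, next_state, probability).
# In each state, outgoing probabilities plus the halting probability sum to 1.
TRANSITIONS = {
    0: [("a", 0, 0.3), ("a", 1, 0.2), ("b", 1, 0.3)],
    1: [("a", 1, 0.4), ("b", 0, 0.2)],
}
FINAL = {0: 0.2, 1: 0.4}  # probability of halting in each state
START = 0


def forward_probability(observation):
    """Total probability of the observation, summed over all paths."""
    alpha = {START: 1.0}
    for symbol in observation:
        nxt = defaultdict(float)
        for state, prob in alpha.items():
            for sym, target, p in TRANSITIONS[state]:
                if sym == symbol:
                    nxt[target] += prob * p
        alpha = nxt
    return sum(prob * FINAL[state] for state, prob in alpha.items())


def viterbi(observation):
    """Probability and state sequence of the single best path."""
    best = {START: (1.0, [START])}
    for symbol in observation:
        nxt = {}
        for state, (prob, path) in best.items():
            for sym, target, p in TRANSITIONS[state]:
                if sym != symbol:
                    continue
                cand = prob * p
                if target not in nxt or cand > nxt[target][0]:
                    nxt[target] = (cand, path + [target])
        best = nxt
    # Account for the halting probability of the last state.
    ended = [(prob * FINAL[state], path) for state, (prob, path) in best.items()]
    return max(ended, key=lambda item: item[0], default=(0.0, []))
```

For any observation, the Viterbi probability is at most the forward probability, because the best path is one of the paths the forward pass sums over.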
Milk is a machine learning toolkit in Python. Its focus is on supervised classification with several classifiers available: SVMs (based on libsvm), k-NN, random forests, and decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. For unsupervised learning, milk supports k-means clustering and affinity propagation.
MyMediaLite is a lightweight, multi-purpose library of recommender system algorithms. It addresses the two most common scenarios in collaborative filtering: rating prediction (e.g. on a scale of 1 to 5 stars), and item prediction from implicit feedback (e.g. from clicks or purchase actions). It contains dozens of recommender engines, including state-of-the-art matrix factorization methods. It also supports real-time updates to the recommender engines, storing engines to disk and reloading them again, and several evaluation measures to compare the accuracy of different recommender system methods. Three command-line programs that offer most of the functionality contained in the library are included.
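As an illustration of the rating prediction scenario, the following is a minimal Python sketch of matrix factorization trained with stochastic gradient descent. It is not MyMediaLite's API or implementation: the function names `train_mf`, `predict`, and `rmse`, the hyperparameters, and the tiny rating set are all assumptions chosen for the example. Each rating is modeled as the global mean plus the dot product of a user factor vector and an item factor vector.

```python
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
             epochs=300, seed=0):
    """Fit user/item factors to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + sum(p * q for p, q in zip(P[u], Q[i])))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q


def predict(model, u, i):
    mu, P, Q = model
    return mu + sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(model, ratings):
    """Root mean squared error, a common rating-prediction measure."""
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


# Made-up ratings on a 1 to 5 scale: (user, item, rating).
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]

model = train_mf(RATINGS, n_users=4, n_items=4)
```

Calling `predict(model, 0, 3)` then estimates a rating for a user-item pair that does not appear in the training data, and `rmse(model, RATINGS)` reports the training error.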
Thinknowlogy is grammar-based software designed to use the logic contained within grammar to create intelligence through natural language. It demonstrates this by programming in a natural language, reasoning in a natural language (drawing conclusions, making assumptions with a self-adjusting level of uncertainty, asking questions about gaps in its knowledge, and detecting conflicts), and answering "is" questions intelligently, including offering alternative answers.
Pinta is an extremely versatile, extensible, self-learning image classification program. It uses texture and color analysis and neural network techniques to automatically learn differences in images. It comes with a C API for easy integration into other software. It is built on top of the pattern recognition and image analysis platform Into.
K-tree provides a scalable approach to clustering by combining the B+-tree and k-means algorithms. Clustering can be used to solve problems in signal processing, machine learning, and other contexts. It has recently been used to solve document clustering problems on the Wikipedia collection.
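The combination of the two algorithms can be sketched in Python as follows. This is not the K-tree code itself: it assumes, as in the usual description of the algorithm, that a node which grows too full is split in two by running k-means with k = 2 over its vectors, much as a B+-tree splits an overfull node. The function name `split_node` and the deterministic seeding of the two centers are choices made for this example.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(component) / n for component in zip(*points))


def split_node(vectors, iters=20):
    """Split a full node's vectors into two groups with 2-means."""
    # Illustrative seeding: the first vector and the vector farthest from it.
    c0 = vectors[0]
    c1 = max(vectors, key=lambda v: dist2(v, c0))
    centers = [c0, c1]
    for _ in range(iters):
        groups = [[], []]
        for v in vectors:
            nearest = 0 if dist2(v, centers[0]) <= dist2(v, centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return centers, groups


centers, groups = split_node([(0, 0), (0, 1), (10, 10), (10, 11)])
```

In a full tree, the two resulting centers would be stored in the parent node as keys for the two new children, and the parent would itself be split in the same way if it overflows.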