SHOGUN is a machine learning toolbox focused on large-scale kernel methods, especially Support Vector Machines (SVMs). It provides a generic SVM object that interfaces to several different SVM implementations, all of which use the same underlying, efficient kernel implementations. Apart from SVMs and regression, SHOGUN also features a number of linear methods such as Linear Discriminant Analysis (LDA), Linear Programming Machine (LPM), and (Kernel) Perceptrons, as well as algorithms to train hidden Markov models. SHOGUN can be used from within C++, Matlab, R, Octave, and Python.
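To make the idea of a kernel method concrete, here is a minimal kernel perceptron sketch in plain Python. It does not use SHOGUN's API; the function names (`gaussian_kernel`, `train_kernel_perceptron`, `predict`), the kernel width, and the toy XOR data are all assumptions chosen for illustration. The point is only that the learner touches the data exclusively through the kernel function, which is why one kernel implementation can serve several learning algorithms.

```python
import math


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def decision_value(x, points, labels, alphas, kernel):
    """Weighted sum of kernel evaluations against the training points."""
    return sum(a * l * kernel(p, x) for a, l, p in zip(alphas, labels, points))


def train_kernel_perceptron(points, labels, kernel, epochs=20):
    """Return one mistake counter per training point (the dual coefficients)."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(points, labels)):
            # A non-positive margin counts as a mistake and bumps the counter.
            if y * decision_value(x, points, labels, alphas, kernel) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alphas


def predict(x, points, labels, alphas, kernel):
    """Classify x as +1 or -1."""
    return 1 if decision_value(x, points, labels, alphas, kernel) > 0 else -1


# Toy data (hypothetical): XOR is not linearly separable in input space.
xor_points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
xor_labels = [-1, -1, 1, 1]

alphas = train_kernel_perceptron(xor_points, xor_labels, gaussian_kernel)
predictions = [
    predict(p, xor_points, xor_labels, alphas, gaussian_kernel)
    for p in xor_points
]
```

Swapping `gaussian_kernel` for any other function of two vectors changes the model without touching the training loop.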
Armadillo is a C++ linear algebra (matrix maths) library that aims for a good balance between speed and ease of use. Integer, floating point, and complex numbers are supported, as well as a subset of trigonometric and statistics functions. Various matrix decompositions are provided through optional integration with the LAPACK and ATLAS libraries. A delayed evaluation approach, based on template meta-programming, is used at compile time to combine several operations into one and to reduce or eliminate the need for temporaries.
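The delayed evaluation idea can be illustrated with a small runtime analogue in Python. This is not Armadillo's API, and Armadillo resolves the expression at compile time through C++ templates rather than at run time; the class names (`Expr`, `Vec`, `Add`, `Scale`) are hypothetical. In this sketch, arithmetic operators only record what should be computed, and a single pass in `evaluate()` produces the result without building intermediate vectors.

```python
class Expr:
    """Base class: arithmetic builds an expression tree instead of computing."""

    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, factor):
        return Scale(self, factor)

    def evaluate(self):
        # One pass over the indices; each element is computed on demand,
        # so no temporary vector is created for sub-expressions.
        return [self.at(i) for i in range(len(self))]


class Vec(Expr):
    """Leaf node holding actual data."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def at(self, i):
        return self.values[i]


class Add(Expr):
    """Deferred element-wise sum of two expressions."""

    def __init__(self, left, right):
        if len(left) != len(right):
            raise ValueError("operands must have the same length")
        self.left, self.right = left, right

    def __len__(self):
        return len(self.left)

    def at(self, i):
        return self.left.at(i) + self.right.at(i)


class Scale(Expr):
    """Deferred multiplication of an expression by a scalar."""

    def __init__(self, operand, factor):
        self.operand, self.factor = operand, factor

    def __len__(self):
        return len(self.operand)

    def at(self, i):
        return self.operand.at(i) * self.factor


a = Vec([1.0, 2.0, 3.0])
b = Vec([10.0, 20.0, 30.0])
c = Vec([0.5, 0.5, 0.5])

lazy = (a + b) * 2.0 + c   # only builds a tree; no arithmetic yet
result = lazy.evaluate()   # the fused computation happens here
```

An eager implementation would allocate one vector for `a + b` and another for the scaled result before adding `c`; the deferred form computes each output element directly.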
K-tree provides a scalable approach to clustering by combining the B+-tree and k-means algorithms. Clustering can be used to solve problems in signal processing, machine learning, and other contexts. It has recently been used to solve document clustering problems on the Wikipedia collection.
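As a point of reference, the k-means half of that combination can be sketched in a few lines of plain Python. This shows only Lloyd's iteration on a flat set of points; the tree structure that K-tree adds on top is not shown, and the function names, the fixed initial centroids, and the toy data are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:  # converged: nothing moved
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return centroids, assignment


# Toy data (hypothetical): two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids, assignment = k_means(points, [points[0], points[3]])
```

Each iteration of flat k-means compares every point against every centroid, which is the cost a tree-structured index is meant to reduce on large collections.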
Thinknowlogy is grammar-based software, designed to utilize the Natural Laws of Intelligence in grammar, in order to create intelligence through natural language in software. This is demonstrated by programming in natural language, reasoning in natural language and drawing conclusions (more detailed than scientific solutions), making assumptions (with a self-adjusting level of uncertainty), asking questions (about gaps in the knowledge), and detecting conflicts in the knowledge. It builds semantics autonomously (with no vocabularies or word lists), detecting some cases of semantic ambiguity. It supports multiple grammars, which the project presents as evidence that the Natural Laws of Intelligence are universal.
x2search is a crawler based on machine learning algorithms that finds pages and documents similar to given positive examples and dissimilar to given negative examples. The learned classifiers can be exported and saved for later reuse. It offers multiple settings for restricting the search (by domain, server, etc.) and has a plug-in mechanism for adding document types to be searched.
Pinta is an extremely versatile, extensible, self-learning image classification program. It uses texture and color analysis and neural network techniques to automatically learn differences in images. It comes with a C API for easy integration into other software. It is built on top of the pattern recognition and image analysis platform Into.
scikits.learn is a Python module that integrates classic machine learning algorithms in the tightly-knit world of scientific Python packages. It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
Milk is a machine learning toolkit in Python. Its focus is on supervised classification with several classifiers available: SVMs (based on libsvm), k-NN, random forests, and decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. For unsupervised learning, milk supports k-means clustering and affinity propagation.