dbacl is a digramic Bayesian text classifier. Given some text, it calculates the posterior probabilities that the input resembles one of any number of previously learned document collections. It can be used to sort incoming email into arbitrary categories such as spam, work, and play, or simply to distinguish an English text from a French text. It fully supports international character sets, and uses sophisticated statistical models based on the Maximum Entropy Principle.
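As a rough illustration of what "calculating posterior probabilities" over learned document collections means, the following minimal sketch trains one word-count model per category and applies Bayes' rule to a new text. It is not dbacl's code or its model: it assumes a plain unigram model with add-one smoothing and equal priors, where dbacl uses maximum entropy models over digrams. The names `train` and `posteriors` and the toy training sentences are hypothetical.

```python
import math
from collections import Counter


def train(documents):
    """Build a word-count model for one category (hypothetical helper)."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return counts


def posteriors(text, models):
    """Return P(category | text) for every category, assuming equal priors."""
    vocabulary = set()
    for counts in models.values():
        vocabulary.update(counts)
    vocab_size = len(vocabulary) + 1  # one extra slot for unseen words

    prior = 1.0 / len(models)
    log_scores = {}
    for name, counts in models.items():
        total = sum(counts.values())
        score = math.log(prior)
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[word] + 1) / (total + vocab_size))
        log_scores[name] = score

    # Normalize in log space to avoid underflow on long texts.
    peak = max(log_scores.values())
    unnormalized = {n: math.exp(s - peak) for n, s in log_scores.items()}
    norm = sum(unnormalized.values())
    return {n: v / norm for n, v in unnormalized.items()}


# Toy training data, invented for this sketch.
models = {
    "english": train(["the cat sat on the mat", "the dog is in the house"]),
    "french": train(["le chat est sur le tapis", "le chien est dans la maison"]),
}
result = posteriors("the dog sat on the mat", models)
```

Here `result` maps each category name to a probability, and the values sum to 1. For the English test sentence the "english" entry is the larger of the two.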
libbnr is an implementation of the Bayesian Noise Reduction (BNR) algorithm. All samples of text contain some degree of noise: data that is irrelevant, whether intentionally or not, to accurate statistical analysis of the sample, and whose removal would result in a cleaner analysis. The BNR algorithm removes this noise so that the learner is given more useful data, which ultimately leads to better sample analysis. Once the noisy data has been removed, only data relevant to the classification remains. libbnr can be linked into your classifier and called using the standard C interface.