libTextCat

libTextCat is a library of functions that implement the text classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization". It was developed primarily for language guessing, a task on which it is known to perform with near-perfect accuracy. Considerable effort went into making the implementation fast and efficient: the language guesser processes over 100 documents per second on a simple PC, which makes it practical for many uses.
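The N-gram technique mentioned above can be illustrated with a short sketch. This is not libTextCat's own code or API (the library itself is written in C); it is a simplified Python illustration of the general idea in Cavnar & Trenkle: rank the most frequent character n-grams of a text, then pick the category whose ranked profile is closest under an "out-of-place" distance. The function names, the whitespace tokenization, the underscore padding, and the default limits (n-grams up to length 5, top 300) are assumptions made for this example.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Return the `top` most frequent character n-grams (lengths 1..max_n), best first."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # mark word boundaries (an assumption of this sketch)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically so the ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category get a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Return the name of the category whose profile is nearest to the text's profile."""
    doc = ngram_profile(text)
    return min(category_profiles, key=lambda name: out_of_place(doc, category_profiles[name]))
```

To use the sketch, build one profile per language from sample text (for example `{"english": ngram_profile(english_sample), "dutch": ngram_profile(dutch_sample)}`) and pass that dictionary to `classify` together with the text to identify. Real language models are built from far larger samples than a sentence or two.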

Recent releases

  •  05 Dec 2003 19:03

    Release Notes: A long overdue autoconfig script has been added.

  •  20 May 2003 13:35

    Release Notes: The distribution now contains Gertjan van Noord's language models for the automatic recognition of over 70 languages. The makefiles were cleaned up to make them more portable.
