Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.
| Tags | NLP POS opennlp ner UIMA tagging text analysis Parser coref chunker Perceptron maxent |
|---|---|
| Licenses | Apache 2.0 |
| Operating Systems | Windows Unix Linux Solaris Mac OS X |
| Implementation | Java |
| Translations | English |
Recent releases


Release Notes: This release contains several new features, improvements, and bug fixes. The maxent trainer can now run in multiple threads to utilize multi-core CPUs. Configurable feature generation was added to the name finder. The perceptron trainer was refactored and improved. Machine learners can now be configured with many more options via a parameter file. Evaluators can print out detailed evaluation information.
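
As a rough illustration of the parameter file mentioned above, a trainer configuration might look like the following. This is a sketch, not taken from the release itself: the key names follow the training parameter names used by OpenNLP (`Algorithm`, `Iterations`, `Cutoff`, `Threads`), and the values are arbitrary assumptions chosen for the example.

```
# Hypothetical training parameter file (values are illustrative only)
# Select the maxent trainer
Algorithm=MAXENT
# Number of training iterations
Iterations=100
# Ignore features seen fewer than this many times
Cutoff=5
# Worker threads for the multi-threaded maxent trainer
Threads=4
```

A file like this is typically passed to a command-line trainer through its parameter option. Check the documentation for the release in use to confirm which keys it accepts.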


Release Notes: This release contains a number of improvements and bug fixes compared to the last SourceForge release, 1.5.0. Most notably, the wiki documentation was converted to DocBook, the F-Measure precision was fixed, perceptron bugs were fixed, support for the CoNLL 2003 training format was added, chunker evaluation support was added, the chunker now supports the Portuguese Bosque AD format, and the chunker was refactored.