Release Notes: This version of Machinese Phrase Tagger introduces an enhanced custom lexicon mechanism and provides various improvements in analysis.
Release Notes: This version offers language identification and noun phrase detection. A command line utility and a TCP/IP socket server executable are now also available on the MS Windows platform. Program output formats have been revised, and the output format options now include XML. Handling of XML input and imperfect text input has been enhanced. Sentence boundaries are now shown in the output. The language models have been updated, and speed and memory usage have been improved.
Release Notes: There are now two new supported languages: Danish and Norwegian. This version introduces a custom lexicon facility to allow users to add new vocabulary to the analyzer. The Unix/Linux versions now support output as prose instead of tags and allow new types of preformatting tokens.
Release Notes: Name identification has been further improved, as has general analysis quality. Danish and Norwegian have been added to the Machinese language palette. Memory consumption has been reduced. Input handling has been improved.
Release Notes: All the supported languages (English, French, German, Spanish, Italian, Dutch, Swedish and Finnish) now share a common superset of morphosyntactic tags for the analysis, so only minimal reprogramming is needed when switching from one language to another. The Machinese Phrase Tagger API has a new class for extracting noun phrases, and the library versions have new easy-to-use data structures. Version 4.0 also introduces Unicode support, thread safety, and speed improvements.
No changes have been submitted for this release.