NLTK, the Natural Language Toolkit, is a suite of Python libraries and programs for symbolic and statistical natural language processing. NLTK includes graphical demonstrations and sample data. It is accompanied by extensive documentation, including tutorials that explain the underlying concepts behind the language processing tasks supported by the toolkit.
|Tags||Scientific/Engineering, Education, Utilities, Software Development|
|Licenses||IBM Public License, GPL|
|Operating Systems||POSIX, Linux, Windows, Unix|
No changes have been submitted for this release.
Release Notes: Some significant changes were made to NLTK's basic architecture. These changes make the basic processing tasks easier to use, and make it easier to combine different processing tasks into a single system.
Release Notes: This version adds four new corpora and corpus readers (the names, stopwords, semcor, and wordnet corpora) and many new modules in nltk-contrib. It splits nltk.token into two modules: nltk.token defines Token and Location, and nltk.tokenizer defines tokenizers. It also adds a look-ahead window for sequential tagging and fixes various bugs.
Release Notes: This version adds two new packages: nltk-data, a package containing sample datasets, and nltk-contrib, a package containing third-party contributions that have not (yet) been incorporated into the toolkit. It also includes significant improvements to the documentation, including new tutorials, revised tutorials, and improved API documentation. It adds a new module that defines a standard interface for stemmers and implements the Porter stemmer. It also contains several improvements to the graphical demos.
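The idea of a standard stemmer interface mentioned in these notes can be illustrated with a minimal, self-contained sketch. It does not use NLTK and does not reproduce NLTK's actual module: the names Stemmer, SuffixStemmer, and stem_all are hypothetical, and the toy suffix stripper merely stands in for the Porter algorithm.

```python
from abc import ABC, abstractmethod


class Stemmer(ABC):
    """Hypothetical common interface: every stemmer maps a word to its stem."""

    @abstractmethod
    def stem(self, word: str) -> str:
        ...


class SuffixStemmer(Stemmer):
    """Toy stemmer that strips one known suffix. This is NOT the Porter algorithm."""

    def __init__(self, suffixes=("ing", "ed", "s")):
        # Try longer suffixes first so that "ing" is checked before "s".
        self.suffixes = sorted(suffixes, key=len, reverse=True)

    def stem(self, word: str) -> str:
        for suffix in self.suffixes:
            # Keep at least three characters so short words stay intact.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word


def stem_all(stemmer: Stemmer, words):
    """Accepts any Stemmer implementation; callers depend only on the interface."""
    return [stemmer.stem(w) for w in words]


if __name__ == "__main__":
    print(stem_all(SuffixStemmer(), ["parsing", "tagged", "tokens", "is"]))
```

Because stem_all depends only on the abstract stem method, a real Porter implementation could replace the toy class without changing any calling code.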
Release Notes: An overhaul of nltk.probability was completed. The Tagger module design was updated to allow for better backoff. Many tutorials are new or updated (regexp, tagging, probability, and intro). Two kinds of chart edges are now distinguished: token edges (used to initialize the chart) and production edges. Assorted minor improvements were also made.
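For readers unfamiliar with backoff in tagging, the following is a minimal sketch of the general technique, written in plain Python without NLTK. It is an assumption-laden illustration, not the toolkit's Tagger design: the class names, the tag_word method, and the tiny training list are all hypothetical.

```python
from collections import Counter, defaultdict


class DefaultTagger:
    """Last resort: assigns the same tag to every word."""

    def __init__(self, tag):
        self.tag = tag

    def tag_word(self, word):
        return self.tag


class UnigramTagger:
    """Tags a word with its most frequent training tag, else asks the backoff tagger."""

    def __init__(self, tagged_words, backoff=None):
        counts = defaultdict(Counter)
        for word, tag in tagged_words:
            counts[word][tag] += 1
        # Remember only the most frequent tag seen for each word.
        self.best = {w: c.most_common(1)[0][0] for w, c in counts.items()}
        self.backoff = backoff

    def tag_word(self, word):
        if word in self.best:
            return self.best[word]
        if self.backoff is not None:
            # Unknown word: delegate to the next tagger in the chain.
            return self.backoff.tag_word(word)
        return None


def tag_sentence(tagger, words):
    return [(w, tagger.tag_word(w)) for w in words]


if __name__ == "__main__":
    # Hypothetical toy training data, not drawn from any NLTK corpus.
    training = [("the", "DT"), ("dog", "NN"), ("barks", "VBZ"), ("the", "DT")]
    tagger = UnigramTagger(training, backoff=DefaultTagger("NN"))
    print(tag_sentence(tagger, ["the", "cat", "barks"]))
```

Here "cat" never appears in the training data, so the unigram tagger hands it to the default tagger, which labels it "NN". Chaining taggers this way lets a specific model handle known cases while a more general one covers the rest.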