WordNet® is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.
kdrill helps people learn Japanese 'Kanji' characters. Its includes a multiple-choice Kanji quiz program that helps people learn Japanese characters with different guess formats and history options. It also has a suite of dictionary lookup functions. Words can be found using a variety of methods including Romaji, SKIP, four-corner, cut-n-paste, radical lookup, and English search.
Connexor Machinese analyzers process sequences of written words, identify and classify the various entities in them, and show how these relate to each other, marking the language with a simple and systematic notation. Currently, the Machinese product family includes: Machinese Phrase Tagger, a fast, light-weight morphosyntactic tagger; Machinese Syntax, a full-scale dependency parser; Machinese Semantics, a dependency parser with semantic analysis; and Machinese Metadata, an entity extractor.
LinkGrammar-WN is a lexicon expansion for the Link Grammar Parser. The Link Grammar Parser is a syntactic parser of the English language that is capable of handling a wide variety of syntactic constructions and is considered quite robust. The LinkGrammar-WN project aims to import lexical information from WordNet in an effort to increase the size of the LGP lexicon. This project is of interest to anyone interested in NLP (natural language parsing) of English text.
Ellogon is a multi-lingual, cross-platform, general-purpose language engineering environment, developed in order to aid both researchers who are doing research in computational linguistics, as well as companies who produce and deliver language engineering systems. As a language engineering platform, it offers an extensive set of facilities, including tools for processing and visualising textual/HTML/XML data and associated linguistic information, support for lexical resources (like creating and embedding lexicons), tools for creating annotated corpora, accessing databases, comparing annotated data, or transforming linguistic information into vectors for use with various machine learning algorithms.
SenseClusters is a natural language processing package that allows you to cluster similar contexts or to identify clusters of related words. It supports its own native methods based on first and second order representations of context, and also supports Latent Semantic Analysis. It is fully unsupervised, and can automatically discover the optimal number of clusters in your text. SenseClusters is a complete system that takes users from preprocessing of raw text to providing clustered output.