Dacco is a collaborative English-Catalan, Catalan-English dictionary project. It seeks to provide an up-to-date, comprehensive, bilingual dictionary that will be of benefit to learners of both languages. The dictionaries are downloadable and customizable (using XSLT) and contain audio files.
LinkGrammar-WN is a lexicon expansion for the Link Grammar Parser (LGP), a robust syntactic parser of English that handles a wide variety of syntactic constructions. The project imports lexical information from WordNet in order to increase the size of the LGP lexicon. It should be of interest to anyone working on natural language parsing of English text.
MegaLettering is a PHP engine created to manage the Italian translation of www.megatokyo.com, but it is written with general use in mind, so it can support any number of languages. Text in balloons is translated by means of a MySQL database that defines the balloon shapes, the translated text, and the fonts used to render the new text.
Convert character set converts text strings between different character set encodings. It supports conversion between single-byte character sets, from single-byte to multi-byte character sets (UTF-8), and from multi-byte to single-byte. All conversion output can be saved with numeric entities (browser character set independent). The main requirement is that each character must exist in both character sets; otherwise the conversion returns an error.
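The conversion described above can be sketched in a few lines. This is a minimal illustration that uses Python's built-in codecs, not the project's own code; the function name `convert` and its `entities` flag are assumptions made for the example. In this sketch only characters missing from the target set are written as numeric entities.

```python
def convert(data: bytes, source: str, target: str, entities: bool = False) -> bytes:
    """Re-encode `data` from the `source` charset to the `target` charset.

    Hypothetical helper for illustration only. With entities=False, a
    character that does not exist in the target charset raises
    UnicodeEncodeError. With entities=True, such characters are written
    as numeric entities (e.g. &#8364;) instead.
    """
    text = data.decode(source)  # raises UnicodeDecodeError on invalid input
    errors = "xmlcharrefreplace" if entities else "strict"
    return text.encode(target, errors)


# Single-byte (ISO-8859-1) to multi-byte (UTF-8)
utf8_bytes = convert("é".encode("latin-1"), "latin-1", "utf-8")

# Multi-byte to single-byte, falling back to numeric entities
ascii_bytes = convert("€".encode("utf-8"), "utf-8", "ascii", entities=True)
```

Without the `entities` fallback, converting the euro sign to ISO-8859-1 fails, which mirrors the requirement that a character be present in both character sets.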
PolyGen is a program for generating random sentences according to a grammar definition, that is, following custom syntactic and lexical rules. Formally, it is an interpreter of a language that is itself designed to define languages, where interpreting means executing a source program in real time and eventually outputting its result. Here, a source program is a grammar definition. Execution consists of exploring the grammar by selecting a random path, and the result is the sentence built along the way.
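The idea of exploring a grammar along a random path can be sketched as follows. This is not PolyGen's grammar language or implementation; the dictionary-based grammar, the symbol names, and the functions `generate` and `sentence` are all assumptions made for this illustration.

```python
import random

# Hypothetical toy grammar: each non-terminal maps to a list of
# alternative productions; anything not listed as a key is a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["parser"], ["dictionary"], ["linguist"]],
    "V": [["reads"], ["builds"], ["sleeps"]],
}


def generate(symbol, grammar, rng):
    """Expand `symbol` into a list of words by choosing productions at random."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit the word itself
    production = rng.choice(grammar[symbol])  # pick one random path
    words = []
    for part in production:
        words.extend(generate(part, grammar, rng))
    return words


def sentence(grammar=GRAMMAR, start="S", seed=None):
    """Build one random sentence; pass a seed for reproducible output."""
    rng = random.Random(seed)
    return " ".join(generate(start, grammar, rng))


print(sentence())
```

Each call walks the grammar from the start symbol and collects terminals on the way, so the sentence is a by-product of the path taken. The toy grammar is deliberately non-recursive so that every expansion terminates.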
An Gramadóir is a grammar checking engine that is designed for the rapid development of grammar checkers for minority languages and other languages with limited computational resources. Rule specifications are given according to a simple syntax combining XML and regular expressions. Part-of-speech tagging can be learned from text corpora using statistical methods. It is currently implemented for Irish (Gaeilge).
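To give a feel for rules that combine XML-style tags with regular expressions, here is a toy sketch. It does not use An Gramadóir's actual rule syntax or tag set; the tags `<A>`, `<N>`, and `<V>`, the rule table, and the function `check` are all hypothetical, and the example uses English rather than Irish for readability.

```python
import re

# Hypothetical rule table: each rule is a regular expression that runs
# over part-of-speech-tagged text, paired with a message to report.
RULES = [
    (re.compile(r"<A>a</A> <N>[aeiou]\w*</N>"), "use 'an' before a vowel sound"),
    (re.compile(r"<N>(\w+)</N> <N>\1</N>"), "repeated word"),
]


def check(tagged):
    """Return (offending text, message) pairs for every rule match."""
    findings = []
    for pattern, message in RULES:
        for match in pattern.finditer(tagged):
            # Strip the tags so the report shows the plain words.
            plain = re.sub(r"<[^>]+>", "", match.group(0))
            findings.append((plain, message))
    return findings


# Hypothetical tagged input: every word wrapped in its part-of-speech tag.
print(check("<V>eat</V> <A>a</A> <N>apple</N>"))
```

Because the rules operate on the tagged stream rather than on raw text, a new rule is just one more pattern and message.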
Ellogon is a multi-lingual, cross-platform, general-purpose language engineering environment, developed to aid both researchers in computational linguistics and companies that produce and deliver language engineering systems. As a language engineering platform, it offers an extensive set of facilities, including tools for processing and visualising textual/HTML/XML data and associated linguistic information, support for lexical resources (such as creating and embedding lexicons), and tools for creating annotated corpora, accessing databases, comparing annotated data, and transforming linguistic information into vectors for use with various machine learning algorithms.