PolyGen is a program for generating random sentences according to a grammar definition, that is, following custom syntactic and lexical rules. Formally, it is an interpreter for a language that is itself designed to define languages, where interpreting means executing a source program in real time and finally outputting its result. Here, a source program is a grammar definition. Execution consists of exploring the grammar along a randomly selected path, and the result is the sentence built along the way.
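The idea of exploring a grammar along a random path can be illustrated with a minimal Python sketch. This is not PolyGen's grammar language or implementation: the dictionary-based grammar format and the names `GRAMMAR`, `generate`, and `random_sentence` are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical toy grammar (NOT PolyGen syntax): each nonterminal maps to a
# list of alternative productions; each production is a list of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["cat"], ["dog"]],
    "V": [["sees"], ["sleeps"]],
}


def generate(symbol, grammar, rng):
    """Expand `symbol`, picking one random production at each nonterminal."""
    if symbol not in grammar:
        return [symbol]  # a terminal: emit the word itself
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(generate(part, grammar, rng))
    return words


def random_sentence(grammar, start="S", seed=None):
    """Build one sentence by following a random path from the start symbol."""
    rng = random.Random(seed)
    return " ".join(generate(start, grammar, rng))


if __name__ == "__main__":
    print(random_sentence(GRAMMAR))
```

Each call follows one random path through the grammar, so repeated calls generally yield different sentences; passing a fixed `seed` makes a run reproducible.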
Apertium is a machine translation platform, initially aimed at related-language pairs, but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides a language-independent machine translation engine, tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and linguistic data for a growing number of language pairs.
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.
Linguistic Tree Constructor is an application for drawing linguistic syntax trees. Its main strength is helping users produce data by quickly analyzing large amounts of text. "Generic" trees are supported, as well as RRG and X-Bar trees. Node categories are user-definable, and additional user-definable labels can also be applied to each node. Publication-quality, high-resolution, horizontal trees can be drawn. The file format is based on TIGER-XML.
Xlit converts text from one writing system into another. It allows the user to define a transliteration simply by typing the input strings in one window and the strings to which they are to be mapped in another. Transliteration may be restricted to regions bounded by specified delimiters or their complements. Transliteration may also be performed by external commands or plugins. Xlit can also convert one type of delimiter to another, e.g. from HZ escapes to XML. Xlit can read and write transliteration definitions in its own format and as Yudit keymaps. It can be run in batch mode without the GUI.
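Table-driven transliteration restricted to delimited regions can be sketched in a few lines of Python. This is an illustration of the general technique only, not Xlit's implementation or file format: the function names, the longest-match-first rule, and the sample Latin-to-Cyrillic mapping are all assumptions.

```python
import re

# Hypothetical sample mapping (Latin to Cyrillic), for illustration only.
MAPPING = {"sh": "ш", "s": "с", "h": "х", "a": "а"}


def transliterate(text, mapping):
    """Replace each input string with its mapped output, longest match first."""
    if not mapping:
        return text
    # Longer keys come first so that "sh" wins over "s" followed by "h".
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


def transliterate_regions(text, mapping, open_delim, close_delim):
    """Transliterate only the text between open_delim and close_delim."""
    region = re.compile(
        re.escape(open_delim) + "(.*?)" + re.escape(close_delim), re.DOTALL
    )
    return region.sub(
        lambda m: open_delim + transliterate(m.group(1), mapping) + close_delim,
        text,
    )


if __name__ == "__main__":
    print(transliterate("sasha", MAPPING))
    print(transliterate_regions("name: <sasha> ok", MAPPING, "<", ">"))
```

Text outside the delimiters is left untouched, which corresponds to restricting transliteration to bounded regions; transliterating the complement instead would mean applying the mapping to everything except the matched regions.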
The GNU Talk Filters are filter programs that convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect. Some of these filters have been in the public domain for many years, but here they are provided as a single integrated package. The filters include austro, b1ff, brooklyn, chef, cockney, drawl, dubya, fudd, funetak, jethro, jive, kraut, pansy, pirate, postmodern, redneck, valspeak, and warez. This package provides the filters both as individual executables and collectively as a C library, so they can be easily embedded in other programs.
Connexor Machinese analyzers process sequences of written words, identify and classify the various entities in them, and show how these relate to each other, marking the language with a simple and systematic notation. Currently, the Machinese product family includes: Machinese Phrase Tagger, a fast, light-weight morphosyntactic tagger; Machinese Syntax, a full-scale dependency parser; Machinese Semantics, a dependency parser with semantic analysis; and Machinese Metadata, an entity extractor.
Esperantilo ("Tool for Esperanto") is a UTF-8 text editor with linguistic functions for Esperanto, and also a system for computer-aided translation. It contains a spell checker and a grammar checker for Esperanto. It can translate Esperanto text in various formats into Polish, German, English, and Swedish, and from Polish and English. It also supports computer-aided translation through interactive machine translation. Translation memory can also be used for any language pair. It is an XLIFF editor and supports the XLIFF and TMX (Level 1) formats. Machine translation uses direct translation at the syntax level.