29 projects tagged "Linguistic"

Polygen (updated 03 Feb 2004)

Pop 26.91
Vit 61.06

PolyGen is a program for generating random sentences according to a grammar definition, that is, following custom syntactic and lexical rules. Formally, it is an interpreter for a language that is itself designed to define languages, where interpreting means executing a source program in real time and finally outputting its result. Here, a source program is a grammar definition. Execution consists of exploring the grammar by selecting a random path, and the result is the sentence built along the way.
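The idea of exploring a grammar along a random path can be illustrated with a minimal Python sketch. This is not PolyGen's grammar syntax or implementation: the dictionary-based `GRAMMAR`, the `generate` and `random_sentence` functions, and the vocabulary are all hypothetical names chosen for illustration.

```python
import random

# Hypothetical toy grammar (not PolyGen syntax): each nonterminal maps to a
# list of alternative productions, and each production is a list of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["linguist"], ["parser"], ["sentence"]],
    "V": [["reads"], ["generates"], ["sleeps"]],
}


def generate(symbol, grammar, rng):
    """Expand one symbol into a list of words by following a random path."""
    if symbol not in grammar:
        # Anything that is not a nonterminal is emitted as a word.
        return [symbol]
    production = rng.choice(grammar[symbol])  # pick one alternative at random
    words = []
    for part in production:
        words.extend(generate(part, grammar, rng))
    return words


def random_sentence(grammar=GRAMMAR, start="S", seed=None):
    """Return one random sentence; pass a seed for reproducible output."""
    rng = random.Random(seed)
    return " ".join(generate(start, grammar, rng))


if __name__ == "__main__":
    print(random_sentence())
```

Each call walks the grammar from the start symbol, choosing one production per nonterminal, so the sentence is built up along the chosen path. Passing the same `seed` twice yields the same sentence.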

Atlantida (updated 21 Sep 2004; no download)

Pop 13.00
Vit 59.13

Atlantida is a multilingual cross-platform dictionary. Currently it has 310,000 definitions, and knows how to pronounce 21,000 English words.

Apertium (updated 31 Aug 2009)

Pop 37.19
Vit 46.60

Apertium is a machine translation platform, initially aimed at related-language pairs, but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides a language-independent machine translation engine, tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and linguistic data for a growing number of language pairs.

Emdros (updated 04 Jul 2011)

Pop 157.02
Vit 16.81

Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.

Poliqarp (updated 02 Aug 2012; no download)

Pop 54.92
Vit 7.83

Poliqarp is a universal suite of utilities for processing large corpora. It includes a concordancer that works on binary corpora compiled for efficient searching and a corpus builder. It supports positional tagsets, ambiguities in the texts, and Unicode.

Linguistic Tree Constructor (updated 18 Sep 2010)

Pop 64.44
Vit 5.52

Linguistic Tree Constructor is an application for drawing linguistic syntax trees. Its main strength is assisting in data production by quickly analyzing large amounts of text. "Generic" trees are supported, as well as RRG and X-Bar trees. Node categories are user-definable, and additional user-definable labels can also be applied to each node. Publication-quality, high-resolution, horizontal trees can be drawn. The file format is based on TIGER-XML.

xlit (updated 30 Dec 2008)

Pop 69.24
Vit 5.35

Xlit converts text from one writing system into another. It allows the user to define a transliteration simply by typing the input strings in one window and the strings to which they are to be mapped in another. Transliteration may be restricted to regions bounded by specified delimiters or their complements. Transliteration may also be performed by external commands or plugins. Xlit can also convert one type of delimiter to another, e.g. from HZ escapes to XML. Xlit can read and write transliteration definitions in its own format and as Yudit keymaps. It can be run in batch mode without the GUI.
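A mapping-based transliteration of this kind, including the restriction to delimited regions, can be sketched in a few lines of Python. This is only an illustration of the general technique, not Xlit's code or definition format: the function names, the longest-match-first strategy, and the small `ROMAN_TO_CYRILLIC` table are assumptions made for the example.

```python
# Hypothetical mapping from input strings to output strings (illustration only).
ROMAN_TO_CYRILLIC = {
    "shch": "щ",
    "sh": "ш",
    "ch": "ч",
    "a": "а",
    "b": "б",
    "o": "о",
    "r": "р",
}


def transliterate(text, mapping):
    """Replace input strings by their mapped outputs, longest match first."""
    # Ignore empty keys so the scan always advances.
    keys = sorted((k for k in mapping if k), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            # No mapping applies here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def transliterate_regions(text, mapping, open_delim, close_delim):
    """Transliterate only the text between open_delim and close_delim."""
    out = []
    pos = 0
    while True:
        start = text.find(open_delim, pos)
        end = -1 if start == -1 else text.find(close_delim, start + len(open_delim))
        if start == -1 or end == -1:
            # No further complete region: copy the rest unchanged.
            out.append(text[pos:])
            break
        inner_start = start + len(open_delim)
        out.append(text[pos:inner_start])  # text before the region, plus delimiter
        out.append(transliterate(text[inner_start:end], mapping))
        out.append(close_delim)
        pos = end + len(close_delim)
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("borshch", ROMAN_TO_CYRILLIC))
    print(transliterate_regions("say <borshch> now", ROMAN_TO_CYRILLIC, "<", ">"))
```

Trying longer input strings first keeps "shch" from being split into "sh" and "ch". In the region-restricted variant, text outside the delimiters is copied through untouched.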

GNU Talk Filters (updated 29 Dec 2007)

Pop 110.43
Vit 4.80

The GNU Talk Filters are filter programs that convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect. Some of these filters have been in the public domain for many years, but here they are provided as a single integrated package. The filters include austro, b1ff, brooklyn, chef, cockney, drawl, dubya, fudd, funetak, jethro, jive, kraut, pansy, pirate, postmodern, redneck, valspeak, and warez. This package provides the filters both as individual executables and collectively as a C library, so they can be easily embedded in other programs.

Connexor Machinese (updated 25 Sep 2006; no download)

Pop 30.45
Vit 3.90

Connexor Machinese analyzers process sequences of written words, identify and classify the various entities in them, and show how these relate to each other, marking the language with a simple and systematic notation. Currently, the Machinese product family includes: Machinese Phrase Tagger, a fast, light-weight morphosyntactic tagger; Machinese Syntax, a full-scale dependency parser; Machinese Semantics, a dependency parser with semantic analysis; and Machinese Metadata, an entity extractor.

Esperantilo (updated 23 Dec 2007; no download)

Pop 30.74
Vit 3.11

Esperantilo ("Tool for Esperanto") is a UTF-8 editor with linguistic functions for Esperanto, and also a system for computer-aided translation. It contains a spell checker and a grammar checker for Esperanto. It can translate Esperanto text in various formats into Polish, German, English, and Swedish, and can translate from Polish and English. It also supports computer-aided translation through interactive machine translation. Translation memory can also be used for any language pair. It is also an XLIFF editor and supports the XLIFF and TMX (Level 1) formats. Machine translation uses direct translation at the syntax level.
