194 projects tagged "Linguistic"

Download Website Updated 11 Apr 2007 spell-uk

Pop 27.93
Vit 2.94

spell-uk is a Ukrainian dictionary for aspell, myspell, and ispell.

Download Website Updated 22 Mar 2007 SlpTK

Pop 17.61
Vit 1.63

SlpTK is an ANSI C library, together with a set of utilities and scripts, for natural language processing. It provides data structures and processing routines for the lexical and syntactic levels.

Download Website Updated 04 Feb 2007 Japana

Pop 56.58
Vit 4.17

Japana is a small HTTP proxy written in Perl. It converts Japanese characters (Hiragana, Katakana, and Kanji) into ASCII (Romaji) on the fly. The conversion is done with the kakasi library (an older version that does not need kakasi is still available).

Download Website Updated 28 Jan 2007 ByteName

Pop 52.15
Vit 3.98

ByteName prints one line for each byte of its input, giving the byte offset, the byte value in hex, octal, binary, and decimal, and its description in a selected single-byte encoding. A command-line flag suppresses the lines for ASCII characters, which is useful for locating stray non-ASCII codes. It can also generate a chart for a specified encoding or, for a specified codepoint, generate descriptions in all known encodings.
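
The per-byte listing described above can be illustrated with a minimal Python sketch. This is not ByteName's code or its actual output format: the function name describe_bytes, the column layout, and the skip_ascii parameter are all assumptions made for illustration, and the character descriptions come from Python's unicodedata module rather than from ByteName's own tables.

```python
import unicodedata


def describe_bytes(data: bytes, encoding: str = "latin-1", skip_ascii: bool = False):
    """Return one text line per byte: offset, hex, octal, binary, decimal, description.

    Hypothetical helper: the layout is an assumption, not ByteName's real format.
    """
    lines = []
    for offset, byte in enumerate(data):
        # Optionally hide ASCII bytes so stray non-ASCII codes stand out.
        if skip_ascii and byte < 0x80:
            continue
        try:
            char = bytes([byte]).decode(encoding)
            # Control characters have no Unicode name, so supply a fallback.
            name = unicodedata.name(char, "<unnamed or control character>")
        except UnicodeDecodeError:
            name = "<undefined in this encoding>"
        lines.append(
            f"{offset:6d}  0x{byte:02X}  0o{byte:03o}  0b{byte:08b}  {byte:3d}  {name}"
        )
    return lines


if __name__ == "__main__":
    for line in describe_bytes("caf\u00e9".encode("latin-1"), skip_ascii=True):
        print(line)
```

With skip_ascii=True only the bytes at or above 0x80 are listed, which mirrors the use case of hunting for stray non-ASCII codes in an otherwise ASCII file.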

Download Website Updated 17 Jan 2007 ddc-concordance

Pop 36.44
Vit 2.31

ddc-concordance is a search engine for linguists. It lets you search for words or sequences of words together with morphological patterns. It was created to help linguists find a particular collocation or word in a given context.

No download Website Updated 13 Jan 2007 Linguistico

Pop 12.49
Vit 48.17

Linguistico is a set of language tools based on the Italian language. It includes a dictionary, a thesaurus, word definitions, and other scripts and programs. You can use these tools with OpenOffice.org, Mozilla Thunderbird, Mozilla Firefox, MySpell, MyThes, Aspell, and HunSpell.

Download No website Updated 05 Jan 2007 Tamil Karuvi

Pop 8.12
Vit 1.00

Tamizh Karuvi is an English to Tamil transliteration tool. It can be used to provide Unicode UTF-8 input to programs within GNOME/GTK+ environments. The software provides both a GUI and a command-line program. The encoding table used is from the standard JaffnaLibrary.

Download Website Updated 16 Dec 2006 pynumwords

Pop 11.49
Vit 1.00

pynumwords is a Python library for converting numbers into words. The library currently supports base-N systems, Roman numerals, Morse code, English, Chinese, Hebrew, and Lithuanian.
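
As an illustration of one of the conversions listed above, here is a minimal sketch of integer to Roman numeral conversion in Python. It does not use or reproduce the pynumwords API, which is not documented in this listing; the function name to_roman and the supported range of 1 to 3999 are assumptions chosen for the example.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer in the range 1..3999 to a Roman numeral string.

    Hypothetical helper for illustration; not part of pynumwords.
    """
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        # Take as many copies of this symbol as fit, then continue with the remainder.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


if __name__ == "__main__":
    print(to_roman(2006))  # MMVI
```

The greedy walk over a descending table works because the subtractive forms (CM, CD, XC, XL, IX, IV) are listed as symbols in their own right.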

Download Website Updated 13 Oct 2006 Uplug

Pop 22.45
Vit 1.08

Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, a tokenizer, and external part-of-speech taggers and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese. Other tools such as the TreeTagger can easily be added. Translated documents can be sentence-aligned using the length-based approach by Gale & Church. Words and phrases can be aligned using the clue alignment approach and GIZA++, a toolbox for training statistical alignment models.

No download Website Updated 25 Sep 2006 Connexor Machinese

Pop 30.94
Vit 3.99

Connexor Machinese analyzers process sequences of written words, identify and classify the various entities in them, and show how these relate to each other, marking the language with a simple and systematic notation. Currently, the Machinese product family includes: Machinese Phrase Tagger, a fast, light-weight morphosyntactic tagger; Machinese Syntax, a full-scale dependency parser; Machinese Semantics, a dependency parser with semantic analysis; and Machinese Metadata, an entity extractor.
