194 projects tagged "Linguistic"

Download Website Updated 12 Aug 2005 Konjugator

Pop 11.14
Vit 1.00

Konjugator helps with learning or interpreting verb forms in Welsh. It produces a list of around 200,000 inflected verb forms for almost 4,000 Welsh verbs, along with English glosses and parsing information. It attempts to conjugate Welsh verbs that are unknown to it, and will give parsing details for any Welsh verb form that is known to it.

Download Website Updated 13 Oct 2006 Uplug

Pop 22.45
Vit 1.08

Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated into Uplug. Pre-processing tools include a sentence splitter, a tokenizer, and external part-of-speech taggers and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese. Other tools, such as the TreeTagger, can easily be added. Translated documents can be sentence-aligned using the length-based approach of Gale & Church. Words and phrases can be aligned using the clue alignment approach and GIZA++, a toolbox for training statistical alignment models.

No download Website Updated 21 Aug 2005 French Verb Conjugation Rules

Pop 15.87
Vit 53.21

French Verb Conjugation Rules establishes a concise and accurate set of computer-readable French verb conjugation rules. The rules have been placed in an xBase database for easy access. The project is oriented towards language students, developers of computer-assisted language learning software, and computational linguists.

No download Website Updated 10 Sep 2005 I18N

Pop 12.85
Vit 53.02

I18N is a class that gets translation texts from flat files or from an SQL database. The system supports variables in translated strings and has a conversion facility to move data from one container to another. An included tool checks programs against sets of translated strings to detect references without strings or unused strings. Each call checks that referenced variables exist.
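
The lookup and checking behaviour described above can be illustrated with a rough sketch. It is not the I18N class itself: the names `Translator`, `translate`, and `check_keys`, the in-memory dictionary catalog, the `{name}` placeholder syntax, and the sample strings are all assumptions made for illustration.

```python
import string


class Translator:
    """Hypothetical stand-in for a translation class backed by a dict."""

    def __init__(self, catalog):
        # catalog maps language code -> {string key -> template text}
        self.catalog = catalog

    def translate(self, language, key, **variables):
        template = self.catalog[language][key]
        # Collect the placeholder names the template refers to.
        needed = {
            field
            for _, field, _, _ in string.Formatter().parse(template)
            if field
        }
        # Each call checks that every referenced variable was supplied.
        missing = needed - set(variables)
        if missing:
            raise KeyError(f"missing variables for {key!r}: {sorted(missing)}")
        return template.format(**variables)


def check_keys(catalog_keys, referenced_keys):
    """Return (references without strings, strings that are never used)."""
    catalog_keys = set(catalog_keys)
    referenced_keys = set(referenced_keys)
    return referenced_keys - catalog_keys, catalog_keys - referenced_keys


# Sample data, invented for this sketch.
catalog = {
    "en": {"greeting": "Hello, {name}!"},
    "cy": {"greeting": "Helo, {name}!"},
}
translator = Translator(catalog)
welsh_greeting = translator.translate("cy", "greeting", name="Ann")
```

A real implementation would load the catalog from flat files or an SQL database rather than a literal dictionary; the checks themselves would stay the same.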

No download Website Updated 17 Sep 2005 Sikher

Pop 13.15
Vit 52.95

Sikher is a desktop program designed to archive, search, and display the Sikh scriptures using advanced functions. It allows the common person to understand and read the messages contained in the Sikh scriptures through translations and transliterations in different languages, thereby breaking the language and geographical barriers between Gurbani (Sikh Scriptures) and the world. Sikher is a robust, future-proof, and cross-platform application which may be used by developers to create similar internationalized and localized search applications.

Download Website Updated 01 Oct 2005 text2phonome

Pop 26.55
Vit 1.00

text2phonome is a class that can be used to convert English text to its phoneme representation. It provides options for delimiting sentences, words, and the phonemes themselves for further processing.

No download Website Updated 19 Jan 2009 Unicode.php

Pop 17.44
Vit 3.23

The CentralNic Unicode Library (Unicode.php) provides some PHP classes for manipulating Unicode data. These classes are general purpose, but are intended for use when working with Internationalised Domain Names (IDNs).

Download Website Updated 30 Jul 2007 prbeditor

Pop 58.58
Vit 4.25

prbeditor is an editor for Java property resource bundle files. The application's intent is to help in the localization (l10n) of programs that have been internationalized with Java's standard i18n mechanism. In contrast to other similar tools, it shows the keys and values of several languages at the same time in a spreadsheet, giving a global view of the resource files. The tool relies on regular expressions to organize the keys and filter the visibility of the files. It includes a spell checker for several languages, based on word lists which may be downloaded separately.

Download Website Updated 18 Sep 2010 Linguistic Tree Constructor

Pop 77.46
Vit 6.04

Linguistic Tree Constructor is an application for drawing linguistic syntax trees. Its main strength is assisting in data production by quickly analyzing large amounts of text. "Generic" trees are supported, as well as RRG and X-Bar trees. Node categories are user-definable, and additional user-definable labels can also be applied to each node. Publication-quality, high-resolution, horizontal trees can be drawn. The file format is based on TIGER-XML.

Download Website Updated 12 Dec 2008 WordGenerator

Pop 68.74
Vit 4.11

WordGenerator generates hypothetical words from specifications of their syllable structure. The user specifies the maximum length of the words in syllables, the abstract structure of syllables in the language (in terms of such units as consonants and vowels or onsets and rhymes), and the actual sounds that comprise each abstract class (e.g. the list of vowels in the language); WordGenerator then generates the words that conform to this specification. Such lists are useful to field linguists exploring the vocabulary of a language, and to designers of artificial languages.
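
The generation step described above can be sketched roughly as follows. This is not WordGenerator's own code or input format: the function name `generate_words`, the pattern strings such as "CV", and the tiny sound inventory are assumptions chosen for illustration.

```python
from itertools import product


def generate_words(syllable_patterns, classes, max_syllables):
    """Yield every word of 1..max_syllables syllables.

    syllable_patterns: abstract syllable shapes, e.g. "CV" or "V"
    classes: maps each abstract symbol to the sounds it stands for
    """
    # Expand each abstract pattern into its concrete syllables.
    syllables = []
    for pattern in syllable_patterns:
        for sounds in product(*(classes[symbol] for symbol in pattern)):
            syllable = "".join(sounds)
            if syllable not in syllables:
                syllables.append(syllable)
    # Concatenate syllables into words of increasing length.
    for length in range(1, max_syllables + 1):
        for combination in product(syllables, repeat=length):
            yield "".join(combination)


# Hypothetical specification: two consonants, two vowels, CV or V syllables.
classes = {"C": ["p", "t"], "V": ["a", "i"]}
words = list(generate_words(["CV", "V"], classes, max_syllables=2))
```

The number of words grows exponentially with the maximum length in syllables, so a lazy generator is used here instead of building the whole list up front.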
