194 projects tagged "Linguistic"
Konjugator helps with learning or interpreting verb forms in Welsh. It produces a list of around 200,000 inflected verb forms for almost 4,000 Welsh verbs, along with English glosses and parsing information. It attempts to conjugate Welsh verbs that are unknown to it, and will give parsing details for random Welsh verb forms if these are known to it.
Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, tokenizer, and external part-of-speech tagger and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese. Other tools such as the TreeTagger can easily be added. Translated documents can be sentence aligned using the length-based approach by Gale & Church. Words and phrases can be aligned using the clue alignment approach and the toolbox for training statistical alignment models GIZA++.
French Verb Conjugation Rules establishes a concise and accurate set of computer readable French verb conjugation rules. The rules have been placed in an xbase database for easy access. The project is oriented towards language students, developers of computer assisted language learning software, and computational linguists.
I18N is a class that gets translation texts from flat files or from an SQL database. The system supports variables in translated strings and has a conversion facility to move data from one container to another. An included tool checks programs against sets of translated strings to detect references without strings or unused strings. Each call checks that referenced variables exist.
Sikher is a desktop program designed to archive, search, and display the Sikh scriptures using advanced functions. It allows the common person to understand and read the messages contained in the Sikh scriptures through translations and transliterations in different languages, thereby breaking the language and geographical barrier between Gurbani (Sikh Scriptures) and the world. Sikher is a robust, future proof, and cross-platform application which may be used by developers to create similar internationalized and localized search applications.
prbeditor is an editor for Java property resource bundle files. The application's intent is to help in the localization (l10n) of those programs that have been internationalized with Java's standard i18n mechanism. In contrast to other similar tools, it shows the keys and values of several languages at the same time in a spreadsheet, giving a global view of the resource files. The tool relies on the application of regular expresions to organize the keys and filter the visibility of the files. It includes a spell checker for several languages, based on word lists which may be downloaded separately.
Linguistic Tree Constructor is an application for drawing linguistic syntax trees. Its main strength is assisting in data production by quickly analyzing large amounts of text. "Generic" trees are supported, as well as RRG and X-Bar trees. Node-categories are user-definable, and additional user-definable labels can also be applied to each node. Publication-quality, high-resolution, horizontal trees can be drawn. The file format is based on TIGER-XML.
WordGenerator generates hypothetical words from specifications of their syllable structure. The user specifies the maximum length of the words in syllables, the abstract structure of syllables in the language (in terms of such units as consonants and vowels or onsets and rhymes), and the actual sounds that comprise each abstract class (e.g. the list of vowels in the language); WordGenerator then generates the words that conform to this specification. Such lists are useful to field linguists exploring the vocabulary of a language, and to designers of artificial languages.