French Verb Conjugation Rules establishes a concise and accurate set of computer-readable French verb conjugation rules. The rules are stored in an xBase database for easy access. The project is aimed at language students, developers of computer-assisted language learning software, and computational linguists.
Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated into Uplug. Pre-processing tools include a sentence splitter, a tokenizer, and external part-of-speech taggers and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese. Other tools, such as the TreeTagger, can easily be added. Translated documents can be sentence-aligned using the length-based approach of Gale & Church. Words and phrases can be aligned using the clue alignment approach and GIZA++, a toolbox for training statistical alignment models.
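To illustrate the idea behind length-based sentence alignment, here is a minimal Python sketch. It is not Uplug's code and it does not implement Gale & Church's probabilistic cost model: as a simplifying assumption, it scores each candidate grouping by the absolute difference in character length plus a fixed penalty. The function name `align_by_length` and the penalty values are hypothetical.

```python
def align_by_length(src, tgt, skip_penalty=10, merge_penalty=3):
    """Align two sentence lists by character length using dynamic programming.

    Simplified stand-in for length-based alignment: the cost of a grouping is
    the absolute length difference plus a fixed penalty (assumed values).
    """
    # Allowed groupings: (sentences from src, sentences from tgt, penalty).
    beads = [(1, 1, 0), (1, 0, skip_penalty), (0, 1, skip_penalty),
             (2, 1, merge_penalty), (1, 2, merge_penalty)]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                # Skipped sentences pay only the fixed penalty.
                mismatch = abs(len_src - len_tgt) if di and dj else 0
                candidate = cost[i][j] + penalty + mismatch
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the alignment.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return pairs[::-1]
```

Each returned pair holds the source sentences and the target sentences grouped together, so a source sentence translated as two target sentences shows up as a one-to-two pair.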
Konjugator helps with learning or interpreting verb forms in Welsh. It produces a list of around 200,000 inflected verb forms for almost 4,000 Welsh verbs, along with English glosses and parsing information. It attempts to conjugate Welsh verbs that are unknown to it, and gives parsing details for any given Welsh verb form that is known to it.
Transolution is a Computer-Aided Translation (CAT) suite supporting the XLIFF standard. It provides the open source community with features and concepts that commercial offerings have used for years to improve translation efficiency and quality. The suite is modular for flexibility and provides an XLIFF editor, a translation memory engine, and filters to convert different formats to and from XLIFF. The use of XLIFF means that almost any content can be localized as long as there is a filter for it (XML, SGML, PO, RTF, StarOffice/OpenOffice, etc.).
Pure PHP Spell Check performs spell-checking of text using only base PHP functions, without using specific spell check PHP extensions such as aspell or pspell. The class uses a dictionary that is implemented as an array-based binary search table. The binary search table declaration is saved to a file for speed and can be updated easily by the developer.
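The array-based binary search lookup described above can be sketched as follows. This is an illustration in Python rather than the project's PHP code, and the names `build_table`, `is_known`, and `misspelled` are hypothetical; the sketch assumes a case-insensitive dictionary and whitespace-separated words.

```python
from bisect import bisect_left


def build_table(words):
    """Return a sorted, de-duplicated list that can be searched by bisection."""
    return sorted({w.lower() for w in words})


def is_known(table, word):
    """Binary-search the sorted table for the lowercased word."""
    w = word.lower()
    i = bisect_left(table, w)
    return i < len(table) and table[i] == w


def misspelled(table, text):
    """Return the words of the text that are not in the table."""
    # Strip common punctuation from both ends of each token (assumed rule).
    words = (t.strip(".,;:!?\"'()") for t in text.split())
    return [w for w in words if w and not is_known(table, w)]
```

Because the table is built once and kept sorted, each lookup takes a logarithmic number of comparisons, which is why saving the prepared table to a file pays off.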
The Computational Linguistics Toolset is a set of tools for computational linguistics. It contains reusable code for cleaning, splitting, refining, and taking samples from corpora (ICE, Penn, and a native one), for tagging them using the TnT tagger, for doing permutation statistics on N-grams (useful for finding statistically significant syntactic differences between any two sets of tagged texts), and various examination tools. The tools themselves are well documented.
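As a rough illustration of permutation statistics on N-grams, the Python sketch below runs a generic Monte Carlo permutation test on the relative frequency of one tag N-gram in two sets of tagged sentences. The toolset's actual procedure is not described here, so the test statistic, the function names, and the defaults for `rounds` and `seed` are all assumptions.

```python
import random


def ngram_count(sentences, ngram):
    """Count occurrences of the tag n-gram across all sentences."""
    n = len(ngram)
    total = 0
    for tags in sentences:
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == ngram:
                total += 1
    return total


def rate(sentences, ngram):
    """Occurrences of the n-gram divided by the number of n-gram positions."""
    n = len(ngram)
    positions = sum(max(len(tags) - n + 1, 0) for tags in sentences)
    return ngram_count(sentences, ngram) / positions if positions else 0.0


def permutation_p_value(set_a, set_b, ngram, rounds=1000, seed=0):
    """Estimate how often a random regrouping differs as much as the real one."""
    rng = random.Random(seed)
    observed = abs(rate(set_a, ngram) - rate(set_b, ngram))
    pooled = list(set_a) + list(set_b)
    hits = 0
    for _ in range(rounds):
        # Reassign sentences to the two groups at random, keeping group sizes.
        rng.shuffle(pooled)
        a, b = pooled[:len(set_a)], pooled[len(set_a):]
        if abs(rate(a, ngram) - rate(b, ngram)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)
```

Each sentence is represented as a list of tags, for example `["DT", "NN", "VB"]`, and the N-gram as a tuple such as `("DT", "NN")`. A small p-value suggests that the N-gram is used at different rates in the two sets.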
IPA-CXS/X-Sampa Converter is a selection of modules for various programming languages (C, Perl, Lisp, and Python) for translating between IPA (International Phonetic Alphabet) and its ASCII versions, in particular CXS, which is a close relative of X-SAMPA. The project homepage contains a demo that uses the Perl script as an online converter.
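A table-driven conversion of this kind might look like the following Python sketch. It is not taken from the project's modules. The table is a small subset using standard X-SAMPA values; CXS differs from X-SAMPA for some symbols, so a real CXS table would need its own entries. The function names are hypothetical, and symbols missing from the table pass through unchanged.

```python
# Small subset of IPA symbols with their X-SAMPA equivalents (assumed table).
IPA_TO_XSAMPA = {
    "ʃ": "S", "ʒ": "Z", "θ": "T", "ð": "D", "ŋ": "N",
    "ə": "@", "ɪ": "I", "ʊ": "U", "ɛ": "E", "ɔ": "O",
    "ː": ":",
}
# The reverse table works here because every ASCII value is unique.
XSAMPA_TO_IPA = {ascii_sym: ipa for ipa, ascii_sym in IPA_TO_XSAMPA.items()}


def ipa_to_ascii(text):
    """Replace each known IPA character with its ASCII equivalent."""
    return text.translate(str.maketrans(IPA_TO_XSAMPA))


def ascii_to_ipa(text):
    """Replace each known ASCII symbol with its IPA character."""
    return text.translate(str.maketrans(XSAMPA_TO_IPA))
```

A character-by-character table is enough for this subset because every entry is a single character on both sides. Multi-character ASCII symbols would need longest-match handling instead.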
Xlit converts text from one writing system into another. It allows the user to define a transliteration simply by typing the input strings in one window and the strings to which they are to be mapped in another. Transliteration may be restricted to regions bounded by specified delimiters or their complements. Transliteration may also be performed by external commands or plugins. Xlit can also convert one type of delimiter to another, e.g. from HZ escapes to XML. Xlit can read and write transliteration definitions in its own format and as Yudit keymaps. It can be run in batch mode without the GUI.
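The mapping of input strings to output strings, and the restriction to delimited regions, can be sketched in Python as below. This is an illustration of the idea only, not Xlit's implementation: the function names are hypothetical, the sketch assumes a non-empty list of source strings, and it prefers the longest matching source string at each position.

```python
import re


def make_transliterator(sources, targets):
    """Build a function that maps each source string to its paired target."""
    mapping = dict(zip(sources, targets))
    # Try longer source strings first so that "sh" wins over "s".
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


def transliterate_regions(text, translit, open_delim, close_delim):
    """Apply translit only to text between open_delim and close_delim."""
    region = re.compile(
        re.escape(open_delim) + "(.*?)" + re.escape(close_delim), re.DOTALL
    )
    return region.sub(
        lambda m: open_delim + translit(m.group(1)) + close_delim, text
    )
```

For example, with sources `["sh", "s", "a"]` and targets `["ш", "с", "а"]`, the transliterator turns `sasha` into `саша`, and `transliterate_regions` with `<` and `>` as delimiters leaves everything outside the angle brackets untouched.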