16 projects tagged "Linguistic"

I18N (updated 10 Sep 2005)
Pop 12.85
Vit 52.99

I18N is a class that gets translation texts from flat files or from an SQL database. The system supports variables in translated strings and has a conversion facility to move data from one container to another. An included tool checks programs against sets of translated strings to detect references without strings or unused strings. Each call checks that referenced variables exist.
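
The two checks described above can be illustrated with a minimal Python sketch. It is not the I18N class itself: the function names `translate` and `audit`, the `$name` placeholder syntax, and the sample catalog are all assumptions made for illustration.

```python
from string import Template


def translate(catalog, key, **variables):
    """Look up a translated string and fill in its variables (hypothetical helper)."""
    # Template.substitute raises KeyError when a placeholder in the
    # translated string has no matching variable, so every call verifies
    # that the referenced variables exist.
    return Template(catalog[key]).substitute(**variables)


def audit(referenced_keys, catalog):
    """Compare keys used by a program with the keys present in a catalog.

    Returns (references without strings, unused strings), both sorted.
    """
    referenced = set(referenced_keys)
    available = set(catalog)
    return sorted(referenced - available), sorted(available - referenced)


# Sample data, invented for this sketch.
catalog_fr = {"greeting": "Bonjour, $name !", "farewell": "Au revoir"}

message = translate(catalog_fr, "greeting", name="Ana")
missing, unused = audit(["greeting", "title"], catalog_fr)
```

Here `missing` lists keys the program references that have no translation, and `unused` lists translations that no program reference uses.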

queXC (updated 26 Apr 2013)
Pop 109.44
Vit 16.29

queXC is a Web-based data cleaning and coding/classification system that takes a data file (such as data collected from a questionnaire) and cleans the text input fields by fixing their spacing and spell checking them. It allows operators to code text fields to existing coding schemes, or to create a coding scheme on the fly. Multiple operators can code and clean simultaneously, and operators can be assigned to particular codes. The queXC system includes some coding schemes created from ABS (Australian Bureau of Statistics) data. It can be used as an open source replacement for NVivo in some situations.

Glossword (updated 06 Apr 2010)
Pop 151.25
Vit 7.88

Glossword is a system to publish dictionaries, glossaries, and encyclopedias. It features an installation wizard, support for multiple languages, visual themes, multi-domain installation, an administrative interface with multi-user support, built-in search and cache engines, the ability to export/import dictionaries in XML format, and W3C-validated code. Glossword is useful for any sort of dictionary-like content, including sites with game cheat codes, online translators, references, and various kinds of CMS solutions.

Open Translation Engine (updated 22 Dec 2008)
Pop 84.10
Vit 5.83

Open Translation Engine (OTE) is a Web-based system to enable community management of translation dictionaries.

Unicode.php (updated 19 Jan 2009)
Pop 17.38
Vit 3.23

The CentralNic Unicode Library (Unicode.php) provides some PHP classes for manipulating Unicode data. These classes are general purpose, but are intended for use when working with Internationalised Domain Names (IDNs).

MegaLettering (updated 05 Mar 2004)
Pop 17.94
Vit 1.44

MegaLettering is a PHP engine created to manage the Italian translation of www.megatokyo.com, but it is written with general use in mind, so it can support any number of languages. Text in balloons can be translated by using a MySQL database that defines the balloon shapes, the translated text, and the fonts used to render the new text.

BabelKit (updated 23 Apr 2003)
Pop 30.55
Vit 1.42

BabelKit is an interface to a universal multilingual database code table. It takes all of the programming work out of maintaining multiple database code definition sets in multiple languages. The code administration and translation page lets developers define new virtual code tables, new languages, enter all codes and their descriptions, and then translate them into all languages of interest. Perl and PHP classes retrieve the code descriptions and automatically generate HTML code selection elements in the user's language. This makes internationalization and localization of Web sites and database interfaces much easier.

Uplug (updated 13 Oct 2006)
Pop 22.72
Vit 1.08

Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, a tokenizer, and external part-of-speech taggers and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese. Other tools such as the TreeTagger can easily be added. Translated documents can be sentence aligned using the length-based approach by Gale & Church. Words and phrases can be aligned using the clue alignment approach and GIZA++, a toolbox for training statistical alignment models.

Convert character set (updated 28 Dec 2003)
Pop 22.00
Vit 1.00

Convert character set converts text strings between different character set encodings. It features conversion between single-byte character sets, from single-byte to multi-byte character sets (UTF-8), and from multi-byte to single-byte. All conversion output can be saved with numeric entities (browser character set independent). The main requirement is that a character has to exist in both character sets; otherwise the conversion returns an error.
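
The conversion rule above can be sketched in Python using the standard codecs. This is an illustration of the idea, not the project's own code: the function name `convert_charset` and its `entities` flag are assumptions.

```python
def convert_charset(data: bytes, source: str, target: str,
                    entities: bool = False) -> bytes:
    """Re-encode `data` from the `source` encoding to the `target` encoding.

    Hypothetical helper: the name and parameters are invented for this sketch.
    """
    text = data.decode(source)
    if entities:
        # Emit plain ASCII and write every other character as a numeric
        # entity (&#NNN;), so the output no longer depends on the
        # character set the browser assumes.
        return text.encode("ascii", errors="xmlcharrefreplace")
    # Strict mode: a character that is missing from the target character
    # set raises UnicodeEncodeError instead of being silently dropped.
    return text.encode(target, errors="strict")


latin1_bytes = "café".encode("latin-1")
as_utf8 = convert_charset(latin1_bytes, "latin-1", "utf-8")
as_entities = convert_charset(latin1_bytes, "latin-1", "utf-8", entities=True)
```

Converting a string that contains a character absent from the target set, such as the euro sign when targeting Latin-1, raises `UnicodeEncodeError` in strict mode.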

Lost in Translation (updated 07 Apr 2005)
Pop 16.37
Vit 1.00

Lost in Translation is a steganographic encoder that embeds information in the "noise" created by automatic translation of natural language documents. Because natural language translation inherently creates plenty of room for variation, it is ideal for steganographic applications. Also, because legitimate automatic text translations frequently contain errors, additional errors inserted by an information hiding mechanism are plausibly undetectable and would appear to be part of the normal noise associated with translation.
