193 projects tagged "Linguistic"

Download Website Updated 19 Sep 2004 HumAn Language GENerator

Pop 60.08
Vit 1.51

HALoGEN is an extremely powerful and easy-to-use general-purpose natural language generation system. It consists of a symbolic generator, a forest ranker, and some sample inputs. The symbolic generator includes the Sensus Ontology dictionary, which is based on WordNet. The forest ranker includes a 250-million-word n-gram language model (unigram, bigram, and trigram) trained on Wall Street Journal newspaper text. The symbolic generator is written in Lisp and requires a Lisp interpreter.

Download Website Updated 22 Jul 2002 Linguaphile

Pop 65.45
Vit 1.49

Linguaphile is a simple command-line language translator. It is open source, platform independent, and written in Perl. Linguaphile currently supports the following languages: Afrikaans, Alawa, Albanian, Arrernte, Basque, Belarusian, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, Galician, German, Greek, Hawaiian, Hungarian, Icelandic, Indonesian, Interlingua, Irish, Italian, Kala Lagaw Ya, Korean, Kriol, Latvian, Lithuanian, Malay, Maltese, Maori, Norwegian, Pitjantjatjara, Polish, Portuguese, Romanian, Russian, Samoan, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Thai, Tok Pisin, Turkish, Ukrainian, Warlpiri, and Welsh. The Spanish to English translation is the most useful at this stage.

No download Website Updated 05 Dec 2003 libTextCat

Pop 46.15
Vit 1.45

Libtextcat is a library of functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization". It was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy. Considerable effort went into making this implementation fast and efficient. The language guesser processes over 100 documents/second on a simple PC, which makes it practical for many uses.
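
The rank-order idea behind Cavnar & Trenkle's technique can be sketched in a few lines of Python. This is a simplified illustration only, not libTextCat's code or API: the function names, the underscore padding, and the `max_n` and `top_k` defaults are assumptions made for the example. Each text is reduced to a ranked list of its most frequent character n-grams, and a document is assigned to the language whose profile has the smallest "out-of-place" distance.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Rank the top_k most frequent character n-grams (n = 1..max_n) of text."""
    counts = Counter()
    for token in text.lower().split():
        padded = f"_{token}_"  # mark word boundaries (assumed padding scheme)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Most frequent first; ties broken alphabetically so the ranking is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def guess_language(text, profiles):
    """Return the key of the profile closest to text (profiles: name -> ranked list)."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))
```

In practice each language profile would be built once from a large training corpus and stored; the sketch rebuilds nothing at query time except the profile of the document being classified.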

Download Website Updated 10 Dec 2003 JavaBot for AIM

Pop 26.83
Vit 1.45

JavaBot is a chat bot for the AOL Instant Messenger, MSN Messenger, and Yahoo! Messenger systems. It is designed to engage in conversation with people via IM. It supports full conversation logging and remote administration, which is handled entirely over IM. There is also a mechanism for eavesdropping on conversations other people are having with the bot. The bot's conversational responses can be modified to taste using a simple scripting language.

Download Website Updated 01 Dec 2001 euc2html

Pop 18.17
Vit 1.44

euc2html is a simple application that converts double-byte Japanese (and possibly Chinese/Korean) EUC-encoded characters to HTML 4.0 Unicode entities. It operates on stdin/stdout only, so it is useful for batch updating Web sites, content, etc.
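
The same kind of conversion can be sketched in Python using the standard codecs. This is not euc2html's own code: the function names are hypothetical, and the example assumes EUC-JP input and decimal numeric character references as the output form.

```python
def euc_to_entities(data: bytes, encoding: str = "euc_jp") -> str:
    """Decode EUC bytes and replace every non-ASCII character with &#NNNN;."""
    text = data.decode(encoding)
    # "xmlcharrefreplace" is a standard Python codec error handler that emits
    # decimal numeric character references for unencodable characters.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def convert_stream(src, dst, encoding: str = "euc_jp") -> None:
    """Filter-style helper: read all bytes from src, write converted text to dst."""
    dst.write(euc_to_entities(src.read(), encoding))
```

To use it as a stdin/stdout filter, one could call `convert_stream(sys.stdin.buffer, sys.stdout)` after `import sys`. Passing `"euc_kr"` as the encoding would cover Korean EUC input, since Python ships that codec as well.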

No download Website Updated 24 May 2003 Biaroza

Pop 26.76
Vit 1.44

Biaroza is a multi-dictionary system for human languages that aims to set a standard for this type of software. It works in UTF-8 internally (and, optionally, externally). The software supports querying by particles, customizable input/output filtering, and an interface mode (for use with other software), among other features.

Download Website Updated 05 Mar 2004 MegaLettering

Pop 17.18
Vit 1.44

MegaLettering is a PHP engine created to manage the Italian translation of www.megatokyo.com, but it is written with general use in mind, so it can support any number of languages. Text in balloons is translated using a MySQL database that defines the balloon shapes, the translated text, and the fonts used to render the new text.

Download Website Updated 03 Nov 2002 Marko

Pop 30.05
Vit 1.42

Marko is a simple toolset that lets you create Markov chain databases from a corpus (or two) of text and then compare unknown texts against these databases. For any two Marko databases, you can calculate the probability that an unknown body of text is related to one rather than the other. Possible applications include intelligent mail filtering, plagiarism detection, and historical research.
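
A minimal sketch of this kind of comparison, assuming a first-order word-level Markov chain and add-one smoothing, might look like the following. None of it is taken from Marko itself: the function names and the `vocab_size` parameter are hypothetical, and Marko's actual database format and scoring may differ.

```python
import math
from collections import Counter, defaultdict


def build_chain(text):
    """Count word-to-next-word transitions in text (a first-order Markov chain)."""
    words = text.lower().split()
    chain = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        chain[prev][nxt] += 1
    return chain


def log_likelihood(text, chain, vocab_size):
    """Log-probability of text's transitions under chain, with add-one smoothing."""
    words = text.lower().split()
    total = 0.0
    for prev, nxt in zip(words, words[1:]):
        followers = chain.get(prev, Counter())
        seen = followers[nxt]  # 0 if this transition never occurred
        total += math.log((seen + 1) / (sum(followers.values()) + vocab_size))
    return total


def log_odds(text, chain_a, chain_b, vocab_size):
    """Positive if text fits chain_a better, negative if it fits chain_b better."""
    return (log_likelihood(text, chain_a, vocab_size)
            - log_likelihood(text, chain_b, vocab_size))
```

For mail filtering, for example, `chain_a` could be built from known spam and `chain_b` from legitimate mail, with the sign of `log_odds` deciding which corpus an incoming message resembles more.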

Download Website Updated 23 Apr 2003 BabelKit

Pop 29.34
Vit 1.42

BabelKit is an interface to a universal multilingual database code table. It takes all of the programming work out of maintaining multiple database code definition sets in multiple languages. The code administration and translation page lets developers define new virtual code tables, new languages, enter all codes and their descriptions, and then translate them into all languages of interest. Perl and PHP classes retrieve the code descriptions and automatically generate HTML code selection elements in the user's language. This makes internationalization and localization of Web sites and database interfaces much easier.

Download Website Updated 09 Nov 2004 freli

Pop 30.00
Vit 1.42

FRELI (the Free Repository of English Lexical Information) is a freely redistributable list of English words with associated information (parts of speech, alternate spellings, etc.).
