193 projects tagged "Linguistic"

Download No website Updated 31 Mar 2004 jgram

Pop 18.97
Vit 1.00

jgram is a simple Markov chain library suitable for building rudimentary n-gram representations of sequences of arbitrary Java objects. It has an extensible scoring mechanism and a generator that traverses a random path through n-gram states based on transition scores.
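As a rough illustration of the general technique (not jgram's actual API, which this listing does not describe), the sketch below builds an n-gram transition table over any hashable items and walks a random path through it. It assumes raw transition counts as the score; the names `build_model` and `generate` are hypothetical.

```python
import random
from collections import defaultdict


def build_model(sequence, n=2):
    """Count transitions from each (n-1)-item state to the item that follows it."""
    if n < 2:
        raise ValueError("n must be at least 2")
    order = n - 1
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - order):
        state = tuple(sequence[i:i + order])
        counts[state][sequence[i + order]] += 1
    return counts


def generate(counts, start, length, rng=None):
    """Walk a random path, choosing each next item in proportion to its count."""
    rng = rng or random.Random()
    state = tuple(start)
    out = list(state)
    for _ in range(length):
        followers = counts.get(state)
        if not followers:
            break  # dead end: nothing was ever observed after this state
        items = list(followers)
        weights = [followers[item] for item in items]
        out.append(rng.choices(items, weights=weights, k=1)[0])
        state = tuple(out[-len(state):])
    return out


# Assumed toy input: a short sequence of words treated as opaque items.
words = "the cat sat on the mat and the cat ran".split()
model = build_model(words, n=2)
path = generate(model, start=["the"], length=5, rng=random.Random(0))
```

Swapping the raw counts for another weighting function would be one way to approximate the "extensible scoring" idea, though how jgram itself does this is not stated here.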

No download Website Updated 18 Apr 2004 Mind AI

Pop 47.12
Vit 1.00

The purpose of Mind AI is to build an artificial mind based on some advanced concepts: machine learning, representation and meta representation of concepts, concept reflection, reification (concept to meta concept), and denotation (meta concept to concept), and to explore some new concepts. Interaction with the AI is done via IRC.

Download Website Updated 25 May 2004 hiercat

Pop 26.94
Vit 1.00

Hiercat is an automatic text classifier that uses a keyword hierarchy to improve categorization. It uses the generative probabilistic model of Gaussier et al. (2001) as its document model.

Download Website Updated 21 Jul 2004 Mobile i2e

Pop 22.72
Vit 1.00

Mobile i2e is a MIDP adaptation of the Linux translator "i2e". It supports i2e and idp dictionary file types. Support for new dictionary file types can be added with little effort.

Download Website Updated 26 Sep 2004 tspell

Pop 20.49
Vit 1.00

Tspell is a library and a set of applications for solving computational problems in Turkish natural language processing (NLP). Turkish has a morphological and grammatical structure very different from that of Indo-European languages such as English. Because it is an agglutinative language, like Finnish, even building a simple spell checker is challenging. Target problems include a spell checker, a word analyzer that determines roots and suffixes, and a word constructor based on suffixes.

No download Website Updated 30 Jan 2005 libtranslate

Pop 21.45
Vit 1.00

libtranslate is a library for translating text and Web pages between natural languages. Its modular infrastructure allows the user to implement new translation services separately from the core library. libtranslate ships with a generic module that supports Web-based translation services such as Babel Fish, Google Language Tools, and SYSTRAN. The generic module also allows new services to be added simply by adding a few lines to an XML file. The libtranslate distribution includes a powerful command-line interface.

Download Website Updated 30 Jan 2005 GNOME Translate

Pop 24.86
Vit 1.00

GNOME Translate is a GNOME interface to libtranslate. It can translate a text or Web page between several natural languages, and it can automatically detect the source language as you type in text.

Download Website Updated 18 Mar 2005 Universal Text Recognizer and Converter

Pop 41.83
Vit 1.00

The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (which depends on the file and on the system's locale, and is reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
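The listing does not say how Utrac performs its recognition, so the following is only a naive sketch of the three tasks it names: guessing the end-of-line type, guessing the encoding, and converting to another charset and end-of-line type. It assumes a simple "first candidate that decodes cleanly" heuristic, which is far cruder than a real recognizer; the function names are hypothetical.

```python
def detect_eol(data: bytes) -> str:
    """Return the most frequent end-of-line type: 'CRLF', 'CR', 'LF', or 'none'."""
    crlf = data.count(b"\r\n")
    counts = {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,  # lone carriage returns only
        "LF": data.count(b"\n") - crlf,  # lone line feeds only
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none"


def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "iso-8859-1")) -> str:
    """Return the first candidate that decodes without error.

    ISO-8859-1 accepts every byte sequence, so it acts as a catch-all here
    and must come last; this cannot tell it apart from CP437 and similar.
    """
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"


def convert(data: bytes, src: str, dst: str, eol: str = "\n") -> bytes:
    """Re-encode data from src to dst and normalize line endings to eol."""
    text = data.decode(src)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", eol).encode(dst)


sample = b"caf\xe9\r\nbar\r\n"  # assumed input: ISO-8859-1 text with CRLF endings
eol_type = detect_eol(sample)
encoding = guess_encoding(sample)
converted = convert(sample, encoding, "utf-8", eol="\n")
```

Distinguishing single-byte encodings from one another generally needs statistical or locale-based hints, which is presumably why the description mentions the system's locale and manual verification.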

No download No website Updated 07 Apr 2005 Lost in Translation

Pop 16.00
Vit 1.00

Lost in Translation is a steganographic encoder that hides information in the "noise" created by automatic translation of natural language documents. Because natural language translation inherently leaves plenty of room for variation, it is well suited to steganographic applications. And because legitimate automatic translations frequently contain errors, additional errors inserted by an information-hiding mechanism are plausibly undetectable and appear to be part of the normal noise associated with translation.
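The description does not spell out the encoding scheme, so the sketch below shows only one simple way the idea could work: if each source sentence has several plausible translations, the choice among them can carry hidden bits. The variant lists are hypothetical stand-ins for the output of several translation engines, and `embed` and `extract` are illustrative names, not the project's API.

```python
def capacity(options):
    """Number of whole bits one sentence can carry: floor(log2(len(options)))."""
    return len(options).bit_length() - 1


def embed(variants, bits):
    """Pick one translation per sentence; the index of each pick encodes bits."""
    chosen, pos = [], 0
    for options in variants:
        width = capacity(options)
        chunk = bits[pos:pos + width].ljust(width, "0")  # pad when bits run out
        index = int(chunk, 2) if width else 0
        chosen.append(options[index])
        pos += width
    return chosen


def extract(variants, chosen):
    """Recover the bit string by looking up which variant was picked."""
    bits = []
    for options, sentence in zip(variants, chosen):
        width = capacity(options)
        if width:
            bits.append(format(options.index(sentence), "0{}b".format(width)))
    return "".join(bits)


# Hypothetical candidate translations for three sentences (each list non-empty).
variants = [
    ["Hello.", "Hi."],
    ["How are you?", "How do you do?", "How goes it?", "Are you well?"],
    ["Goodbye."],
]
cover_text = embed(variants, "110")
recovered = extract(variants, cover_text)
```

Both sides need the same variant lists in the same order for extraction to work, so a practical scheme would also have to address how sender and receiver agree on them.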

Download Website Updated 12 May 2005 Tomabaem

Pop 13.53
Vit 1.00

Tomabaem is a substitute for the system's Character Palette, at least for people focusing on the so-called CJKV languages (Chinese, Japanese, Korean, and Vietnamese). Tomabaem, like Unicode, is cross-language. Whatever you are looking for related to Chinese characters, there's a high chance that Tomabaem has a way of looking it up, whether by the Cantonese pronunciation, the UTF-16 code point, the radical, the meaning, or the character itself, which you can copy/paste or drag'n'drop from another document. It uses the UniHan.txt file from the Unicode Consortium as the basis of the data shown.
