Mind AI aims to build an artificial mind based on several advanced concepts: machine learning, representation and meta-representation of concepts, concept reflection, reification (concept to meta-concept), and denotation (meta-concept to concept). It also aims to explore new concepts. Interaction with the AI takes place over IRC.
Tspell is a library and a set of applications for solving computational problems in Turkish Natural Language Processing (NLP). Turkish has a morphological and grammatical structure very different from that of Indo-European languages such as English. Because it is an agglutinative language, like Finnish, even building a simple spell checker is challenging. Target problems include a spell checker, a word analyzer that determines roots and suffixes, and a word constructor based on suffixes, among others.
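To illustrate why root-and-suffix analysis matters for an agglutinative language, here is a toy sketch of recursive suffix stripping. It is not Tspell's algorithm or API. The `ROOTS` and `SUFFIXES` tables and the `analyze` function are hypothetical names invented for this example, and the sketch ignores vowel harmony and consonant changes, which a real Turkish analyzer would have to handle.

```python
# Toy lexicon and suffix inventory (hypothetical, far from complete).
ROOTS = {"ev", "kitap"}
SUFFIXES = ["ler", "lar", "imiz", "ımız", "de", "da"]


def analyze(word):
    """Return (root, [suffixes in surface order]) or None if nothing matches."""
    if word in ROOTS:
        return word, []
    for suffix in SUFFIXES:
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and stem:
            # Peel one suffix off the end, then analyze what remains.
            result = analyze(stem)
            if result is not None:
                root, suffixes = result
                return root, suffixes + [suffix]
    return None


print(analyze("evlerimizde"))     # ev + ler + imiz + de ("in our houses")
print(analyze("kitaplarımızda"))  # kitap + lar + ımız + da ("in our books")
```

Because the sketch has no vowel-harmony check, it would also accept ill-formed words such as "evlarimizda", which is one reason a real spell checker for Turkish is much harder than a dictionary lookup.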
libtranslate is a library for translating text and Web pages between natural languages. Its modular infrastructure allows the user to implement new translation services separately from the core library. libtranslate is shipped with a generic module that supports Web-based translation services such as Babel Fish, Google Language Tools, and SYSTRAN. Moreover, the generic module allows new services to be added simply by adding a few lines to an XML file. The libtranslate distribution includes a powerful command line interface.
The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (based on the file contents and the system's locale, and reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
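The kind of detection described above can be sketched in a few lines. This is a simplified illustration, not Utrac's method or its C API: the function names `detect_eol`, `looks_like_utf8`, and `convert_eol` are invented for the example, and the UTF-8 check is only a validity test, whereas telling ISO-8859-1 from CP437 needs statistical or locale-based heuristics that are not attempted here.

```python
def detect_eol(data: bytes) -> str:
    """Return 'CRLF', 'CR', 'LF', 'mixed', or 'none' for a byte string."""
    crlf = data.count(b"\r\n")
    # Subtract CRLF pairs so that lone CR and lone LF are counted separately.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    present = [name for name, n in (("CRLF", crlf), ("CR", cr), ("LF", lf)) if n > 0]
    if not present:
        return "none"
    if len(present) > 1:
        return "mixed"
    return present[0]


def looks_like_utf8(data: bytes) -> bool:
    """True if the bytes decode as UTF-8 without error (ASCII also passes)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_eol(data: bytes, target: bytes) -> bytes:
    """Normalize every line ending to LF, then rewrite it as `target`."""
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", target)


print(detect_eol(b"one\r\ntwo\r\n"))
print(looks_like_utf8("café".encode("latin-1")))
```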
Lost in Translation is a steganographic encoder that embeds information in the "noise" created by automatic translation of natural language documents. Because natural language translation inherently leaves plenty of room for variation, it is well suited to steganographic applications. Also, because legitimate automatic translations frequently contain errors, additional errors inserted by an information hiding mechanism are plausibly undetectable and would appear to be part of the normal noise associated with translation.
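The general idea of hiding bits in translation variation can be sketched as follows. This is an illustration only, not the tool's actual protocol: it assumes sender and receiver can both produce the same ordered list of candidate translations for each sentence, and the `embed` and `extract` functions and the sample sentences are invented for the example.

```python
from typing import List


def capacity(variants: List[str]) -> int:
    """Whole bits one sentence can carry: floor(log2(number of variants))."""
    return len(variants).bit_length() - 1


def embed(bits: str, candidates: List[List[str]]) -> List[str]:
    """Choose one variant per sentence so that the choices encode `bits`."""
    chosen, pos = [], 0
    for variants in candidates:
        n = capacity(variants)
        chunk = bits[pos:pos + n].ljust(n, "0")  # pad the tail with zeros
        chosen.append(variants[int(chunk, 2) if n else 0])
        pos += n
    if pos < len(bits):
        raise ValueError("not enough sentences to carry the message")
    return chosen


def extract(text: List[str], candidates: List[List[str]], n_bits: int) -> str:
    """Recover the hidden bits from the index of each chosen variant."""
    bits = ""
    for sentence, variants in zip(text, candidates):
        n = capacity(variants)
        if n:
            bits += format(variants.index(sentence), "0{}b".format(n))
    return bits[:n_bits]


# Hypothetical candidate translations for three source sentences.
CANDIDATES = [
    ["The weather is nice today.", "Today the weather is nice."],
    ["I have a question.", "I've got a question.",
     "I have one question.", "A question, I have."],
    ["See you tomorrow."],
]

stego = embed("110", CANDIDATES)
print(stego)
print(extract(stego, CANDIDATES, 3))
```

Each sentence with 2^k usable variants carries k bits, and every emitted sentence is a plausible translation, so the hidden message sits in the choice among variants rather than in any visible marker.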
Tomabaem is a substitute for the system's Character Palette, aimed at people working with the so-called CJKV languages (Chinese, Japanese, Korean, and Vietnamese). Tomabaem, like Unicode, is cross-language. Whatever you are looking for related to Chinese characters, there is a good chance that Tomabaem has a way of looking it up, whether by the Cantonese pronunciation, the UTF-16 codepoint, the radical, the meaning, or the character itself, which you can copy/paste or drag'n'drop from another document. It uses the UniHan.txt file from the Unicode Consortium as the basis for the data shown.
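A lookup of this kind can be sketched from the Unihan data layout, in which each line holds a codepoint, a field name, and a value separated by tabs. The sketch below is not Tomabaem's code: `parse_unihan` and `find_by` are hypothetical helpers, and the sample lines are illustrative values written in that layout, to be checked against the real file before relying on them.

```python
from collections import defaultdict

# Illustrative lines in the Unihan layout: codepoint<TAB>field<TAB>value
SAMPLE = (
    "# comment lines start with a hash\n"
    "U+4E2D\tkCantonese\tzung1\n"
    "U+4E2D\tkDefinition\tcentral; center, middle\n"
    "U+4E2D\tkRSUnicode\t2.3\n"
)


def parse_unihan(text):
    """Build {character: {field: value}} from Unihan-style lines."""
    table = defaultdict(dict)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        codepoint, field, value = line.split("\t", 2)
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> chr(0x4E2D)
        table[char][field] = value
    return table


def find_by(table, field, value):
    """Characters whose space-separated `field` value contains `value`."""
    return [ch for ch, props in table.items()
            if value in props.get(field, "").split()]


table = parse_unihan(SAMPLE)
print(find_by(table, "kCantonese", "zung1"))
print(table[chr(0x4E2D)]["kDefinition"])
```

Indexing the same table by other fields (radical, definition keywords, codepoint) gives the different lookup routes the description mentions.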