uni2ascii and ascii2uni provide conversion in both directions between UTF-8 Unicode and more than thirty 7-bit ASCII equivalents, including RFC 2396 URI format and RFC 2045 Quoted Printable format, the representations used in HTML, SGML, XML, OOXML, the Unicode standard, Rich Text Format, POSIX portable charmaps, POSIX locale specifications, and Apache log files. They can also convert between the escapes used for Unicode in languages such as Ada, C, Common Lisp, Java, Pascal, Perl, PostScript, Python, Scheme, and Tcl.
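To illustrate what a 7-bit ASCII equivalent looks like, here is a minimal Python sketch of two such representations: backslash-u escapes and HTML hexadecimal character references. It is not taken from uni2ascii or ascii2uni and does not reproduce their options or exact output; the function names are hypothetical, and the sketch assumes the input contains no literal backslash sequences that already look like escapes.

```python
import re

def to_u_escapes(text: str) -> str:
    """Replace each non-ASCII character with \\uXXXX (BMP) or \\UXXXXXXXX."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)     # four hex digits for the BMP
        else:
            out.append("\\U%08X" % cp)     # eight hex digits otherwise
    return "".join(out)

# Matches the two escape shapes produced by to_u_escapes.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

def from_u_escapes(text: str) -> str:
    """Turn \\uXXXX and \\UXXXXXXXX escapes back into characters."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)

def to_html_refs(text: str) -> str:
    """Replace each non-ASCII character with an HTML hex character reference."""
    return "".join(ch if ord(ch) < 0x80 else "&#x%X;" % ord(ch) for ch in text)

if __name__ == "__main__":
    sample = "caf\u00e9"
    escaped = to_u_escapes(sample)
    print(escaped)
    print(from_u_escapes(escaped) == sample)
    print(to_html_refs(sample))
```

Both directions work on decoded strings, so a caller would read the input as UTF-8 first and encode the result as ASCII when writing it out.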
Transolution is a Computer Aided Translation (CAT) suite supporting the XLIFF standard. It provides the open source community with features and concepts that have been used by commercial offerings for years to improve translation efficiency and quality. The suite is modular to make it flexible and provides an XLIFF Editor, translation memory engine and filters to convert different formats to and from XLIFF. The use of XLIFF means that almost any content can be localized as long as there is a filter for it (XML, SGML, PO, RTF, StarOffice/OpenOffice, etc.).
SILGraphite (formerly OpenGraphite) is a project within SIL's Non-Roman Script Initiative and Language Software Development groups to provide extensible cross-platform rendering capabilities for complex non-Roman writing systems. It consists of a rule-based programming language, Graphite Description Language (GDL), that can be used to describe the behavior of a writing system, a compiler for that language, and a rendering engine that can serve as the backend of a text processing application. SILGraphite renders TrueType fonts that have been extended by means of compiling a GDL program. It is currently being integrated into Gecko/Mozilla through the SILA project; a GNU/Linux port is also underway, and there are plans for OpenOffice.org and AbiWord integration.
The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (depending on the file and on the system's locale, reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
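The general idea of recognizing an end-of-line type and an encoding can be sketched in a few lines of Python. This is a deliberately naive illustration and not Utrac's algorithm: the function names are hypothetical, the encoding guess simply tries a fixed list of candidates in order, and because every byte sequence decodes as ISO-8859-1, that candidate acts as a catch-all rather than a real detection.

```python
def detect_eol(data: bytes) -> str:
    """Return the most frequent end-of-line type: CRLF, CR, LF, or none."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # lone carriage returns
    lf = data.count(b"\n") - crlf   # lone line feeds
    counts = {"CRLF": crlf, "CR": cr, "LF": lf}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none"

def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Return the first candidate that decodes the data without error."""
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None

def convert_eol(data: bytes, target: bytes = b"\n") -> bytes:
    """Normalize every end-of-line sequence to the target sequence."""
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", target)

if __name__ == "__main__":
    raw = "caf\u00e9\r\nth\u00e9\r\n".encode("utf-8")
    print(detect_eol(raw), guess_encoding(raw))
    print(convert_eol(raw))
```

A real recognizer has to weigh byte statistics and locale hints to tell single-byte charsets such as ISO-8859-1 and CP437 apart, which this sketch does not attempt.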
OmegaT is a translation memory application intended for professional translators. It does not translate for you (software that does this is called "machine translation"). It features fuzzy matching, match propagation, simultaneous processing of multiple-file projects, simultaneous use of multiple translation memories, and external glossaries. Supported document formats include plain text, HTML, and OpenOffice.org/StarOffice. It supports Unicode (UTF-8), so it can be used with non-Latin alphabets. It is compatible with other translation memory applications (TMX Level 1).
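Fuzzy matching against a translation memory can be illustrated with a short Python sketch. It is not OmegaT's matching algorithm: it uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the function name, the in-memory dictionary, and the 0.7 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher

def fuzzy_matches(segment, memory, threshold=0.7):
    """Return (score, source, target) tuples for stored segments that
    resemble the new segment, best match first."""
    scored = []
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and falls toward 0.0
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    scored.sort(reverse=True)
    return scored

if __name__ == "__main__":
    # Hypothetical English-to-French memory of previously translated segments.
    memory = {
        "Open the file": "Ouvrir le fichier",
        "Close the window": "Fermer la fen\u00eatre",
    }
    for score, source, target in fuzzy_matches("Open the files", memory):
        print(round(score, 2), source, "->", target)
```

The translator still writes the final text; the memory only proposes earlier translations of similar segments.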
I18N is a class that gets translation texts from flat files or from an SQL database. The system supports variables in translated strings and has a conversion facility to move data from one container to another. An included tool checks programs against sets of translated strings to detect references without strings or unused strings. Each call checks that referenced variables exist.
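The two checks described above can be sketched in Python. This is not the I18N class itself: the Catalog class, its method names, and the use of str.format placeholders are all hypothetical, and the sketch assumes that "referenced variables exist" means every placeholder in a translated string must be supplied by the caller.

```python
import string

class Catalog:
    """Hypothetical in-memory store of translated strings keyed by name."""

    def __init__(self, strings):
        self.strings = dict(strings)

    def translate(self, key, **variables):
        """Look up a string and substitute variables, checking each call."""
        template = self.strings[key]
        # Collect the named placeholders used by the template.
        needed = {
            name
            for _, name, _, _ in string.Formatter().parse(template)
            if name
        }
        missing = needed - set(variables)
        if missing:
            raise KeyError("missing variables: %s" % ", ".join(sorted(missing)))
        return template.format(**variables)

def check_references(used_keys, catalog):
    """Report keys used in a program but not translated, and the reverse."""
    used = set(used_keys)
    known = set(catalog.strings)
    return {"missing": sorted(used - known), "unused": sorted(known - used)}

if __name__ == "__main__":
    catalog = Catalog({"greeting": "Hello, {name}!", "bye": "Goodbye"})
    print(catalog.translate("greeting", name="Ann"))
    print(check_references(["greeting", "title"], catalog))
```

In a real setup the dictionary would be loaded from a flat file or an SQL table, and the list of used keys would come from scanning the program's source.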
Konjugator helps with learning or interpreting verb forms in Welsh. It produces a list of around 200,000 inflected verb forms for almost 4,000 Welsh verbs, along with English glosses and parsing information. It attempts to conjugate Welsh verbs that are unknown to it, and will give parsing details for arbitrary Welsh verb forms if these are known to it.