The Heirloom Toolchest is a collection of standard Unix utilities. It is derived from original Unix material released as open source by Caldera and Sun, and contains multiple versions of each utility corresponding to SVID3/SVR4, SVID4/SVR4.2MP, POSIX.2-1992/SUSv2, POSIX.1-2001/SUSv3, and 4BSD (SVR4 /usr/ucb). The utilities process lines of arbitrary length and, in many cases, binary input data, and they support characters in UTF-8 and many East Asian encodings. The collection contains more than 100 individual utilities, including bc, cpio, diff, ed, file, find, grep, man, nawk, oawk, pax, ps, sed, sort, spell, and tar. Extensive documentation is included.
Intlize lets developers use catgets without its normally bulky syntax and without having to keep track of the correct message indices. catgets is the X/Open message catalog interface provided by the C library on most Unix-like systems. Alternatively, Intlize can produce a compact file format optimized for fast access. Intermediate files are in gettext po format, so many convenient tools are available for doing the translations. Runtime files are provided for C and C++, with support for both catgets and the native Intlize format.
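For context, the "bulky syntax" of plain catgets looks roughly like the following C sketch. It uses only the standard catopen, catgets, and catclose calls; the function name localized_hello, the set and message numbers, and the default text are hypothetical and chosen only for illustration. It does not show Intlize's own interface.

```c
#include <assert.h>
#include <nl_types.h>
#include <string.h>

/* Hypothetical set and message numbers. With plain catgets these must be
   kept in step with the catalog source by hand. */
#define SET_GREETINGS 1
#define MSG_HELLO     1

/* Copy the localized greeting into out (always NUL-terminated).
   Returns 1 if the catalog could be opened, 0 if the built-in
   default text was used instead. */
int localized_hello(const char *catalog, char *out, size_t outlen)
{
    static const char fallback[] = "Hello, world";
    const char *text = fallback;
    nl_catd cat;

    assert(catalog != NULL && out != NULL && outlen > 0);

    cat = catopen(catalog, NL_CAT_LOCALE);
    if (cat != (nl_catd)-1)
        text = catgets(cat, SET_GREETINGS, MSG_HELLO, fallback);

    /* Copy before catclose(): the returned pointer may refer to
       catalog storage that is released on close. */
    strncpy(out, text, outlen - 1);
    out[outlen - 1] = '\0';

    if (cat != (nl_catd)-1) {
        catclose(cat);
        return 1;
    }
    return 0;
}
```

Every message needs its own set number, message number, and default string at the call site, which is the bookkeeping a wrapper tool can take over.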
RusXMMS provides character set conversion for languages that can be represented in more than one character set. It originally handled XMMS playlists, but can be useful for any program that works with small pieces of text in different languages and encodings. The library features language and encoding autodetection for most European languages, extensibility for new languages and encodings, recoding/translation of multi-language playlists, on-the-fly translation between languages using online services, and a GTK/GTK2 UI library.
Traditional vi is derived from the 4BSD source and includes support for modern operating systems, 8-bit input, multi-byte character encodings like UTF-8, and features required by POSIX.2. It contains few additions beyond this, so it is of interest to those looking for a small but well-defined vi implementation close to that of most commercial Unix systems. It also has some features for coping with primitive terminals or slow connections.
The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (based on the file contents and the system's locale, and reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
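End-of-line recognition of the kind described for Utrac can be illustrated with a simple counting heuristic. The C sketch below is not Utrac's actual algorithm or API; the eol_type enum and the detect_eol function are hypothetical names. It counts CR, LF, and CRLF terminators in a buffer and reports the single style found, or flags the buffer as mixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
typedef enum { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF, EOL_MIXED } eol_type;

/* Classify the line terminators found in buf[0..len). */
eol_type detect_eol(const char *buf, size_t len)
{
    size_t lf = 0, cr = 0, crlf = 0;
    size_t i;
    int kinds;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                crlf++;
                i++;            /* consume the LF of the CRLF pair */
            } else {
                cr++;           /* lone CR */
            }
        } else if (buf[i] == '\n') {
            lf++;               /* lone LF */
        }
    }

    /* Number of distinct terminator styles seen. */
    kinds = (lf > 0) + (cr > 0) + (crlf > 0);
    if (kinds == 0)
        return EOL_NONE;
    if (kinds > 1)
        return EOL_MIXED;
    if (crlf > 0)
        return EOL_CRLF;
    return lf > 0 ? EOL_LF : EOL_CR;
}
```

A real recognizer would also have to cope with binary data and with multi-byte encodings such as UTF-16, where the CR and LF bytes are interleaved with zero bytes; this sketch assumes a single-byte or UTF-8 text buffer.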
cstrings is a lightweight internationalization tool for C code. It is useful for those who find gettext too bulky and intrusive. It extracts strings from a program, and turns them into #defines in a prepended code section. It has good features for building up your message base incrementally.