libunibreak implements the line breaking and word breaking algorithms described in Unicode Standard Annex #14 and Unicode Standard Annex #29. It is a superset of, and supersedes, liblinebreak. It is designed for use in a generic text renderer; FBReader is one real-world example.
libgxim is an X Input Method (XIM) protocol library implemented with GObject. It helps you implement XIM servers or client applications that communicate through the XIM protocol without using the Xlib API directly, which is particularly useful if your application uses a GObject-based main loop.
Intlize lets the developer use catgets without its normally bulky syntax and without having to keep track of the correct indices. catgets is the message catalog interface specified by X/Open and POSIX and provided by most C libraries. Alternatively, Intlize can produce a compact file format optimized for fast access. Intermediate files are in gettext PO format, so many convenient tools are available for doing the translations. Runtime files are provided for C and C++, with support for both catgets and the native Intlize format.
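The index bookkeeping mentioned in the entry above can be illustrated with a small Python sketch. It only mimics the shape of the C call `catgets(catd, set_id, msg_id, default)`. The helpers `build_index` and `make_translator` are invented for this illustration and are not part of Intlize, whose actual interface is not described here.

```python
def catgets(catalog, set_id, msg_id, default):
    """Mimic the shape of the C catgets() call: look up a message by numeric ids."""
    return catalog.get((set_id, msg_id), default)


def build_index(source_strings, set_id=1):
    """Hypothetical helper: number each source string in order of appearance."""
    return {text: (set_id, n) for n, text in enumerate(source_strings, start=1)}


def make_translator(index, catalog):
    """Hypothetical helper: translate by source string so callers never see ids."""
    def translate(text):
        ids = index.get(text)
        if ids is None:
            # Unknown strings fall back to the untranslated text.
            return text
        return catgets(catalog, ids[0], ids[1], text)
    return translate


index = build_index(["Hello", "Goodbye"])
french = {(1, 1): "Bonjour"}  # only one message has been translated so far
tr = make_translator(index, french)
print(tr("Hello"), tr("Goodbye"))
```

With raw catgets, every call site has to carry the right set and message numbers. A wrapper of this kind keeps the numbering in one generated table instead.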
libmime is a MIME parser in the same vein as Expat, the stream-oriented XML parser. As input is fed to the parser, events are generated which an application can catch by registering event handlers. Such events include the Unix From_ line, start of entity, end of entity, entity boundary, header, end of headers, and body. libmime supports MIME message editing through a delta mechanism. Edit contexts are instantiated and changes applied to specific contexts. Edit contexts can then be expressed in standard unified diff format which, when applied to the input source stream, will result in the new message.
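The push-style, event-driven model described in the entry above can be sketched in a few lines of Python. This is a toy illustration, not libmime's API: the class `EventParser`, its methods, and the event names are assumptions made for the example. It handles only unfolded headers and a single body, with no multipart boundaries or From_ lines.

```python
class EventParser:
    """Toy push parser: feed() text chunks, receive one callback per event."""

    def __init__(self):
        self._handlers = {}
        self._buffer = ""
        self._in_body = False

    def on(self, event, handler):
        """Register a handler for 'header', 'end_of_headers', or 'body'."""
        self._handlers[event] = handler

    def _emit(self, event, *args):
        handler = self._handlers.get(event)
        if handler is not None:
            handler(*args)

    def feed(self, chunk):
        """Accept the next piece of input; chunks may split lines anywhere."""
        self._buffer += chunk
        # Process only complete lines and keep any partial line for later.
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._handle_line(line.rstrip("\r"))

    def close(self):
        """Flush a final line that was not terminated by a newline."""
        if self._buffer:
            self._handle_line(self._buffer)
            self._buffer = ""

    def _handle_line(self, line):
        if self._in_body:
            self._emit("body", line)
        elif line == "":
            # The first empty line separates headers from the body.
            self._in_body = True
            self._emit("end_of_headers")
        else:
            name, _, value = line.partition(":")
            self._emit("header", name.strip(), value.strip())


events = []
parser = EventParser()
parser.on("header", lambda name, value: events.append(("header", name, value)))
parser.on("end_of_headers", lambda: events.append(("end_of_headers",)))
parser.on("body", lambda line: events.append(("body", line)))
parser.feed("Subject: Hi\r\nFr")  # input may arrive split mid-line
parser.feed("om: a@example.org\r\n\r\nHello\r\n")
parser.close()
```

Because the parser keeps only the current partial line, the application decides what to retain. This is the same trade-off that stream-oriented XML parsers such as Expat make.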
RusXMMS provides character set conversion for languages which can be represented with more than one character set. It originally handled XMMS playlists, but can be useful for any program that works with small pieces of text in different languages and encodings. The library features language and encoding autodetection for most European languages, extensibility regarding new languages and encodings, recoding/translation of multi-language playlists, on-the-fly translation between languages using online services, and a GTK/GTK2 UI library.
The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (which depends on the file and on the system's locale and is reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
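The two recognition tasks in the entry above, end-of-line type and encoding, can be sketched in Python as follows. This is a deliberately naive illustration and not Utrac's algorithm: the function names are invented, and the encoding guess simply takes the first candidate that decodes without error. Because of that, it cannot tell single-byte charsets such as ISO-8859-1 and CP437 apart, which would require statistical or locale-based hints.

```python
def detect_eol(data):
    """Return 'CRLF', 'CR', or 'LF' for the most frequent line ending, else None."""
    crlf = data.count(b"\r\n")
    counts = {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,  # CR not followed by LF
        "LF": data.count(b"\n") - crlf,  # LF not preceded by CR
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def guess_encoding(data, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Return the first candidate that decodes the bytes without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def convert(data, source, target="utf-8", eol="\n"):
    """Re-encode bytes from source to target and normalize line endings."""
    text = data.decode(source)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", eol).encode(target)


sample = b"caf\xe9\r\nbar\r\n"
print(detect_eol(sample), guess_encoding(sample))
```

Trying strict encodings first matters here: ISO-8859-1 accepts every byte sequence, so it has to be the last candidate.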
uni2ascii and ascii2uni provide conversion in both directions between UTF-8 Unicode and more than thirty 7-bit ASCII equivalents, including RFC 2396 URI format and RFC 2045 Quoted Printable format, the representations used in HTML, SGML, XML, OOXML, the Unicode standard, Rich Text Format, POSIX portable charmaps, POSIX locale specifications, and Apache log files. They can also convert between the escapes used for Unicode in languages such as Ada, C, Common Lisp, Java, Pascal, Perl, PostScript, Python, Scheme, and Tcl.
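To make the idea of a 7-bit ASCII equivalent concrete, the following Python sketch produces four such representations with the standard library alone. It does not invoke uni2ascii or mirror its options; the function name `ascii_forms` and the dictionary keys are invented for the illustration.

```python
import quopri
from urllib.parse import quote


def ascii_forms(text):
    """Return several 7-bit ASCII representations of a Unicode string."""
    return {
        # HTML/XML decimal numeric character references, such as &#233;
        "html_decimal": text.encode("ascii", "xmlcharrefreplace").decode("ascii"),
        # Python-style backslash escapes (\xNN, \uNNNN, or \UNNNNNNNN)
        "backslash_escape": text.encode("ascii", "backslashreplace").decode("ascii"),
        # Percent-encoding of the UTF-8 bytes, as used in URIs
        "uri_percent": quote(text, safe=""),
        # RFC 2045 quoted-printable encoding of the UTF-8 bytes
        "quoted_printable": quopri.encodestring(text.encode("utf-8")).decode("ascii"),
    }


for name, value in ascii_forms("caf\u00e9").items():
    print(name, value)
```

Each form is reversible, which is what makes a paired tool such as ascii2uni possible: the escaped text carries enough information to recover the original code points.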