contraccions validates word contractions in the Catalan language. The contraction rules of Catalan spelling are complex, and the function provided by this package lets a spellchecker warn about incorrect expressions and suggest corrections. It returns a code indicating the type of error and its location in the expression, and allows a spellchecker to perform incremental validation.
The libmba package is a collection of mostly independent C modules potentially useful to any project. There are the usual ADTs including a linkedlist, hashmap, pool, stack, and varray, a flexible memory allocator, CSV parser, path canonicalization routine, I18N text abstraction, configuration file module, portable semaphores, condition variables, and more. The code is designed so that individual modules can be integrated into existing codebases rather than requiring the user to commit to the entire library. The code has no typedefs, few comments, and extensive man pages and HTML documentation.
Intlize allows the developer to use catgets without its normally bulky syntax and without having to keep track of the correct indices. Catgets is the X/Open message catalog interface provided by the C library on most Unix-like systems. Alternatively, Intlize can produce a compact file format optimized for fast access. Intermediate files are in gettext po format, so many convenient tools are available for doing the translations. Runtime files are provided for C and C++, with support for both catgets and the native Intlize format.
libuninum is a library for converting Unicode strings to integers and integers to Unicode strings. Internal computation is done using arbitrary precision arithmetic, so there is no limit on the size of the integer that can be converted. Values are passed and returned as ASCII decimal strings, GNU MP mpz_t objects, or unsigned long integers. Auto-detection of the number system is provided. Many number systems are supported. Group delimitation for output strings is fully controllable. Command line and graphical interfaces are also provided.
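The idea of converting between Unicode digit strings and integers can be illustrated with a minimal Python sketch. It does not use libuninum or mirror its API. The function names `unicode_digits_to_int` and `int_to_unicode_digits` are hypothetical, and the sketch covers only positional decimal scripts (those whose digits carry a Unicode decimal value), relying on Python's built-in arbitrary precision integers.

```python
import unicodedata


def unicode_digits_to_int(text: str) -> int:
    """Convert a positional decimal string in any Unicode script to an int.

    Hypothetical helper for illustration; not part of libuninum.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        # unicodedata.decimal raises ValueError for non-decimal characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


def int_to_unicode_digits(value: int, zero: str = "0",
                          group_sep: str = "", group_size: int = 3) -> str:
    """Render a non-negative int with the digits of the script whose zero
    digit is `zero`, optionally inserting `group_sep` between groups.

    Hypothetical helper for illustration; not part of libuninum.
    """
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if unicodedata.decimal(zero, None) != 0:
        raise ValueError("`zero` must be a decimal digit zero")
    # Decimal digits 0-9 of a script occupy consecutive code points.
    digits = "".join(chr(ord(zero) + int(d)) for d in str(value))
    if not group_sep:
        return digits
    groups = []
    while digits:
        # Peel groups off the right-hand end.
        groups.append(digits[-group_size:])
        digits = digits[:-group_size]
    return group_sep.join(reversed(groups))
```

For example, `unicode_digits_to_int("\u0967\u0968\u0969")` parses the Devanagari digits one, two, three, and `int_to_unicode_digits(1234567, group_sep=",")` groups the ASCII output in threes.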
GTK2 Text Editor is a simple Unicode text editor that supports many encodings. It works as a nice Notepad replacement, and can also be used as an encoding converter. It supports multi-level undo, right to left text (as in Hebrew), and other Unicode features. It can auto-close an XML/HTML tag. There is no syntax highlighting yet.
pg_collkey is a wrapper to use the collation functions of the ICU library with a PostgreSQL database server. Using this wrapper, you can specify the desired locale for sorting UTF-8 strings directly in the SQL query, rather than setting it during database installation. Default Unicode collation (DUCET) is supported. You can select whether punctuation should be a primary collation attribute or not. The level of comparison can be limited (in order to ignore accents, for example). Numeric sequences within strings can be recognized, so that 'test2' sorts before 'test10'. This library depends on ICU.
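The numeric ordering described above ('test2' before 'test10') can be illustrated with a small Python sketch. It is not ICU and not the pg_collkey SQL interface. The helper name `numeric_sort_key` is hypothetical, and the sketch handles only runs of ASCII digits, with no locale-specific collation rules.

```python
import re

# The capturing group makes re.split keep the digit runs in the result.
_DIGIT_RUN = re.compile(r"([0-9]+)")


def numeric_sort_key(s: str) -> list:
    """Build a sort key that compares digit runs by numeric value.

    Hypothetical helper for illustration; not ICU collation.
    """
    parts = _DIGIT_RUN.split(s)
    # Even indices hold text (possibly empty), odd indices hold digit runs,
    # so keys from different strings always compare str with str and
    # int with int at the same position.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


names = ["test10", "test2", "test1"]
plain = sorted(names)                          # code point order
numeric = sorted(names, key=numeric_sort_key)  # digit runs compared as numbers
```

Here `plain` places 'test10' before 'test2' because '1' sorts before '2' character by character, while `numeric` compares 2 with 10 as integers.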
RusXMMS provides character set conversion for languages which can be represented with more than one character set. It originally handled XMMS playlists, but can be useful for any program that works with small pieces of text in different languages and encodings. The library features language and encoding autodetection for most European languages, extensibility regarding new languages and encodings, recoding/translation of multi-language playlists, on-the-fly translation between languages using online services, and a GTK/GTK2 UI library.