ICU provides a Unicode implementation, with functions for formatting numbers, dates, times, and currencies according to locale conventions, for transliteration, and for parsing text in those formats. It provides flexible patterns for formatting messages, where the pattern determines the order of the variable parts of the message and the format of each variable. These patterns can be stored in resource files for translation into different languages. Included are more than 100 codepage converters for interaction with non-Unicode systems.
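The idea of a pattern that controls the order of a message's variable parts can be illustrated with a minimal sketch. It does not use ICU's API: the `{0}`/`{1}` placeholders are Python's own `str.format` syntax, and the `PATTERNS` table and `format_message` helper are hypothetical names standing in for a translated resource file and a formatting call.

```python
# Hypothetical message catalog: one pattern per language, as it might be
# stored in a resource file. The placeholders are Python str.format syntax,
# used here only as a stand-in for a message-pattern syntax.
PATTERNS = {
    "en": "{0} copied {1} files.",
    "de": "{1} Dateien wurden von {0} kopiert.",
}


def format_message(lang, *args):
    """Look up the pattern for `lang` and fill in the arguments by position."""
    return PATTERNS[lang].format(*args)


# The calling code passes the arguments in one fixed order; each
# translation decides where they appear in the sentence.
print(format_message("en", "Alice", 3))
print(format_message("de", "Alice", 3))
```

Because the order lives in the pattern rather than in the code, a translator can rearrange the sentence without any change to the program.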
GNU Source-highlight produces a document with syntax highlighting when given a source file. As input it handles many languages (e.g., Java, C/C++, Prolog, Perl, PHP3, Python, Flex, and HTML) and other formats (e.g., ChangeLog and log files); as output it produces HTML, XHTML, DocBook, ANSI color escapes, LaTeX, and Texinfo. Input and output formats can be specified with a regular expression-oriented syntax.
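As a rough illustration of regular-expression-driven highlighting with ANSI color escapes as the output format, here is a minimal sketch. It is not GNU Source-highlight's definition syntax or implementation; the `RULES` table, the token names, and the `highlight` function are assumptions made up for this example.

```python
import re

# Hypothetical rules: (token name, regular expression, ANSI color code).
# Earlier rules win when several could match at the same position.
RULES = [
    ("comment", r'#[^\n]*', "32"),                    # green
    ("string", r'"[^"\n]*"', "33"),                   # yellow
    ("keyword", r'\b(?:def|return|if|else)\b', "34"),  # blue
]

# Combine the rules into one pattern with a named group per rule.
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx, _ in RULES))
COLORS = {name: color for name, _, color in RULES}


def highlight(source):
    """Wrap every matched token in an ANSI color escape sequence."""
    def colorize(match):
        color = COLORS[match.lastgroup]
        return f"\033[{color}m{match.group(0)}\033[0m"
    return TOKEN_RE.sub(colorize, source)


print(highlight('if x: return "done"  # finished'))
```

Swapping the escape sequences for HTML tags would give a different output format from the same rules, which is the separation between input and output definitions that the description refers to.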
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.
ClearParse is a flexible engine that can be used for any parsing task, including interpreting or compiling programming languages, analyzing or converting data files, processing command-line parameters and user input, implementing markup languages and scripts, natural language processing (NLP), and more.
The SODA Native XML Database System is a native XML database that provides efficient management of large amounts of XML data. It is based on a multi-user, client-server architecture with a generic query processing layer that can easily support different query languages. In this lightweight version, user-defined indexes and query optimizations have been removed; however, full transaction support (commits and rollbacks) and crash recovery are available.
Passepartout is a GTK-based desktop publishing application. It features layout templates, an XML-based typesetting engine called xml2ps, user-defined text formatting with XSLT stylesheets, support for importing EPS (Encapsulated PostScript) files and almost any raster image, text flow around image (or text) frames, and printing to PostScript and EPS.
Colorer Library provides source text syntax highlighting and text parsing services for host applications. It colorizes source code on host editor systems in more than 100 formats. It uses the powerful HRC format (XML, regexp, context-free grammars), allowing it to support any language. The parser can search and build lists of special text tokens (function lists, syntax errors) and search and indent programming language constructions (brackets, paired tags).
CodeWorker is a versatile parsing tool and a universal source code generator. It interprets a scripting language for producing reusable, tailor-made, evolving, and reliable IT systems with a high level of automation. The file formats to parse are described in an extended-BNF syntax. Template-based scripts drive the writing of patterns for generating code or text. The code generator preserves protected areas containing hand-typed code and provides code expansion, source-to-source translation, and program transformation. It can also translate CodeWorker scripts natively into C++.
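The notion of preserving protected areas across regeneration can be sketched as follows. This is an illustration of the general technique, not CodeWorker's scripting language or marker syntax: the `// BEGIN-PROTECTED name` and `// END-PROTECTED name` markers and the `regenerate` function are hypothetical.

```python
import re

# Hypothetical marker syntax: a named region between BEGIN and END lines.
PROTECTED_RE = re.compile(
    r"// BEGIN-PROTECTED (\w+)\n(.*?)// END-PROTECTED \1\n", re.DOTALL
)


def extract_protected(old_output):
    """Collect the hand-typed body of every protected area, keyed by name."""
    return {m.group(1): m.group(2) for m in PROTECTED_RE.finditer(old_output)}


def regenerate(fresh_output, old_output):
    """Re-insert hand-typed code from old_output into freshly generated text."""
    saved = extract_protected(old_output)

    def restore(match):
        name = match.group(1)
        # Keep the generator's default body if no hand-typed code exists.
        body = saved.get(name, match.group(2))
        return f"// BEGIN-PROTECTED {name}\n{body}// END-PROTECTED {name}\n"

    return PROTECTED_RE.sub(restore, fresh_output)


old = "void f() {\n// BEGIN-PROTECTED body\n  custom();\n// END-PROTECTED body\n}\n"
fresh = "void f(int x) {\n// BEGIN-PROTECTED body\n// END-PROTECTED body\n}\n"
print(regenerate(fresh, old))
```

The generated signature changes from one run to the next, while the code typed by hand inside the protected area is carried over unchanged.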