DOMC is a lightweight ANSI C implementation of the Document Object Model (DOM) as specified in the W3C DOM Core Level 1 recommendation. When coupled with the Expat XML Parser Toolkit, DOMC can load, store, build, and directly manipulate XML documents represented as a tree in memory.
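To illustrate what loading, building, and manipulating a DOM tree looks like, here is a minimal sketch using Python's standard xml.dom.minidom module (which also parses with Expat). It is not DOMC's C API; the sample document and element names are made up for the example, and only the generic DOM Level 1 method names (createElement, setAttribute, appendChild, getElementsByTagName) are shown.

```python
from xml.dom.minidom import parseString

# Load: parse a small, made-up XML document into an in-memory tree.
doc = parseString("<config><host name='a'/></config>")
root = doc.documentElement

# Build: create a new element and attach it to the tree.
new_host = doc.createElement("host")
new_host.setAttribute("name", "b")
root.appendChild(new_host)

# Manipulate/query: walk the tree and collect attribute values.
names = [h.getAttribute("name") for h in root.getElementsByTagName("host")]
print(names)
```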
Boustrophedon Text Reader displays text files in boustrophedon, a writing style used by the ancient Greeks in which the direction of writing alternates from line to line. Once accustomed to the style, a person may gain significant speed in reading and writing. It is of particular interest to those seeking ambidexterity and brain-hemisphere equivalence.
The config_cfe Perl module contains functions that ease updating small text files, typically configuration files. It makes it easy to write small scripts that edit files in 'batch mode', which is necessary when updating many hosts at the same time. It also includes a simple program, update-cfe, that can be used in shell scripts; update-cfe alone is sometimes enough to update a section in a file.
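The alternating-direction layout can be sketched in a few lines of Python. This is only an illustration of the idea, not the reader's actual code; the function name boustrophedon is hypothetical, and the sketch assumes that every second line is reversed character by character and padded so it stays aligned to the right edge of the widest line.

```python
def boustrophedon(text: str) -> str:
    """Reverse every second line so the reading direction alternates."""
    lines = text.splitlines()
    # Width of the widest line; used to keep reversed lines right-aligned.
    width = max((len(line) for line in lines), default=0)
    out = []
    for i, line in enumerate(lines):
        if i % 2 == 1:
            # Pad first, then reverse, so the line starts at the right edge.
            out.append(line.ljust(width)[::-1])
        else:
            out.append(line)
    return "\n".join(out)


print(boustrophedon("first line\nsecond line\nthird line"))
```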
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.
Youhp3 (Youpee's One Unlimited HTML PreProcessor) is an HTML preprocessor that lets you embed code in any scripting language and call any external program to generate text files. It is specifically designed to work with HTML/XML documents and provides traditional preprocessor features such as define, include, macro, conditional tests, and loop.
ClearParse is a flexible engine that can be used for any parsing task including interpreting or compiling programming languages, analyzing or converting data files, processing command line parameters and user input, implementing markup languages and scripts, natural language processing (NLP), and more.
The libmba package is a collection of mostly independent C modules potentially useful to any project. It includes the usual ADTs (linkedlist, hashmap, pool, stack, and varray) as well as a flexible memory allocator, a CSV parser, a path canonicalization routine, an I18N text abstraction, a configuration file module, portable semaphores, condition variables, and more. The code is designed so that individual modules can be integrated into existing codebases rather than requiring the user to commit to the entire library. The code has no typedefs and few comments, but comes with extensive man pages and HTML documentation.
glark offers grep-like searching of text files, with very powerful, complex regular expressions (e.g., "/foo\w+/ and /bar[^\d]*baz$/ within 4 lines of each other"). It also highlights matches, displays context (preceding and succeeding lines), performs case-insensitive matching, and automatically excludes non-text files. It supports most options from the GNU version of grep.
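The "two patterns within N lines of each other" idea can be sketched in Python as follows. This is not glark's implementation or its command-line syntax; the function name within is hypothetical, and the sketch assumes that "within N lines" means the absolute difference between the two matching line numbers is at most N.

```python
import re


def within(lines, first, second, distance):
    """Return (i, j) pairs of 0-based line indices where `first` matches
    line i, `second` matches line j, and the lines are at most
    `distance` lines apart."""
    a = re.compile(first)
    b = re.compile(second)
    hits_a = [i for i, line in enumerate(lines) if a.search(line)]
    hits_b = [j for j, line in enumerate(lines) if b.search(line)]
    return [(i, j) for i in hits_a for j in hits_b if abs(i - j) <= distance]


sample = ["foobar", "x", "bar baz", "y", "y", "y", "y", "y", "bar12baz"]
print(within(sample, r"foo\w+", r"bar[^\d]*baz$", 4))
```

Comparing every pair of hits is quadratic in the number of matches, which is acceptable for a sketch; a real tool would more likely scan the file once with a sliding window.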