XMLDB uses an RDBMS to persist arbitrary XML documents. Due to its storage mechanism, searching for and retrieving documents is extremely quick. You can also perform XSL transformations on documents with surprising speed. The library can be used in any program to store libxml2 documents. A PHP module is also included, making XMLDB a complete three-tier Web application development suite.
glark offers grep-like searching of text files, with very powerful, complex regular expressions (e.g., "/foo\w+/ and /bar[^\d]*baz$/ within 4 lines of each other"). It also highlights the matches, displays context (preceding and succeeding lines), performs case-insensitive matching, and automatically excludes non-text files. It supports most options from the GNU version of grep.
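To illustrate what a "within N lines of each other" match means, here is a minimal Python sketch of the idea. It is not glark's implementation or syntax: the function name within_lines, the sample lines, and the exact distance semantics (absolute difference of line indices) are assumptions made for this example.

```python
import re

def within_lines(lines, pat_a, pat_b, distance):
    """Return (i, j) pairs of 0-based line indices where pat_a matches
    line i, pat_b matches line j, and the lines are at most `distance`
    apart. Hypothetical helper, not part of glark."""
    a = re.compile(pat_a)
    b = re.compile(pat_b)
    hits_a = [i for i, line in enumerate(lines) if a.search(line)]
    hits_b = [j for j, line in enumerate(lines) if b.search(line)]
    return [(i, j) for i in hits_a for j in hits_b if abs(i - j) <= distance]

# Made-up sample input for illustration only.
sample = [
    "call foobar here",   # matches /foo\w+/
    "nothing",
    "bar then baz",       # matches /bar[^\d]*baz$/
    "filler",
    "filler",
    "filler",
    "filler",
    "filler",
    "bar 12 baz",         # no match: digits sit between bar and baz
]

print(within_lines(sample, r"foo\w+", r"bar[^\d]*baz$", 4))
```

With the sample above, only the first and third lines pair up, since they are two lines apart; shrinking the distance to 1 yields no pairs.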
The Multivalent PDF Tools is a suite of tools for manipulating PDF documents. It includes tools for compressing, uncompressing (for hand editing), obtaining metadata, splitting and merging, encrypting and decrypting, validating, performing imposition (aka n-up), making page images, extracting text, and full-text indexing (with Lucene). The compress tool shrinks the PDF 1.5 Reference from 13.5MB to 8MB in PDF 1.5/Acrobat 6 format and down to 5.1MB in a new proposed "Compact" format.
The Aardvark Shell Utils is a collection of three utilities designed to aid the user when working with shell scripts or from the command line. All three accept input on the command line or from standard input, and thus they can be piped with other commands. All commands come with their own man page. Included are realpath, filebase, and fileext.
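The blurb names the three utilities without describing their output. As a rough illustration, the Python sketch below shows what tools with these names would plausibly compute. The behavior is assumed, not taken from the Aardvark man pages: here realpath resolves a path to an absolute one, filebase returns the file name without directory or final extension, and fileext returns the final extension without the dot.

```python
import os.path

def realpath(path):
    # Assumed behavior: resolve symlinks and relative components
    # into an absolute path.
    return os.path.realpath(path)

def filebase(path):
    # Assumed behavior: strip the directory and the final extension.
    return os.path.splitext(os.path.basename(path))[0]

def fileext(path):
    # Assumed behavior: return the final extension without the dot.
    return os.path.splitext(path)[1].lstrip(".")

if __name__ == "__main__":
    example = "/tmp/report.txt"  # made-up path for illustration
    print(filebase(example))
    print(fileext(example))
    print(realpath("."))
```

The actual utilities may treat multi-part extensions such as .tar.gz or dotfiles differently; this sketch only splits off the last extension.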
PyBison is a sophisticated yet easy-to-use parser creation toolkit for Python that interfaces directly to Bison (yacc)-based parsers. It provides full LALR(1) grammar support, covering everything from simple parsing tasks to writing compilers for high-level languages. Parser code is automatically generated from rules within user-created Parser classes (written in Python), and then compiled, yacc'ed, and linked into a shared library, which is loaded into the running process. All this happens automatically. When the parser runs, it connects directly with the yyparse() routine and receives event callbacks whenever parse targets are reached.