Babeldoc is a framework and set of applications to process documents for business-to-business and other Internet/integration applications. It is primarily intended for text documents, especially XML, but supports a wide range of operations and data types. It has a sophisticated journaling system that supports replaying and reprocessing. Babeldoc is pipeline based and supports numerous ways to combine the pipeline stages in a dynamically reconfigurable fashion. It has a GUI and a Web-based console for document processing and monitoring, and comes with tools for the transformation of flat-file data to XML, archival, and cryptography. Additionally, it can scan various data sources based on sophisticated constraints.
DOMC is a lightweight implementation of the Document Object Model (DOM) in ANSI C as specified in the W3C DOM Core Level 1 recommendation. When coupled with the Expat XML Parser Toolkit, DOMC can load, store, build, and directly manipulate XML documents represented as a tree in memory.
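To give a feel for the build-and-manipulate style of DOM Core Level 1, here is a minimal sketch using Python's standard `xml.dom.minidom` rather than DOMC itself. It does not show DOMC's C API; the function name `add_entry` and the element and attribute names are assumptions made up for this illustration.

```python
from xml.dom import minidom


def add_entry(xml_text, name):
    """Parse a document into a tree, append one element, and serialize it."""
    doc = minidom.parseString(xml_text)      # load: text -> in-memory tree
    root = doc.documentElement

    # Build a new node with the DOM Core factory methods.
    entry = doc.createElement("entry")
    entry.setAttribute("name", name)
    entry.appendChild(doc.createTextNode("added in memory"))

    # Manipulate the tree directly, then store it back as text.
    root.appendChild(entry)
    return root.toxml()


if __name__ == "__main__":
    print(add_entry("<list/>", "x"))
```

The same four steps (parse, create, append, serialize) are what a DOM implementation in any language exposes; only the spelling of the calls differs.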
GXPARSE is not a new XML parser, but an additional processing layer that makes it much easier to use event-based parsers such as SAX. It supports both direct sequential output and random access output (via the Resequencer interface). The random access mode delays output until all input has been processed, but makes it much easier to handle ID/IDREF attributes. GXPARSE retains most advantages of an event-based parser. Application development and maintenance are considerably easier, but processing is a little slower.
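The trade-off between sequential and delayed output can be sketched with Python's standard `xml.sax` module. This is not GXPARSE or its Resequencer interface; the class `DeferredLinkHandler`, the function `resolve_links`, and the `item`, `id`, `ref`, and `title` names are assumptions invented for the example. The handler buffers references while events stream in and only produces output after parsing ends, so a forward reference to an ID that appears later in the document can still be resolved.

```python
import xml.sax


class DeferredLinkHandler(xml.sax.ContentHandler):
    """Buffers references during parsing; output is produced afterwards."""

    def __init__(self):
        super().__init__()
        self.titles_by_id = {}  # id -> title, filled as events arrive
        self.pending = []       # (source id, referenced id) in document order

    def startElement(self, name, attrs):
        if name != "item":
            return
        item_id = attrs.get("id")
        self.titles_by_id[item_id] = attrs.get("title", "")
        ref = attrs.get("ref")
        if ref is not None:
            # The target may not have been seen yet, so only record the link.
            self.pending.append((item_id, ref))

    def resolved(self):
        """Resolve every buffered reference now that all IDs are known."""
        return [
            f"{self.titles_by_id[src]} -> {self.titles_by_id.get(ref, '?')}"
            for src, ref in self.pending
        ]


def resolve_links(xml_text):
    handler = DeferredLinkHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.resolved()


SAMPLE = """<items>
  <item id="a" title="First" ref="b"/>
  <item id="b" title="Second"/>
</items>"""

if __name__ == "__main__":
    for line in resolve_links(SAMPLE):
        print(line)
```

A purely sequential handler would have to emit the line for item `a` before it had seen item `b`; delaying output costs memory and latency but removes that ordering problem.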
OOoPy is a Python library for modifying OpenOffice.org documents. It provides a set of transformations on the OOo XML format using the ElementTree XML Library. Transformations included are a mail merge application and the concatenation of documents with formatting intact. The framework supports easy creation of new transformations.
The SODA Native XML Database System is a native XML database that provides efficient management of large amounts of XML data. It is based on a multi-user, client-server architecture with a generic query processing layer that can easily support different query languages. In this lightweight version, user-defined indexes and query optimizations have been removed; however, full transaction support (commits and rollbacks) and crash recovery are available.
SiSU (Structured information, Serialized Units) is a text structuring and publishing framework based on lightweight markup, with support for granular search. From a plaintext file with minimal markup, it produces plain text, HTML, XHTML, XML, ODF, LaTeX, and PDF, and populates an SQL database at an object/paragraph level for granular searches. Prepare documents using your text editor of choice, then use SiSU to generate the desired output formats. SiSU is controlled from the command line.