Sablotron is an XML toolkit that implements XSLT, DOM, and XPath. It is written in C++ and can be used from C, Perl, Python, PHP, ObjectPascal, and a command line interface. It supports the XSLT 1.0, XPath 1.0, and DOM Level 2 W3C specifications. It is designed to be as compact and portable as possible, and is maintained as an Open Source project by Ginger Alliance.
XIST is an extensible HTML and XML generator. It is also an XML parser with a very simple and Python-esque tree API. Every XML element type corresponds to a Python class, and these Python classes provide a conversion method to transform the XML tree (e.g. into HTML). XIST can be considered 'object-oriented XSLT'. XIST also includes a cross-platform templating language, Oracle utilities, and various other tools.
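The idea of one Python class per element type, each with its own conversion method, can be sketched in plain Python. This is only an illustration of the pattern and does not use XIST's actual API; the names `Element`, `Para`, `Note`, `convert`, and `render_example` are hypothetical.

```python
from html import escape


class Element:
    """Base class: one Python class per XML element type (hypothetical, not XIST's API)."""

    def __init__(self, *content):
        self.content = content

    def convert_children(self):
        # Convert child elements recursively; escape plain text for HTML output.
        return "".join(
            child.convert() if isinstance(child, Element) else escape(str(child))
            for child in self.content
        )

    def convert(self):
        raise NotImplementedError


class Para(Element):
    # A <para> element becomes an HTML paragraph.
    def convert(self):
        return f"<p>{self.convert_children()}</p>"


class Note(Element):
    # A <note> element becomes a styled HTML div.
    def convert(self):
        return f'<div class="note">{self.convert_children()}</div>'


def render_example():
    # Build a small tree and transform it to HTML by calling convert() on the root.
    tree = Note(Para("Use < and > with care."), Para("Second paragraph."))
    return tree.convert()
```

Because the transformation lives in ordinary methods, a subclass can override `convert` to change the output for one element type, which is roughly what makes the approach feel like an object-oriented counterpart to XSLT templates.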
NetCrawler is the frontend to a Web crawling system. This command line application downloads all of the pages within a domain, then parses and processes all of the relevant content (images, text, audio, video), saving it in an XML document for later processing. It is definitely alpha quality, but has been used quite extensively.
XPath Methods allows XPath queries on ParsedXML XML documents (and possibly other DOM implementations) in Zope. XPath is a relatively simple but still quite powerful query language used to address portions of XML documents. When you call an XPath Method, you retrieve a set of DOM nodes, which you can then display in a Web page using DTML or ZPT, or operate on with, for instance, Python scripts.
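To show what "an XPath query returns a set of nodes" looks like in practice, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. It does not use Zope, ParsedXML, or the XPath Methods product; the sample document and the `select_titles` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
CATALOG = """<catalog>
  <book lang="en"><title>XML Basics</title></book>
  <book lang="de"><title>XSLT Praxis</title></book>
  <book lang="en"><title>XPath in Depth</title></book>
</catalog>"""


def select_titles(xml_text, lang):
    """Return the titles of all books whose lang attribute matches."""
    root = ET.fromstring(xml_text)
    # The path expression addresses a portion of the document:
    # every <title> under a <book> with the requested lang attribute.
    nodes = root.findall(f"./book[@lang='{lang}']/title")
    # The result is a list of element nodes that a script can work on.
    return [node.text for node in nodes]
```

Calling `select_titles(CATALOG, "en")` returns the two English titles in document order.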
xpath2rss is an XPath-to-RSS scraper. XPath makes a better HTML scraper than regular expressions (the typical solution) because it understands the structure of the document rather than treating it as one big string. As a result, xpath2rss is a more reliable scraper, and much easier to use once you get the hang of XPath.
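The structural advantage can be sketched as follows. This is not xpath2rss's code or configuration format; it is a minimal illustration with Python's standard library, assuming a well-formed XHTML page (real-world HTML would usually need a more tolerant parser). The page content and the `page_to_rss` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page with a navigation link and two news links.
PAGE = """<html><body>
  <div class="nav"><a href="/about">About</a></div>
  <div class="news">
    <a href="/story/1">First story</a>
    <a href="/story/2">Second story</a>
  </div>
</body></html>"""


def page_to_rss(page, channel_title):
    """Turn the links inside the news block into RSS items."""
    root = ET.fromstring(page)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    # The path selects links by their position in the document structure,
    # so the navigation link is skipped without any string matching.
    for link in root.findall(".//div[@class='news']/a"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = link.text
        ET.SubElement(item, "link").text = link.get("href")
    return ET.tostring(rss, encoding="unicode")
```

A regular expression that matches every `<a href=...>` would also pick up the "About" link; the path expression restricts the match to links inside the news block.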
Silva is a CMS for organizations that manage multiple or complex Web sites. Content is stored in clean XML, independent of layout and presentation. Features include versioning, a workflow system, an integral visual editor, content reuse, sophisticated access control, multi-site management, extensive import/export facilities, fine-grained templating, and hi-res image storage and manipulation. Silva is built on top of the Zope Web application platform.
EZ Reusable Objects (EZRO) is a Web application that non-technical staff can use to manage content as "objects." Content objects containing text, video, and audio can be shared, modified, and re-styled to appear in a traditional Web site, an on-line course, an innovative "Coach," or a community-of-interest site. It is highly scalable and can be used for public Web sites, secure environments, and private intranets and extranets.