Libxml2 is the XML C library developed for the GNOME project. The library code is portable (to Linux, Unix, Windows, embedded systems, etc.) and modular; most of the extensions can be compiled out. Libxml2 implements a number of existing standards related to markup languages, including the XML standard, Namespaces in XML, XML Base, Relax NG, RFC 2396, XPath, XPointer, HTML4, XInclude, SGML Catalogs, and XML Catalogs. In most cases, libxml2 tries to implement the specifications in a relatively strict way. It also provides partial support for the following, without claiming to implement them fully: DOM, FTP client, HTTP client, and SAX2. Support for W3C XML Schemas is in progress. It includes xmllint, a command line XML validator.
Sablotron is an XML toolkit which implements XSLT, DOM, and XPath. Sablotron is written in C++, and it can be used from C, Perl, Python, PHP, ObjectPascal, and via a command line interface. It supports the XSLT 1.0, XPath 1.0, and DOM Level 2 W3C specifications. It is designed to be as compact and portable as possible, and is maintained as an Open Source project by Ginger Alliance.
Libxslt is a C library for GNOME which allows developers to work with XSLT. It is based on libxml2 for XML parsing, tree manipulation, and XPath support. Also included is 'xsltproc', a command line XSLT processor. The library is written in plain C, making as few assumptions as possible, and sticking closely to ANSI C/POSIX for easy embedding. It should work on Linux, Unix, and Windows. Though not designed primarily with performance in mind, libxslt seems to be a relatively fast processor. It also includes full support for the EXSLT set of extension functions as well as some common extensions present in other XSLT engines.
XIST is an extensible HTML and XML generator. It is also an XML parser with a very simple and Python-esque tree API. Every XML element type corresponds to a Python class, and these Python classes provide a conversion method to transform the XML tree (e.g. into HTML). XIST can be considered 'object-oriented XSLT'. XIST also includes a cross-platform templating language, Oracle utilities, and various other tools.
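The "one class per element type, each with a conversion method" idea can be sketched in a few lines of plain Python. This is only an illustration of the pattern: the names `Element`, `Para`, `Important`, and `convert` are hypothetical and are not taken from XIST's actual API.

```python
# Illustrative only: these classes are hypothetical and are not XIST's API.
from html import escape


class Element:
    """Base class; each XML element type gets its own subclass."""

    def __init__(self, *content):
        self.content = content

    def convert(self):
        # Default behaviour: convert every child and concatenate the results.
        return "".join(convert(child) for child in self.content)


def convert(node):
    # Elements convert themselves; anything else is treated as escaped text.
    if isinstance(node, Element):
        return node.convert()
    return escape(str(node))


class Para(Element):
    def convert(self):
        # A <para> element becomes an HTML <p>.
        return "<p>" + super().convert() + "</p>"


class Important(Element):
    def convert(self):
        # An <important> element becomes an HTML <strong>.
        return "<strong>" + super().convert() + "</strong>"


def render_example():
    tree = Para("Read ", Important("this & that"), " first.")
    return tree.convert()
```

Because each transformation lives in an ordinary method, it can be overridden or extended through subclassing, which is presumably what the 'object-oriented XSLT' comparison refers to.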
Xmldiff is a Python tool that finds the differences between two similar XML files in the same way the diff utility does for text files. A description of the changes found can be displayed using Xmldiff's syntax or as an XUpdate script that can be used to "patch" the original document.
NetCrawler is the frontend to a Web crawling system. This command line application will download all of the pages within a domain, and then parse and process all of the relevant content (images, text, audio, video), saving this content within an XML document for later processing. It is definitely alpha quality, but has been used quite extensively.
XPath Methods allows XPath queries on ParsedXML XML documents (and possibly other DOM implementations) in Zope. XPath is a relatively simple but still quite powerful query language used to address portions of XML documents. When you call an XPath Method you will retrieve a set of DOM nodes which you can then display in a Web page using DTML or ZPT, or which you can issue operations upon using, for instance, Python scripts.
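To give a feel for what "a query that returns a set of nodes" looks like, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module instead of Zope and ParsedXML. ElementTree supports only a subset of XPath, and the sample document and the `titles_for_year` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
DOC = """<library>
  <book year="2001"><title>XML Basics</title></book>
  <book year="2003"><title>XPath in Practice</title></book>
  <magazine year="2003"><title>Markup Monthly</title></magazine>
</library>"""


def titles_for_year(xml_text, year):
    root = ET.fromstring(xml_text)
    # Select the <title> child of every <book> whose year attribute matches.
    nodes = root.findall(f"./book[@year='{year}']/title")
    # The query yields element nodes; here we only keep their text.
    return [node.text for node in nodes]
```

The result of the query is a list of element nodes, which a caller could just as well render into a page or modify in place rather than reduce to text.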
xpath2rss is an XPath to RSS scraper. XPath makes a better HTML scraper than regex (the typical solution) because it understands the structure of the document, rather than just treating it as a big string. As a result, xpath2rss is a more reliable scraper, and much easier to use, once you get the hang of XPath.
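The general approach can be sketched as follows. This is not xpath2rss's own code or configuration format: it uses Python's standard `xml.etree.ElementTree` (which handles only a subset of XPath and requires well-formed markup), and the page, the `scrape_to_rss` function, and the example URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page; real-world HTML often needs a tolerant parser.
PAGE = """<html><body>
  <div class="nav"><a href="/about">About</a></div>
  <ul class="news">
    <li><a href="/story/1">First story</a></li>
    <li><a href="/story/2">Second story</a></li>
  </ul>
</body></html>"""


def scrape_to_rss(page, base_url):
    root = ET.fromstring(page)

    # Build a minimal RSS 2.0 skeleton.
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Scraped headlines"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Links found in the news list"

    # The path selects only links inside the news list, so the
    # navigation link is skipped without any string matching.
    for anchor in root.findall(".//ul[@class='news']/li/a"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = anchor.text
        ET.SubElement(item, "link").text = base_url + anchor.get("href")

    return ET.tostring(rss, encoding="unicode")
```

A regex matching every `<a href=...>` would also pick up the "About" link; the path expression excludes it because it describes where in the document the wanted links live.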