xpath2rss


xpath2rss is a scraper that uses XPath expressions to turn Web pages into RSS feeds. XPath makes for a better HTML scraper than regular expressions (the typical solution) because it works on the structure of the document rather than treating it as one big string. As a result, xpath2rss is more reliable than a regex-based scraper and, once you get the hang of XPath, much easier to use.
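To illustrate the idea, here is a minimal sketch of XPath-based scraping into RSS. It is not xpath2rss's own code or config format: the sample page, the function names `scrape_items` and `to_rss`, and the choice of RSS 2.0 are all assumptions made for the example. It uses only the Python standard library, whose `xml.etree.ElementTree` supports a limited XPath subset and needs well-formed (XHTML-style) input.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator

# Hypothetical well-formed page; the sidebar link has the same markup
# shape as a post link, which is what trips up naive string matching.
PAGE = """<html><body>
  <div class="post"><h2><a href="/a">First post</a></h2></div>
  <div class="sidebar"><h2><a href="/ad">Not a post</a></h2></div>
  <div class="post"><h2><a href="/b">Second post</a></h2></div>
</body></html>"""


def scrape_items(page, item_path):
    """Return (title, link) pairs for every element matched by item_path."""
    root = ET.fromstring(page)
    return [(a.text, a.get("href")) for a in root.findall(item_path)]


def _text_element(gen, name, text):
    """Write <name>text</name> through the SAX generator."""
    gen.startElement(name, {})
    gen.characters(text)
    gen.endElement(name)


def to_rss(channel_title, items):
    """Serialize the items as a small RSS 2.0 document (assumed version)."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("rss", {"version": "2.0"})
    gen.startElement("channel", {})
    _text_element(gen, "title", channel_title)
    for title, link in items:
        gen.startElement("item", {})
        _text_element(gen, "title", title)
        _text_element(gen, "link", link)
        gen.endElement("item")
    gen.endElement("channel")
    gen.endElement("rss")
    gen.endDocument()
    return out.getvalue()


# The path selects links only inside div elements with class "post".
items = scrape_items(PAGE, ".//div[@class='post']/h2/a")
rss = to_rss("Example feed", items)
```

Because the path names the enclosing `div`, the sidebar link is skipped without any extra filtering, which is the structural advantage described above.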

Recent releases

  •  17 May 2002 20:07

    Release Notes: This release has support for HTML and XHTML (see README for caveats), has arbitrary channel metadata and item attributes, checks for a link tag pointing to available feed(s) when scraping, checks robots.txt to be polite, has a new config file format (hopefully the last time it will change), includes an XSLT stylesheet to convert config files, has changed to for RSS generation, and has general code cleanup.

  •  31 Jan 2002 07:28

    Release Notes: This release changes xml.sax.writer to the spiffier xml.sax.saxutils.XMLGenerator, and fixes more encoding errors.
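Two of the behaviours mentioned in the release notes, checking robots.txt and looking for a link tag that points to an existing feed, can be sketched with the Python standard library. This is an illustration only, not the project's implementation: the user-agent string, the sample robots.txt and page, the helper names, and the set of feed MIME types are assumptions.

```python
import urllib.robotparser
import xml.etree.ElementTree as ET

# Assumed MIME types commonly used for feed autodiscovery.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# Hypothetical robots.txt and page, inlined so the sketch needs no network.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

PAGE = """<html><head>
  <link rel="stylesheet" type="text/css" href="/style.css" />
  <link rel="alternate" type="application/rss+xml" href="/feed.rss" />
</head><body><p>News</p></body></html>"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def find_feeds(page):
    """Return href values of link elements that advertise a feed."""
    root = ET.fromstring(page)
    return [
        link.get("href")
        for link in root.findall(".//link[@rel='alternate']")
        if link.get("type") in FEED_TYPES
    ]


allowed = may_fetch(ROBOTS_TXT, "example-scraper", "http://example.com/news")
blocked = may_fetch(ROBOTS_TXT, "example-scraper", "http://example.com/private/x")
feeds = find_feeds(PAGE)
```

If `find_feeds` returns anything, the site already publishes a feed and scraping may be unnecessary.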

