xpath2rss

xpath2rss is an XPath-to-RSS scraper. XPath makes a better HTML scraper than regular expressions (the typical solution) because it understands the structure of the document rather than treating it as one big string. As a result, xpath2rss is more reliable and, once you get the hang of XPath, much easier to use.
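
To illustrate why a structural query is more robust than a pattern over raw text, here is a minimal sketch that uses the limited XPath subset in Python's standard xml.etree.ElementTree. It is not xpath2rss's own code: the sample page, the extract_items helper, and the path expression are all hypothetical, and the input is assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed XHTML page (an assumption for illustration only).
PAGE = """<html><body>
<div class="news">
  <h2><a href="/a">First story</a></h2>
  <h2><a href="/b">Second story</a></h2>
</div>
<div class="ads"><h2><a href="/x">Buy now</a></h2></div>
</body></html>"""


def extract_items(xhtml, item_path):
    """Return (title, link) pairs for every <a> element matched by item_path."""
    root = ET.fromstring(xhtml)
    return [(a.text, a.get("href")) for a in root.findall(item_path)]


# Select only the headline links inside the "news" block. The ad link has the
# same markup but a different parent, so the path expression skips it.
items = extract_items(PAGE, ".//div[@class='news']/h2/a")
for title, link in items:
    print(title, link)
```

A regex matching every h2 link would also pick up the ad. The path expression excludes it because it selects by position in the document tree, not by surface text.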

Recent releases

  •  17 May 2002 20:07

    Release Notes: This release adds support for HTML and XHTML (see the README for caveats), arbitrary channel metadata and item attributes, detection of a link tag pointing to available feed(s) when scraping, and a robots.txt check to be polite. The config file format has changed (hopefully for the last time), and an XSLT stylesheet is included to convert config files. RSS generation now uses RSS.py, and the code has had a general cleanup.

  •  31 Jan 2002 07:28

    Release Notes: This release changes xml.sax.writer to the spiffier xml.sax.saxutils.XMLGenerator, and fixes more encoding errors.
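
For readers unfamiliar with xml.sax.saxutils.XMLGenerator, the sketch below shows roughly how it can write a small RSS document. It is written against the current Python 3 standard library and does not reproduce xpath2rss's actual code. The write_rss and _text_element helpers, the RSS version, and the sample channel data are assumptions made for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def _text_element(gen, name, text):
    """Write <name>text</name>; characters() escapes &, < and > for us."""
    gen.startElement(name, AttributesImpl({}))
    gen.characters(text)
    gen.endElement(name)


def write_rss(title, link, items, encoding="utf-8"):
    """Hypothetical helper: serialize (title, link) pairs as a minimal RSS feed."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding)
    gen.startDocument()  # emits the XML declaration with the chosen encoding
    gen.startElement("rss", AttributesImpl({"version": "2.0"}))
    gen.startElement("channel", AttributesImpl({}))
    _text_element(gen, "title", title)
    _text_element(gen, "link", link)
    for item_title, item_link in items:
        gen.startElement("item", AttributesImpl({}))
        _text_element(gen, "title", item_title)
        _text_element(gen, "link", item_link)
        gen.endElement("item")
    gen.endElement("channel")
    gen.endElement("rss")
    gen.endDocument()
    return out.getvalue()


feed = write_rss(
    "Example news",
    "http://example.com/",
    [("Tom & Jerry", "http://example.com/a")],
)
print(feed)
```

Because the generator escapes character data itself, a title containing an ampersand comes out as valid XML without any manual string handling.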
