xpath2rss

xpath2rss is a scraper that uses XPath to turn web pages into RSS. XPath makes a better HTML scraper than regular expressions (the typical solution) because it understands the structure of the document instead of treating it as one big string. As a result, xpath2rss is more reliable than a regex-based scraper and, once you get the hang of XPath, much easier to use.
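To make the structure argument concrete, here is a minimal sketch that runs an XPath query and a regex over the same markup. It does not use xpath2rss itself: the page fragment, the function names, and the regex are all invented for illustration, and it relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`, which requires well-formed XHTML.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical well-formed XHTML fragment, invented for this example.
PAGE = """<html><body>
<div class="news">
  <a href="/story/1">First story</a>
  <a
     href="/story/2" class="hot">Second story</a>
</div>
<div class="ads"><a href="/buy">Buy now</a></div>
</body></html>"""


def scrape_with_xpath(page):
    """Select links by their position in the document tree."""
    root = ET.fromstring(page)
    # Only <a> elements that are direct children of <div class="news">.
    links = root.findall(".//div[@class='news']/a")
    return [(a.text, a.get("href")) for a in links]


def scrape_with_regex(page):
    """Select links by matching the page as one big string."""
    # Assumes one exact attribute layout and has no idea which <div>
    # a link sits in.
    return re.findall(r'<a href="([^"]+)">([^<]+)</a>', page)


xpath_items = scrape_with_xpath(PAGE)
regex_items = scrape_with_regex(PAGE)
```

In this sketch the XPath query returns both news links and nothing else. The regex misses the second story, whose tag is laid out differently, and picks up the advertisement link, because it cannot tell one `div` from another.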

Recent releases

  •  17 May 2002 16:07

Release Notes: This release adds support for HTML and XHTML (see README for caveats) and for arbitrary channel metadata and item attributes. When scraping, it checks for a link tag pointing to available feed(s) and checks robots.txt to be polite. The config file format is new (hopefully the last time it will change), and an XSLT stylesheet is included to convert config files. RSS generation now uses RSS.py, and the code has had a general cleanup.

  •  31 Jan 2002 02:28

Release Notes: This release changes xml.sax.writer to the spiffier xml.sax.saxutils.XMLGenerator, and fixes more encoding errors.
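For readers unfamiliar with xml.sax.saxutils.XMLGenerator, the sketch below shows how it can emit a small RSS document while handling character escaping and the encoding declaration. This is not code from xpath2rss: the function names, the element layout, and the "0.91" version string are assumptions chosen for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

NO_ATTRS = AttributesImpl({})


def _text_element(gen, name, text):
    """Write <name>text</name>; XMLGenerator escapes &, < and > in text."""
    gen.startElement(name, NO_ATTRS)
    gen.characters(text)
    gen.endElement(name)


def write_rss(channel_title, items, encoding="utf-8"):
    """Return an RSS document as a string.

    `items` is a list of (title, link) pairs. The structure here is a
    hypothetical minimal feed, not the output format of xpath2rss.
    """
    buf = io.StringIO()
    gen = XMLGenerator(buf, encoding)
    gen.startDocument()  # writes the XML declaration with the encoding
    gen.startElement("rss", AttributesImpl({"version": "0.91"}))
    gen.startElement("channel", NO_ATTRS)
    _text_element(gen, "title", channel_title)
    for title, link in items:
        gen.startElement("item", NO_ATTRS)
        _text_element(gen, "title", title)
        _text_element(gen, "link", link)
        gen.endElement("item")
    gen.endElement("channel")
    gen.endElement("rss")
    gen.endDocument()
    return buf.getvalue()


rss = write_rss("Example feed", [("A & B", "http://example.com/1")])
```

Because the generator escapes text passed to `characters()`, an item title such as "A & B" comes out as valid XML without any manual string handling.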
