1 project tagged "Parser"

HtmlCleaner (updated 12 Feb 2013)

Pop 26.65
Vit 9.86

HtmlCleaner is an HTML parser. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. Any serious consumption of such documents first requires cleaning up the mess and bringing order to the tags, attributes, and ordinary text. For a given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those that most Web browsers use to create a Document Object Model. However, the user may provide custom tag and rule sets for tag filtering and balancing.
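To make the idea of tag balancing concrete, here is a minimal sketch in Python using only the standard library. It is not HtmlCleaner's code or API, and it does not reproduce browser rules; the names `TagBalancer`, `clean`, and `VOID_TAGS` are hypothetical and chosen for illustration. It only shows the general technique of tracking open elements on a stack, closing the ones left open, and dropping stray end tags.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never take a closing tag in HTML (partial, illustrative list).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagBalancer(HTMLParser):
    """Toy cleaner: closes unclosed tags and drops stray end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the output document
        self.stack = []  # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") reuse their name as the value.
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching open element: ignore it
        # Close every element left open inside the one being closed.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def clean(html):
    """Return a tag-balanced rendering of the given HTML fragment."""
    balancer = TagBalancer()
    balancer.feed(html)
    balancer.close()
    # Close whatever is still open at the end of the input.
    while balancer.stack:
        balancer.out.append(f"</{balancer.stack.pop()}>")
    return "".join(balancer.out)


print(clean("<p>Hello <b>world</p>"))  # prints <p>Hello <b>world</b></p>
```

This sketch is deliberately naive: an input such as `<ul><li>one<li>two</ul>` comes out with the second `li` nested inside the first, whereas browser-like rules would close the first item before opening the second. It also does not wrap multiple top-level elements in a single root, so the output is balanced but not always a complete XML document.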
