Jericho HTML Parser

Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognized or invalid HTML. It also provides high-level HTML form manipulation functions.

Recent releases

  •  05 Mar 2011 08:18

    Release Notes: This version includes important bugfixes and various enhancements, including HTML5 support.

  •  11 Jun 2009 12:09

    Release Notes: This version includes important bugfixes and a new stream-based parsing option that allows memory-efficient processing of large files.

  •  10 Apr 2009 10:06

    Release Notes: This version is a major new release that requires the Java 5 runtime or later. It introduces major API changes such as generics and enums, as well as some new features.

  •  25 Jun 2008 06:56

    Release Notes: This version includes important bugfixes and the following enhancements: non-server tags are no longer recognized inside server tags; Microsoft downlevel-revealed conditional comments are recognized; all unnecessary white space may be removed from a source document. Various other enhancements were made to existing features.

  •  02 Sep 2007 05:41

    Release Notes: This version includes important bugfixes and introduces the following minor enhancements: elements inside SCRIPT elements are ignored; encoding detection and analysis were improved; parsing of attributes containing server tags was improved.

Recent comments

09 Mar 2010 18:37 marcu Thumbs up

it just works.