2 projects tagged "HTML cleaner"

AlchemyAPI Android SDK (updated 11 Dec 2010)

Pop 81.50
Vit 35.01

The AlchemyAPI Android SDK enables real-time semantic analysis of text, HTML, or Internet-hosted Web page content. The SDK provides mechanisms to extract concepts, named entities, keywords and tags, and categories, to clean HTML into plain text, and to detect the language of a document. It can analyze text in eight languages: English, French, German, Italian, Portuguese, Russian, Spanish, and Swedish. Example code and a demo application are included to help you get started.

HtmlCleaner (updated 12 Feb 2013)

Pop 16.85
Vit 20.74

HtmlCleaner is an HTML parser. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. For any serious consumption of such documents, it is necessary to first clean up the mess and bring order to the tags, attributes, and ordinary text. For a given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those which most Web browsers use to create a Document Object Model. However, the user may provide custom tag and rule sets for tag filtering and balancing.
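
To make the idea of tag balancing concrete, here is a minimal sketch in Python using only the standard library. It is not HtmlCleaner's algorithm or API (HtmlCleaner is a Java library with a much richer rule set). The names `TagBalancer`, `balance`, and `VOID_TAGS`, and the `<root>` wrapper element, are assumptions made for this illustration. The sketch closes unclosed elements, drops stray closing tags, self-closes a few void elements, and escapes text so that the result parses as XML.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Assumption: a small, incomplete set of elements that never have content.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagBalancer(HTMLParser):
    """Toy tag balancer: emits XML-style markup from loosely written HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the cleaned output
        self.stack = []  # names of currently open elements

    def handle_starttag(self, tag, attrs):
        seen = {}
        for name, value in attrs:
            # Keep the first occurrence of a duplicated attribute;
            # valueless attributes (e.g. "checked") get their own name.
            seen.setdefault(name, name if value is None else value)
        attr_text = "".join(f" {n}={quoteattr(v)}" for n, v in seen.items())
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close every element opened after `tag`, then `tag` itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def balance(html):
    """Return `html` as a well-nested fragment wrapped in a <root> element."""
    parser = TagBalancer()
    parser.feed(html)
    parser.close()
    while parser.stack:  # close whatever is still open at end of input
        parser.out.append(f"</{parser.stack.pop()}>")
    return "<root>" + "".join(parser.out) + "</root>"


print(balance("<p>one<b>two</p><br>three</i>"))
# prints: <root><p>one<b>two</b></p><br/>three</root>
```

Comments, doctypes, and attribute names that are not valid XML names are ignored or passed through here, so this is far from a complete cleaner. A real tool such as HtmlCleaner also applies browser-like rules about which elements may contain which others.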
