8 projects tagged "Parser"

jsoup (updated 11 Nov 2013)

Pop 188.71
Vit 14.18

jsoup is a Java library for working with real-world HTML. It can parse HTML from a URL, file, or string, and it can find and extract data using DOM traversal or CSS selectors. HTML elements, attributes, and text can be manipulated, and user-submitted content can be cleaned against a safe whitelist. jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating to invalid tag soup; in every case it will create a sensible parse tree.

Apache OpenNLP (updated 29 Nov 2011)

Pop 85.28
Vit 1.49

Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.

gradle-sablecc-plugin (updated 14 Jan 2014; no website)

Pop 59.67
Vit 1.00

gradle-sablecc-plugin is a Gradle plugin that creates parsers using SableCC. SableCC supports automatic CST-to-AST transformation, emits all the visitor patterns and analysis helpers you will likely ever need, and generates LR parsers rather than LL(k) ones. Many example grammars are available for modern languages; the author of this plugin has written dozens.

cardme (updated 02 Apr 2013; no website)

Pop 56.35
Vit 6.52

cardme is a Java library that implements RFC 2426 (vCard). It gives Java applications a way to read and write the vCard file format. The project's goals are to provide a flexible, easy-to-use library with excellent documentation.
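To give a rough idea of what reading the vCard format involves, here is a minimal sketch in plain Java. It does not use cardme's API; the class `VCardSketch` and its `parse` method are hypothetical names made up for this illustration. It only unfolds continuation lines and splits each content line into a property name and a value, and it ignores parameters, groups, escaping, and repeated properties, all of which a real library has to handle.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not cardme's API.
public class VCardSketch {

    // Reads "NAME[;params]:value" content lines into a map keyed by the
    // upper-cased property name. A repeated property overwrites the earlier one.
    public static Map<String, String> parse(String vcard) {
        // Unfold: a line break followed by one space or tab continues the previous line.
        String unfolded = vcard.replaceAll("\r?\n[ \t]", "");
        Map<String, String> properties = new LinkedHashMap<>();
        for (String line : unfolded.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a content line
            }
            String name = line.substring(0, colon);
            int semicolon = name.indexOf(';');
            if (semicolon >= 0) {
                name = name.substring(0, semicolon); // drop parameters such as TYPE=...
            }
            properties.put(name.trim().toUpperCase(Locale.ROOT), line.substring(colon + 1));
        }
        return properties;
    }
}
```

Calling `VCardSketch.parse(text).get("FN")` on a card would return its formatted name, if one is present.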

jmdb (updated 28 Mar 2010; no download, no website)

Pop 26.53
Vit 1.00

jmdb is a Java library for retrieving and parsing information from imdb.com.

Yap4j (updated 15 Sep 2011; no website)

Pop 23.11
Vit 30.80

Yap4j is the simplest library for parsing CSV files in Java. It deserializes CSV files into a list of POJOs using a set of Java annotations, which let you specify the Object-CSV mappings. It automatically converts to and from a wide range of data types, including types from popular libraries such as Joda Time, and it supports custom record delimiters.
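The annotation-driven mapping described here can be sketched in plain Java with reflection. This is not Yap4j's API: the `@CsvColumn` annotation, the `CsvSketch` class, and the `Person` POJO are hypothetical names invented for the example. The sketch assumes simple CSV with no quoting or escaping and converts only a handful of primitive types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not Yap4j's API.
public class CsvSketch {

    // Maps a field to a zero-based CSV column index.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface CsvColumn {
        int value();
    }

    // Example POJO used only for this illustration.
    public static class Person {
        @CsvColumn(0) public String name;
        @CsvColumn(1) public int age;
    }

    // Parses simple CSV (no quoting or escaping) into instances of the given type.
    // Records are separated by the given delimiter, cells by commas.
    public static <T> List<T> parse(String csv, Class<T> type, String recordDelimiter) {
        List<T> result = new ArrayList<>();
        try {
            for (String record : csv.split(Pattern.quote(recordDelimiter))) {
                if (record.trim().isEmpty()) {
                    continue; // skip blank records
                }
                String[] cells = record.split(",", -1);
                T instance = type.getDeclaredConstructor().newInstance();
                for (Field field : type.getDeclaredFields()) {
                    CsvColumn column = field.getAnnotation(CsvColumn.class);
                    if (column == null) {
                        continue; // unannotated fields are left untouched
                    }
                    field.setAccessible(true);
                    field.set(instance, convert(cells[column.value()].trim(), field.getType()));
                }
                result.add(instance);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot populate " + type.getName(), e);
        }
        return result;
    }

    // Converts a raw cell to the field's type; anything unrecognized stays a String.
    private static Object convert(String raw, Class<?> target) {
        if (target == int.class || target == Integer.class) return Integer.valueOf(raw);
        if (target == long.class || target == Long.class) return Long.valueOf(raw);
        if (target == double.class || target == Double.class) return Double.valueOf(raw);
        if (target == boolean.class || target == Boolean.class) return Boolean.valueOf(raw);
        return raw;
    }
}
```

With this sketch, `CsvSketch.parse("Ada,36;Linus,54", CsvSketch.Person.class, ";")` yields two `Person` objects, with `;` acting as the custom record delimiter.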

stupid-xml (updated 22 Apr 2009; no website)

Pop 18.14
Vit 42.79

stupid-xml is a ridiculously simple annotation-based XML stream parser for Java. The main goal of this project is to get the strings you care about out of XML and into Java as quickly as possible. You define a simple model class, specify the relative paths for its fields, and it will start generating instances for you from an XML stream. The functionality is limited. It will only parse Strings into your model, but this keeps everything extremely simple. Once you have the Strings in your model, you can perform filtering or more complex conversions.
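The general approach (a model class whose String fields carry relative paths, filled from an XML stream) might look roughly like the following sketch, built on the JDK's StAX reader. It is not stupid-xml's API: the `@XmlPath` annotation, the `XmlStreamSketch` class, and the `Book` model are hypothetical names chosen for this illustration, and the sketch assumes that each record is a single element whose name is passed in by the caller.

```java
import java.io.StringReader;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; this is not stupid-xml's API.
public class XmlStreamSketch {

    // Path of a field's element, relative to the record element, e.g. "author/name".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface XmlPath {
        String value();
    }

    // Example model used only for this illustration; all fields are Strings.
    public static class Book {
        @XmlPath("title") public String title;
        @XmlPath("author/name") public String author;
    }

    // Creates one instance of the model per occurrence of recordElement in the stream.
    public static <T> List<T> parse(String xml, String recordElement, Class<T> type) {
        List<T> result = new ArrayList<>();
        Deque<String> path = new ArrayDeque<>(); // element names below the record element
        StringBuilder text = new StringBuilder();
        T current = null;
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (current == null) {
                            if (reader.getLocalName().equals(recordElement)) {
                                current = type.getDeclaredConstructor().newInstance();
                            }
                        } else {
                            path.addLast(reader.getLocalName());
                            text.setLength(0);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (current != null) {
                            text.append(reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (current == null) {
                            break; // outside any record
                        }
                        if (path.isEmpty()) { // the record element itself is closing
                            result.add(current);
                            current = null;
                        } else {
                            assign(current, String.join("/", path), text.toString().trim());
                            path.removeLast();
                            text.setLength(0);
                        }
                        break;
                    default:
                        break;
                }
            }
        } catch (XMLStreamException | ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot parse XML into " + type.getName(), e);
        }
        return result;
    }

    // Sets every field whose @XmlPath equals the path of the element that just closed.
    private static void assign(Object target, String relativePath, String value)
            throws IllegalAccessException {
        for (Field field : target.getClass().getDeclaredFields()) {
            XmlPath annotation = field.getAnnotation(XmlPath.class);
            if (annotation != null && annotation.value().equals(relativePath)) {
                field.setAccessible(true);
                field.set(target, value);
            }
        }
    }
}
```

As in the description, only Strings are populated; any filtering or type conversion would happen afterwards on the resulting model objects.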

HtmlCleaner (updated 12 Feb 2013)

Pop 16.97
Vit 20.77

HtmlCleaner is an HTML parser. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. For any serious consumption of such documents, it is necessary to first clean up the mess and bring order to the tags, attributes, and ordinary text. For a given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those which most Web browsers use to create a Document Object Model. However, the user may provide custom tag and rule sets for tag filtering and balancing.