ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing C++, Java, or Sather actions. It is similar to the popular compiler generator YACC, but is much more powerful and easier to use. ANTLR-produced parsers are not only highly efficient but also human-readable and human-debuggable (especially with the interactive ParseView debugging tool). ANTLR can generate parsers, lexers, and tree parsers in C++, Java, or Sather. ANTLR itself is currently written in Java.
Apache Cocoon is a Web development framework built around the concepts of separation of concerns and component-based Web development. Cocoon implements these concepts around the notion of "component pipelines", in which each component on the pipeline specializes in a particular operation. This makes it possible to use a Lego(tm)-like approach to building Web solutions, hooking components together into pipelines without requiring any programming.
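The component-pipeline idea in the Cocoon entry above can be illustrated with a minimal sketch. This is plain Python, not Cocoon's API or configuration format; the stage names (`strip_blank_lines`, `uppercase_titles`, `wrap_in_paragraphs`) and the `build_pipeline` helper are hypothetical and exist only to show single-purpose components being hooked together.

```python
from typing import Callable, Iterable, List

# Each stage takes a document (here simply a list of lines) and returns a new one.
Stage = Callable[[List[str]], List[str]]


def build_pipeline(stages: Iterable[Stage]) -> Stage:
    """Compose stages left to right into a single callable."""
    stage_list = list(stages)

    def run(document: List[str]) -> List[str]:
        for stage in stage_list:
            document = stage(document)
        return document

    return run


# Hypothetical components, each specializing in one operation.
def strip_blank_lines(lines: List[str]) -> List[str]:
    return [line for line in lines if line.strip()]


def uppercase_titles(lines: List[str]) -> List[str]:
    return [line.upper() if line.startswith("# ") else line for line in lines]


def wrap_in_paragraphs(lines: List[str]) -> List[str]:
    return [f"<p>{line}</p>" for line in lines]


pipeline = build_pipeline([strip_blank_lines, uppercase_titles, wrap_in_paragraphs])
result = pipeline(["# news", "", "Cocoon released."])
```

Reordering or swapping stages changes the output without touching any individual component, which is the property the pipeline approach relies on.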
Document Structure Description (DSD) is a simple but expressive grammar notation for XML documents. This new XML schema language is the result of a research collaboration between AT&T Labs, NJ and BRICS at the University of Aarhus, Denmark. The technology is based on general and familiar concepts that allow much stronger document descriptions than is possible with DTDs or XML schemas.
DSML is the Directory Services Markup Language, an XML dialect for working with directory information. The DSML Tools support querying any LDAP directory (with search results output as DSML), importing DSML data into any LDAP directory, validating DSML against a directory context (checking for illegal attributes in the entries, etc.), and calculating the differences between two DSML documents for a given directory, which provides an XML diff algorithm for DSML data. This software makes all LDAP-supporting directories DSML-enabled. It can also check the integrity of DSML data and show at a glance how two data sets, represented as DSML, differ.
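As a rough illustration of what a difference calculation between two directory data sets involves, the sketch below compares two sets of entries keyed by distinguished name. It is not the DSML Tools' algorithm and does not parse DSML; the `Directory` structure, the `diff_directories` function, and the sample entries are assumptions made for the example.

```python
from typing import Dict, List

Entry = Dict[str, List[str]]      # attribute name -> attribute values
Directory = Dict[str, Entry]      # distinguished name -> entry


def _normalize(entry: Entry) -> Entry:
    """Sort values so that value order alone is not reported as a change."""
    return {name: sorted(values) for name, values in entry.items()}


def diff_directories(old: Directory, new: Directory) -> Dict[str, List[str]]:
    """Report which entries were added, removed, or changed between two data sets."""
    old_dns, new_dns = set(old), set(new)
    return {
        "added": sorted(new_dns - old_dns),
        "removed": sorted(old_dns - new_dns),
        "changed": sorted(
            dn for dn in old_dns & new_dns
            if _normalize(old[dn]) != _normalize(new[dn])
        ),
    }


# Hypothetical sample data.
old = {
    "cn=a,dc=example": {"mail": ["a@example.com"]},
    "cn=b,dc=example": {"mail": ["b@example.com"]},
}
new = {
    "cn=a,dc=example": {"mail": ["a@example.org"]},
    "cn=c,dc=example": {"mail": ["c@example.com"]},
}
report = diff_directories(old, new)
```

A real tool would also have to account for directory-specific rules, such as how attribute names and values are matched, which this sketch ignores.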
FreeMarker is a template engine that was originally designed so that servlet-based applications could keep graphical design separate from application logic. The templates provide an easy and highly flexible way to generate any kind of text output (HTML, PostScript, TeX, source code, etc.) from a variety of data sources such as Java objects, Jython objects, XML object models, and more.
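The separation of design from application logic that the FreeMarker entry describes can be sketched with Python's standard `string.Template` as a stand-in. FreeMarker's own template language and Java API are not shown here; the template text and the `data_model` contents are hypothetical.

```python
from string import Template

# The "design" side: a template that a page designer could edit on its own.
page = Template("<h1>${title}</h1>\n<p>Welcome, ${user}!</p>")

# The "logic" side: the application supplies the data to be merged in.
data_model = {"title": "Orders", "user": "Ada"}

html = page.substitute(data_model)
```

The template can be changed without touching the code that builds `data_model`, and the same data can be rendered through a different template to produce another output format.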
GPP is a general-purpose preprocessor with customizable syntax, suitable for a wide range of preprocessing tasks. Its independence from any programming language makes it much more versatile than cpp, while its syntax is lighter and more flexible than that of m4. The syntax is fully customizable, which makes it possible to process text files, HTML, or source code equally efficiently in a variety of languages.
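To show why a customizable syntax matters for a preprocessor like the one in the GPP entry, here is a minimal sketch of macro expansion with caller-chosen delimiters. It does not reproduce GPP's actual syntax, modes, or options; the `expand` function, the delimiters, and the macro names are assumptions for illustration.

```python
import re
from typing import Dict


def expand(text: str, macros: Dict[str, str], start: str, end: str) -> str:
    """Replace start+NAME+end with macros[NAME]; unknown names are left untouched."""
    pattern = re.compile(re.escape(start) + r"(\w+)" + re.escape(end))
    return pattern.sub(lambda m: macros.get(m.group(1), m.group(0)), text)


macros = {"AUTHOR": "Ada", "PROJECT": "demo"}

# The same macros applied with delimiters chosen to suit each host format.
html_out = expand("<p>{{PROJECT}} by {{AUTHOR}}</p>", macros, "{{", "}}")
c_out = expand("/* @PROJECT@ by @AUTHOR@ */", macros, "@", "@")
```

Because the delimiters are parameters rather than fixed tokens, the macro syntax can be picked so that it does not clash with the language of the file being processed.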
Grok is a library of Java components for performing various natural language processing tasks. These include several preprocessing tasks, chart parsing, a large categorial grammar for English (induced from the Penn Treebank), and some knowledge representation components (basic coreference, salience tracking, etc.). The library also has a companion kit which provides a graphical interface to the components, several of which are implementations of interfaces in the Quipu OpenNLP API.