Apache OpenNLP is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.
The Parsing Expression Grammar Template Library (PEGTL) is a C++0x library for creating parsers according to a Parsing Expression Grammar (PEG). Grammars are embedded as regular C++ code, created with template programming (not template meta programming). These hierarchies naturally correspond to the inductive definition of PEGs. The library extends on the subject of PEGs with new expression types, actions that can be attached to grammar rules, and mechanisms to ensure helpful diagnostics in case of parsing errors. PEGs are superficially similar to Context-Free Grammars (CFGs).
WTMParse is a script originally intended for use in forensic examinations which parses WTMP files from Unix-like operating systems and generates a CSS-styled HTML report containing the login terminal, username, log start date, and login time/date in a table. It's good for postmortem forensic examinations or as a way of getting "last"-like information when you don't have the ability to boot the machine in question but can grab the wtmp.
YAJL (Yet Another JSON Library) is a small event-driven (SAX-style) JSON parser written in ANSI C, and a small validating JSON generator. It's highly portable, data representation independent, fast, generates verbose error messages including context of where the error occurs in the input text, can parse JSON data incrementally off a stream, and is tiny.
lihata is a compact textual language which can represent a tree of lists, hashes, and tables. The syntax tries to be minimal and flexible to allow formatting a lihata file to fit the context it represents. The source release contains an event and DoM parser and helper functions for maintaining lihata trees. lihata is a convenient language for both simple and complex configuration files and text representation of data files.