gradle-sablecc-plugin is a Gradle plugin that generates parsers using SableCC. SableCC supports automatic CST-to-AST transformation, emits all the visitor patterns and analysis helpers you will likely ever need, and generates LR parsers rather than LL(k) ones. Many example grammars are available for modern languages; the author of this plugin has written dozens.
jsoup is a Java library for working with real-world HTML. It can parse HTML from a URL, file, or string, find and extract data using DOM traversal or CSS selectors, and manipulate HTML elements, attributes, and text. It can also clean user-submitted content against a safe white-list. jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating to invalid tag-soup; in every case it will create a sensible parse tree.
lihata is a compact textual language which can represent a tree of lists, hashes, and tables. The syntax tries to be minimal and flexible, so that a lihata file can be formatted to fit the context it represents. The source release contains an event parser, a DOM parser, and helper functions for maintaining lihata trees. lihata is a convenient language for both simple and complex configuration files and for text representation of data files.
PHP Emoticon Parser replaces emoticon text with HTML image tags. It searches a given text string for emoticon character sequences and replaces them with the equivalent emoticon images. The mappings between emoticon text and images are defined in a separate script, which maps each emoticon name to the different text symbols that represent it.
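The mapping-and-replace approach described above can be sketched as follows. This is a minimal illustration in Python, not the package's own PHP code: the `EMOTICONS` table, the `replace_emoticons` function, the `images` directory, and the `.png` naming scheme are all hypothetical, and the input text is assumed to be HTML-safe already.

```python
import html
import re

# Hypothetical mapping: emoticon name -> text symbols that represent it.
EMOTICONS = {
    "smile": [":)", ":-)"],
    "sad": [":(", ":-("],
    "wink": [";)", ";-)"],
}


def replace_emoticons(text, mapping=EMOTICONS, image_dir="images"):
    """Replace every known emoticon symbol in text with an <img> tag."""
    # Reverse lookup: text symbol -> emoticon name.
    lookup = {sym: name for name, syms in mapping.items() for sym in syms}
    # Try longer symbols first so ":-)" is never matched as a shorter one.
    symbols = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(sym) for sym in symbols))

    def to_img(match):
        symbol = match.group(0)
        return '<img src="{}/{}.png" alt="{}">'.format(
            image_dir, lookup[symbol], html.escape(symbol, quote=True)
        )

    return pattern.sub(to_img, text)
```

Calling `replace_emoticons("Hi :)")` would return the string with `:)` swapped for an image tag pointing at `images/smile.png`. Keeping the table separate from the replacement logic mirrors the description above, where the mappings live in their own script.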
WTMParse is a script, originally intended for forensic examinations, that parses wtmp files from Unix-like operating systems and generates a CSS-styled HTML report with a table of the login terminal, username, log start date, and login time/date. It is useful for postmortem forensic examinations, or for getting "last"-like information when you cannot boot the machine in question but can grab its wtmp file.
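To illustrate the kind of parsing involved, here is a rough Python sketch that is independent of WTMParse itself. It assumes the 384-byte little-endian `struct utmp` record layout used by glibc on Linux; other Unix-like systems use different layouts. The names `parse_wtmp` and `to_html_table` are hypothetical.

```python
import html
import struct
from datetime import datetime, timezone

# Assumed layout: glibc's struct utmp on Linux (384 bytes, little-endian).
# type, pad, pid, line, id, user, host, exit status (2 shorts),
# session, tv_sec, tv_usec, addr_v6 (4 ints), unused.
UTMP_FORMAT = "<h2xi32s4s32s256shhiii4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FORMAT)
USER_PROCESS = 7  # ut_type value for a normal user login


def _text(raw):
    """Decode a NUL-padded fixed-width field."""
    return raw.split(b"\0", 1)[0].decode("utf-8", errors="replace")


def parse_wtmp(data):
    """Return one dict per USER_PROCESS record found in raw wtmp bytes."""
    logins = []
    for offset in range(0, len(data) - UTMP_SIZE + 1, UTMP_SIZE):
        fields = struct.unpack_from(UTMP_FORMAT, data, offset)
        if fields[0] != USER_PROCESS:
            continue  # skip boot, logout, and other record types
        logins.append({
            "terminal": _text(fields[2]),
            "user": _text(fields[4]),
            "host": _text(fields[5]),
            "time": datetime.fromtimestamp(fields[9], tz=timezone.utc),
        })
    return logins


def to_html_table(logins):
    """Render the parsed logins as a bare HTML table (styling omitted)."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            html.escape(r["terminal"]),
            html.escape(r["user"]),
            r["time"].isoformat(),
        )
        for r in logins
    )
    header = "<tr><th>Terminal</th><th>User</th><th>Login time</th></tr>"
    return "<table>" + header + rows + "</table>"
```

Given a wtmp file copied from an image of the machine, something like `to_html_table(parse_wtmp(open("wtmp", "rb").read()))` would produce the table. The timestamps are interpreted as UTC here, which is a simplification; a forensic report would normally state the time zone explicitly.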