51 projects tagged "Text Processing"
TXR is a new data munging language to replace the likes of awk and Perl. TXR's special pattern language provides template-based matching of entire documents or large sections of documents. It also contains a language for functional and imperative programming. It is written in C and takes the form of a utility that is portable to Unix-like platforms and Windows.
ClearParse is a flexible engine that can be used for any parsing task including interpreting or compiling programming languages, analyzing or converting data files, processing command line parameters and user input, implementing markup languages and scripts, natural language processing (NLP), and more.
PolyGen is a program for generating random sentences according to a grammar definition, that is, following custom syntactic and lexical rules. Formally, it is an interpreter of a language that is itself designed to define languages, where interpreting means executing a source program in real time and eventually outputting its result. Here, a source program is a grammar definition. Execution consists of exploring the grammar by selecting a random path, and the result is the sentence built along the way.
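The idea of exploring a grammar along a random path can be illustrated with a minimal Python sketch. This is not PolyGen's grammar syntax or implementation; the `GRAMMAR` dictionary, `generate`, and `random_sentence` are hypothetical names chosen for illustration only.

```python
import random

# Hypothetical toy grammar (not PolyGen syntax): each nonterminal maps to
# a list of alternative productions; anything not listed is a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["parser"], ["grammar"], ["sentence"]],
    "V": [["builds"], ["reads"], ["sleeps"]],
}


def generate(symbol, grammar, rng):
    """Expand `symbol` by following one randomly chosen path through the grammar."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit the word as-is
    production = rng.choice(grammar[symbol])  # pick one alternative at random
    words = []
    for part in production:
        words.extend(generate(part, grammar, rng))
    return words


def random_sentence(grammar=GRAMMAR, start="S", seed=None):
    """Return one random sentence; a fixed `seed` makes the result repeatable."""
    rng = random.Random(seed)
    return " ".join(generate(start, grammar, rng))


if __name__ == "__main__":
    print(random_sentence())
```

Each call walks the grammar from the start symbol, so the sentence is assembled word by word as the random path is explored.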
SPindent (Server Page Indenter) is a JSP/PHP structural validator and indenter. It performs a structural compatibility check of the inner HTML generated from "parallel" branches of process flow statements such as if/else. It allows those HTML branches to have different entry and exit HTML stack points, as long as the branches are compatible. This allows for verification and proper indentation of handy workarounds, as well as rusty pyramids. It is based on MixedCC (Mixed Compiler Compiler).
pyPEG is a quick and easy solution for creating a parser in Python programs. pyPEG uses a PEG language expressed in Python data structures to parse, so it can be used dynamically to parse nearly every context-free language. The output is a plain Python data structure called pyAST, or, as an alternative, XML.
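To give a feel for what "a PEG expressed in Python data structures" can mean, here is a minimal sketch of a PEG-style matcher. It does not use or reproduce pyPEG's actual API or its pyAST format; the `Ref` class, the `parse` function, and the conventions (tuple for sequence, list for ordered choice) are assumptions made for this example.

```python
import re


class Ref:
    """Reference to a named rule (hypothetical helper, not pyPEG's API)."""

    def __init__(self, name):
        self.name = name


def parse(expr, text, pos, rules):
    """Try to match `expr` at text[pos:]; return (tree, new_pos) or None."""
    if isinstance(expr, str):  # literal string
        if text.startswith(expr, pos):
            return expr, pos + len(expr)
        return None
    if isinstance(expr, re.Pattern):  # regex terminal
        m = expr.match(text, pos)
        return (m.group(), m.end()) if m else None
    if isinstance(expr, Ref):  # named rule: wrap the result in a labelled node
        result = parse(rules[expr.name], text, pos, rules)
        if result is None:
            return None
        tree, new_pos = result
        return (expr.name, tree), new_pos
    if isinstance(expr, tuple):  # sequence: every part must match in order
        children = []
        for part in expr:
            result = parse(part, text, pos, rules)
            if result is None:
                return None
            tree, pos = result
            children.append(tree)
        return children, pos
    if isinstance(expr, list):  # ordered choice: first matching alternative wins
        for alternative in expr:
            result = parse(alternative, text, pos, rules)
            if result is not None:
                return result
        return None
    raise TypeError(f"unsupported grammar element: {expr!r}")


# The grammar itself is plain Python data, so it could be built at run time.
RULES = {
    "number": re.compile(r"\d+"),
    "sum": [(Ref("number"), "+", Ref("sum")), Ref("number")],
}

if __name__ == "__main__":
    print(parse(Ref("sum"), "1+22", 0, RULES))
```

Because the grammar is ordinary data rather than generated code, rules can be added or swapped while the program runs, which is the kind of dynamic use the description refers to.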
Frink is a calculating tool and programming language designed to help you in the real world. It tracks units of measurement throughout all calculations and ensures that answers are correct. It converts between systems of measurement, and has a huge library of physical data. It is both a simple calculator for quick calculations and a full-fledged programming language for large tasks. It draws high-quality graphics, handles conversions between time zones, currencies, and historical values of the U.S. dollar and the British pound, translates between several languages, does date/time math, and more.
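Tracking units through a calculation can be sketched in a few lines of Python. This is not Frink code and says nothing about how Frink is implemented; the `Quantity` class and the unit constants below are hypothetical, and the sketch covers only addition, multiplication, division, and conversion.

```python
class Quantity:
    """A magnitude in base units plus a map of dimension exponents (illustrative only)."""

    def __init__(self, value, dims):
        self.value = value
        # Drop zero exponents so that, for example, m/m compares equal to "no unit".
        self.dims = {k: v for k, v in dims.items() if v != 0}

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError(f"incompatible units: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        dims = dict(self.dims)
        for k, v in other.dims.items():
            dims[k] = dims.get(k, 0) + v  # exponents add under multiplication
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        dims = dict(self.dims)
        for k, v in other.dims.items():
            dims[k] = dims.get(k, 0) - v  # exponents subtract under division
        return Quantity(self.value / other.value, dims)

    def to(self, unit):
        """Express this quantity as a plain number of `unit`."""
        if self.dims != unit.dims:
            raise TypeError(f"cannot convert {self.dims} to {unit.dims}")
        return self.value / unit.value


# Hypothetical unit table, with metre and second as base units.
METER = Quantity(1.0, {"m": 1})
SECOND = Quantity(1.0, {"s": 1})
FOOT = Quantity(0.3048, {"m": 1})
HOUR = Quantity(3600.0, {"s": 1})

if __name__ == "__main__":
    speed = Quantity(36000.0, {}) * METER / HOUR  # 36 km/h
    print(speed.to(METER / SECOND))
```

Adding a length to a time raises `TypeError` in this sketch, which is the kind of error a unit-tracking tool is meant to catch before it produces a wrong answer.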
CodeWorker is a versatile parsing tool and a universal source code generator. It interprets a scripting language for producing reusable, tailor-made, evolving, and reliable IT systems with a high level of automation. The file formats to parse are described in an extended-BNF syntax. Template-based scripts drive the writing of patterns for generating code or text. The code generator preserves protected areas containing hand-typed code and provides code expansion, source-to-source translation, and program transformation. It can also translate CodeWorker scripts natively into C++.