WeOCR is a platform for Web-enabled OCR (Optical Character Reader/Recognition) systems. It enables people to use character recognition over networks. A WeOCR server receives document images from users, recognizes text in the images, and returns recognition results to the users. WeOCR does not have its own character recognition engine. Instead, it is intended to accommodate various existing character recognition engines.
FuzzyIndex indexes text for performing fuzzy searches using PHP and SQLite. It can process a list of text strings and build a database that indexes snippets of those strings and the locations where they appear. The class can also search for given keywords and return the locations of the indexed strings where the best-matching text appears. It uses SQLite to store the indexed text database, but the class can be extended to use a different database type. Snippets are extracted from the indexed text by heuristics, which are implemented as separate classes that can be used interchangeably.
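The indexing idea described above can be sketched in a few lines. The following is not FuzzyIndex's code (which is PHP) and does not reproduce its heuristics: it assumes character trigrams as the snippet heuristic, and the names `trigrams`, `TrigramIndex`, `add`, and `search` are hypothetical, chosen only for illustration. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3
from collections import Counter


def trigrams(text):
    """Split text into overlapping, lowercased 3-character snippets."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class TrigramIndex:
    """Hypothetical snippet index: maps each trigram to the strings containing it."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS snippets "
            "(gram TEXT, doc_id INTEGER, pos INTEGER)"
        )
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS idx_gram ON snippets (gram)"
        )

    def add(self, doc_id, text):
        # Store every snippet together with the string and offset it came from.
        rows = [(gram, doc_id, pos) for pos, gram in enumerate(trigrams(text))]
        self.db.executemany("INSERT INTO snippets VALUES (?, ?, ?)", rows)
        self.db.commit()

    def search(self, keyword, limit=5):
        # Score each indexed string by how many distinct query trigrams it shares.
        scores = Counter()
        for gram in set(trigrams(keyword)):
            query = "SELECT DISTINCT doc_id FROM snippets WHERE gram = ?"
            for (doc_id,) in self.db.execute(query, (gram,)):
                scores[doc_id] += 1
        return scores.most_common(limit)


index = TrigramIndex()
index.add(1, "character recognition")
index.add(2, "parser generator")
results = index.search("recogniton")  # misspelled on purpose
```

Because matching is done on snippets instead of whole words, the misspelled query still shares most of its trigrams with the first string and ranks it highest. Swapping the `trigrams` function for another splitter would correspond loosely to the interchangeable heuristic classes the description mentions.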
lhs2TeX is a literate programming tool. It is implemented as a preprocessor that generates LaTeX code from literate Haskell sources. It provides several styles for formatting code. You can choose whether operators are rendered as mathematical symbols or as ASCII approximations, and whether keywords are highlighted. You can also adjust the formatting of tokens you define yourself. Preprocessor-style conditionals are supported, and Haskell can be used to generate parts of the document.
Catdoc is an MS Word file decoding tool that doesn't attempt to analyze file formatting (it just extracts readable text), but is able to handle all versions of Word and convert character encodings. It can also read RTF files and convert Excel and PowerPoint files. A Tcl/Tk graphical viewer is also included.
PyBison is a sophisticated yet easy-to-use parser creation toolkit for Python that interfaces directly to Bison (yacc)-based parsers. It provides full LALR(1) grammar support, covering everything from simple parsing tasks to writing compilers for high-level languages. Parser code is automatically generated from rules within user-created Parser classes (written in Python), then compiled, yacc'ed, and linked into a shared library, which is loaded into the running process. All this happens automatically. When the parser runs, it connects directly with the yyparse() routine and receives event callbacks as parse targets are reached.
Libxslt is a C library for GNOME which allows developers to work with XSLT. It is based on libxml for XML parsing, tree manipulation, and XPath support. Also included is 'xsltproc', a command line XSLT processor. The library is written in plain C, making as few assumptions as possible, and sticking closely to ANSI C/POSIX for easy embedding. It should work on Linux, Unix, and Windows. Though not designed primarily with performance in mind, libxslt seems to be a relatively fast processor. It also includes full support for the EXSLT set of extension functions as well as some common extensions present in other XSLT engines.
FIGlet is a program for making large letters out of ordinary text. It prints its input using large characters made up of ordinary screen characters. FIGlet output is generally reminiscent of the sort of "signatures" many people like to put at the end of email and Usenet messages. It is also reminiscent of the output of some banner programs, although it is oriented normally, not sideways.
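To illustrate what "large characters made up of ordinary screen characters" means, here is a minimal sketch in Python. It is not FIGlet's implementation and does not read FIGlet font files: the three-row `FONT` table and the `render` function are hypothetical, and the table covers only the letters needed for the example.

```python
# Hypothetical 3-row font covering a few characters only;
# FIGlet itself loads complete fonts from font files.
FONT = {
    "H": ["# #", "###", "# #"],
    "I": ["###", " # ", "###"],
    " ": ["   ", "   ", "   "],
}


def render(text, font=FONT):
    """Render text as rows of large letters, oriented normally (left to right)."""
    height = len(font[" "])
    lines = []
    for row in range(height):
        # Take the same row from every glyph and join them side by side.
        # Characters missing from the font fall back to a blank glyph.
        glyph_rows = [font.get(ch, font[" "])[row] for ch in text.upper()]
        lines.append(" ".join(glyph_rows).rstrip())
    return "\n".join(lines)


banner = render("hi")
print(banner)
```

Each output line is built by concatenating one row from every glyph, which is why the result reads left to right instead of sideways like some banner programs.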