36 projects tagged "Text Processing"

colour_trace (updated 19 Oct 2002)

Pop 43.61
Vit 1.87

colour_trace adds ANSI escape sequences to a developer's trace messages that contain hex printouts of pointer or reference values, so that each value gets its own colour and you can quickly see whether two hex values are the same.
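
The idea can be illustrated with a minimal Python sketch. This is not colour_trace's own code: the `0x`-prefixed pattern, the six-colour palette, and the names `colour_for` and `colourise` are all assumptions made for illustration.

```python
import re
import zlib

# Assumption: a "hex printout" is any 0x-prefixed run of hex digits.
HEX_VALUE = re.compile(r"0x[0-9a-fA-F]+")

# Six basic ANSI foreground colours (red through cyan).
ANSI_COLOURS = [31, 32, 33, 34, 35, 36]


def colour_for(value):
    """Pick a colour deterministically from the text of the hex value."""
    checksum = zlib.crc32(value.lower().encode("ascii"))
    return ANSI_COLOURS[checksum % len(ANSI_COLOURS)]


def colourise(line):
    """Wrap every hex value in the line in an ANSI colour escape sequence."""
    def wrap(match):
        value = match.group(0)
        return f"\x1b[{colour_for(value)}m{value}\x1b[0m"

    return HEX_VALUE.sub(wrap, line)


if __name__ == "__main__":
    print(colourise("alloc 0x7ffd1c20 ... free 0x7ffd1c20 ... alloc 0x7ffd1c80"))
```

Because the colour is derived from the value itself, equal values always share a colour. With only six colours, different values can collide, so this sketch does not guarantee one colour per value; a larger palette would be needed for that.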

dtd2latex (updated 30 Jan 2001)

Pop 23.00
Vit 1.44

dtd2latex is a quick hack which converts a commented XML DTD into LaTeX source for printing.

htp (updated 23 Aug 2001)

Pop 15.68
Vit 1.46

htp is a preprocessor for HTML. Pages can be written with HTML-like macros, which htp expands. This makes it easy to maintain a consistent look across a collection of Web pages.

qmail-smtpd-auth (updated 21 May 2002)

Pop 67.48
Vit 2.67

qmail-smtpd-auth is a patch for qmail that enables it to support the SMTP AUTH protocol with the following auth types: LOGIN, PLAIN, and CRAM-MD5. It is based on a previous patch by Mr. Brisby that implemented the LOGIN type. This version is enhanced and makes it easy to add new auth methods.

SmartHTML (updated 30 Jan 2001)

Pop 19.49
Vit 1.43

SmartHTML is yet another HTML preprocessor. It allows you to write Web pages in a language more sane than HTML: paragraphs are automatically inserted when two consecutive newlines are encountered, and special symbols such as < and & are automatically replaced. Tags look like texinfo: @tagname { contents }. Written in Perl, it allows tags to run Perl subroutines on the contents, which enables, for example, automatic generation of a table of contents, smarter generation of links, and macros that define HTML code which then has to be maintained in only one place.
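
The two automatic steps described above (paragraph insertion and symbol replacement) can be sketched in a few lines of Python. SmartHTML itself is written in Perl, so this is only an illustration of the technique; the function name `to_html` is hypothetical, and the `@tagname { contents }` syntax is not handled here.

```python
import html
import re


def to_html(text):
    """Escape special symbols and wrap blank-line-separated blocks in <p> tags."""
    paragraphs = []
    # Two or more consecutive newlines mark a paragraph boundary.
    for block in re.split(r"\n{2,}", text.strip()):
        block = block.strip()
        if block:
            # html.escape replaces &, < and > with their entities.
            paragraphs.append("<p>" + html.escape(block, quote=False) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(to_html("Use a < b here.\n\nFish & chips."))
```

Escaping happens per paragraph before the `<p>` tags are added, so the generated markup is never escaped by accident.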

yawl (updated 13 May 2008)

Pop 84.99
Vit 3.52

This is a comprehensive "word game" word list for UNIX/Linux. It is a superset of the author's ENABLE list, the "OSW", and various lists researched by the author's colleague, Alan Beale. At 264,093 words, it is the largest list of its kind, suitable for all manner of crossword-type board games and word construction games, as well as for use as a spell checker dictionary. The YAWL package now includes two anagramming utilities (supplied as source code and built by the included Makefile). There is also a shell script that extends the UNIX "strings" system command. This is the word list package recommended for the author's Quackey word game.

pee-portal (updated 07 Mar 2001)

Pop 10.39
Vit 69.26

pee-portal reads RDF files from a directory and displays the result on a Web page. It is written in PET (Perl Embedded Template), a template syntax for PEE (Perl Embedding Engine). This is not intended for production use, just as a demonstration of PET syntax.

Functional XML Parsing Framework (updated 16 Sep 2004, no download)

Pop 50.08
Vit 2.02

The Functional XML Parsing Framework is a package of low-to-high-level lexing and parsing procedures that can be combined to yield SAX, DOM, or validating parsers, or a parser intended for a particular document type. The procedures in the package can be used separately to tokenize or parse various pieces of XML documents. The package supports XML namespaces; character, internal, and external parsed entities; xml:space; attribute value normalization; processing instructions; and CDATA sections. It is intended to be a framework, a set of "Lego blocks" you can use to build a parser that follows DOM, SAX, or another discipline, and performs validation to any degree. As an example of such parser construction, the package includes a semi-validating SXML parser. It converts XML to SXML, an instance of the XML Infoset as S-expressions, that is, an abstract syntax tree of an XML document. SXML can be queried (in an XPath style), transformed, and evaluated. The framework parses XML in a pure functional style, as a fold over a text XML document considered as a spread-out tree. The input port is treated as a linear, read-once parameter. The framework's code does not use assignments at all.

demoroniser (updated 31 Aug 2001)

Pop 36.18
Vit 1.00

Demoroniser is a Perl script which attempts to fix the gratuitously incompatible HTML generated by Microsoft applications. Many Microsoft programs use an 'enhanced' version of Latin-1 with extra characters like quotation marks and dashes. Sometimes people paste these characters into supposedly ASCII or Latin-1 web pages, resulting in pages that don't display properly on non-MS platforms. Demoroniser replaces these MS characters with standard ASCII equivalents. It also fixes up wrongly nested tags generated by HTML export in some MS applications.
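
The character replacement step can be sketched as follows. Demoroniser itself is a Perl script, so this Python version is only an illustration; the replacement table is a small assumed subset, and the name `to_ascii_punctuation` is hypothetical. The nested-tag repair is not covered.

```python
# Characters from the 0x80-0x9F range of Windows-1252, written as the Unicode
# code points they decode to, mapped to plain ASCII stand-ins.
# Assumption: this subset is illustrative, not Demoroniser's actual table.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark (byte 0x91)
    "\u2019": "'",    # right single quotation mark (byte 0x92)
    "\u201c": '"',    # left double quotation mark (byte 0x93)
    "\u201d": '"',    # right double quotation mark (byte 0x94)
    "\u2013": "-",    # en dash (byte 0x96)
    "\u2014": "--",   # em dash (byte 0x97)
    "\u2026": "...",  # horizontal ellipsis (byte 0x85)
}
TABLE = str.maketrans(REPLACEMENTS)


def to_ascii_punctuation(raw):
    """Decode Windows-1252 bytes and replace its extra punctuation with ASCII."""
    text = raw.decode("cp1252", errors="replace")
    return text.translate(TABLE)


if __name__ == "__main__":
    print(to_ascii_punctuation(b"\x93Hello\x94 \x96 it\x92s fine\x85"))
```

Decoding as Windows-1252 first, rather than as Latin-1, is what makes the extra characters visible as distinct code points that can then be mapped.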

gelapas (updated 24 Oct 2001)

Pop 19.75
Vit 1.00

gelapas crawls the file tree and extracts information from files. The default settings (and the shorthand options) are useful for extracting information such as the title or meta tags from HTML files, but it could also be used for other kinds of documents.
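
A minimal Python sketch of the crawl-and-extract idea follows. It is not based on gelapas's code or options: the names `TitleParser` and `collect_titles`, the `.html`/`.htm` filter, and the UTF-8 decoding are assumptions for illustration, and only the title is extracted.

```python
import os
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def collect_titles(root):
    """Walk the tree under root and map each HTML file path to its title."""
    titles = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                parser = TitleParser()
                # Assumption: files are UTF-8; undecodable bytes are replaced.
                with open(path, encoding="utf-8", errors="replace") as handle:
                    parser.feed(handle.read())
                titles[path] = parser.title.strip()
    return titles


if __name__ == "__main__":
    for path, title in sorted(collect_titles(".").items()):
        print(f"{path}: {title}")
```

Handling other document types would mean swapping the parser class while keeping the same tree walk.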

Project Spotlight

Highlight

A universal source code to formatted text converter.

Project Spotlight

libdvbpsi

A library designed for MPEG TS and DVB PSI tables decoding and generation.