1998 projects tagged "Text Processing"
The ALICE software implements AIML (Artificial Intelligence Markup Language), a non-standard, evolving markup language for creating chat robots. The primary design feature of AIML is minimalism. Compared with other chat robot languages, AIML is perhaps the simplest. The pattern matching language is very simple; for example, it permits only one wildcard ('*') match character per pattern. AIML is an XML language, implying that it obeys certain grammatical meta-rules. The choice of XML syntax permits integration with other tools such as XML editors. Another motivation for XML is its familiar look and feel, especially to people with HTML experience.
a2ps is an Any to PostScript filter. It processes plain text files, of course, but it also pretty-prints quite a few popular languages (66). Moreover, it can delegate the processing of some files to other filters (such as groff, texi2dvi, dvips, gzip, etc.), which allows a uniform treatment (n-up, page selection, etc.) of heterogeneous files.
AFT (Almost Free Text) is a document preparation system. It is mostly free form, meaning that there is little intrusive markup; AFT source documents look a lot like plain old ASCII text. It has a few rules for structuring your document, which have more to do with formatting your text than with embedding lots of commands. It produces many types of output (HTML, XHTML, LaTeX, roll-your-own XML, etc.); all that needs to be done is to edit a rule file. You can even customize your own rule files for specialized output.
ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing C++, Java, or Sather actions. It is similar to the popular compiler generator YACC; however, ANTLR is much more powerful and easier to use. ANTLR-produced parsers are not only highly efficient but also human-readable and human-debuggable (especially with the interactive ParseView debugging tool). ANTLR can generate parsers, lexers, and tree-parsers in C++, Java, or Sather. ANTLR is currently written in Java.
"Ball", the Byzantine Askemos Language Layer, is an intrusion-resistant, incorruptible, autonomous distributed operating system. It provides application programmers with continuations, messages, and rights management on top of a peer-to-peer network that resists Byzantine failures of network nodes. The API significantly raises the level of abstraction in comparison with other operating systems: there are very few system calls, and these are expressed in XML. An alternative understanding of Askemos is that of an XML object database with stored procedures.
GNU Aspell is a spell checker designed to eventually replace Ispell. It can be used either as a library or as an independent spell checker. Its main feature is that it does a better job of suggesting possible replacements for a misspelled word than just about any other spell checker available for the English language. Unlike Ispell, Aspell can also easily check documents in UTF-8 without having to use a special dictionary. Aspell will also do its best to respect the current locale setting. Other advantages over Ispell include support for using multiple dictionaries at once and intelligent handling of personal dictionaries when more than one Aspell process is open at the same time.
An AJAX Webmail script for an existing POP3/IMAP/SMTP server or cPanel.