236 projects tagged "Text Processing"
GNU Aspell is a spell checker designed to eventually replace Ispell. It can be used either as a library or as an independent spell checker. Its main feature is that it does a better job of suggesting possible replacements for a misspelled word than just about any other spell checker for the English language. Unlike Ispell, Aspell can also easily check documents in UTF-8 without having to use a special dictionary. Aspell will also do its best to respect the current locale setting. Other advantages over Ispell include support for using multiple dictionaries at once and intelligent handling of personal dictionaries when more than one Aspell process is running.
Ciao is a complete Prolog system subsuming ISO-Prolog with a novel modular design which allows both restricting and extending the language. Ciao extensions currently include feature terms (records), higher-order, functions, constraints, objects, persistent predicates, a good base for distributed execution (agents), and concurrency. Libraries also support WWW programming, sockets, and external interfaces (C, Java, Tcl/Tk, relational databases, etc.). An Emacs-based environment, a stand-alone compiler, and a toplevel shell are also provided.
FriBidi is a free implementation of the Unicode Bidirectional (BiDi) Algorithm. It also provides utility functions to aid in the development of interactive editors and widgets that implement BiDi functionality. The BiDi algorithm is a prerequisite for supporting right-to-left scripts such as Hebrew, Arabic, Syriac, and Thaana.
The fontutils package includes the programs bpltobzr, bzrto, charspace, fontconvert, gsrenderfont, imageto, imgrotate, limn, and xbfe. These create fonts for use with Ghostscript or TeX (starting with a scanned type image and converting the bitmaps to outlines), convert between font formats, etc. The package also includes the libraries libbzr.a, libgf.a, libpbm.a, libpk.a, libtfm.a, and libwidgets.a.
GPP is a general-purpose preprocessor with customizable syntax, suitable for a wide range of preprocessing tasks. Its independence from any programming language makes it much more versatile than cpp, while its syntax is lighter and more flexible than that of m4. The syntax is fully customizable, which makes it possible to process text files, HTML, or source code equally efficiently in a variety of languages.
Grok is a library of Java components for performing various natural language tasks. These include several preprocessing tasks, chart parsing, a large categorial grammar for English (induced from the Penn Treebank), and some knowledge representation components (basic coreference, salience tracking, etc.). The library also has a companion kit which provides a GUI for the components, several of which are implementations of interfaces in the Quipu OpenNLP API.
An AJAX Webmail script for an existing POP3/IMAP/SMTP server or cPanel.