1820 projects tagged "Text Processing"

Download Website Updated 02 Apr 2006 Glimpse

Pop 52.99
Vit 2.78

Glimpse is an indexing and querying system that lets you search through all your files quickly. It can be used by individuals for their personal file systems as well as by organizations for large data collections. Glimpse is the default search engine in Harvest.

Download Website Updated 30 Mar 2012 GNU awk

Pop 506.04
Vit 10.51

The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs with just a few lines of code.

Download Website Updated 12 Mar 2011 GNU m4

Pop 427.27
Vit 7.29

GNU m4 is an implementation of the traditional Unix macro processor. It is mostly SVR4 compatible, although it has some extensions (for example, handling more than 9 positional parameters to macros). GNU m4 also has built-in functions for including files, running shell commands, doing arithmetic, etc. Autoconf needs GNU m4 for generating `configure' scripts, but not for running them.

Download Website Updated 27 Mar 2013 GNU TeXmacs

Pop 1,119.80
Vit 132.02

GNU TeXmacs is a free wysiwyw (what you see is what you want) editing platform with special features for scientists. The software aims to provide a unified and user-friendly framework for editing structured documents with different types of content: text, mathematics, graphics, and interactive content. TeXmacs can also be used as an interface to many external systems for computer algebra, numerical analysis, and statistics. Users can write new presentation styles and add new features to the editor using Scheme.

No download Website Updated 19 Dec 2006 gocr

Pop 153.80
Vit 4.30

GOCR is optical character recognition software. It converts PNM files into ASCII files.

Download Website Updated 19 Sep 2004 GPP

Pop 224.37
Vit 4.15

GPP is a general-purpose preprocessor with customizable syntax, suitable for a wide range of preprocessing tasks. Its independence from any programming language makes it much more versatile than cpp, while its syntax is lighter and more flexible than that of m4. The syntax is fully customizable, which makes it possible to process text files, HTML, or source code equally efficiently in a variety of languages.

Download Website Updated 29 Jan 2010 Groff

Pop 322.43
Vit 5.70

The Groff package contains the traditional UN*X text formatting tools troff, nroff, tbl, eqn, and pic. These utilities, together with the man package, are essential for displaying the online manual pages. Output can be produced in a number of formats including plain ASCII and PostScript. All the standard macro packages are supported. A number of other utilities are also included together with several fonts.

Download Website Updated 05 Dec 2001 Grok

Pop 52.84
Vit 2.38

Grok is a library of Java components for performing various natural language tasks. These include several preprocessing tasks, chart parsing, a large categorial grammar for English (induced from the Penn Treebank), and some knowledge representation components (basic coreference, salience tracking, etc.). The library also has a companion kit which provides a GUI to the components, several of which are implementations of interfaces in the Quipu OpenNLP API.

Download No website Updated 16 Mar 2004 gtkkanjipad

Pop 35.27
Vit 1.19

gtkkanjipad is a GTK widget for handwriting recognition of Japanese (kanji) and, to a limited extent, Chinese (hanzi). It is mostly based on Owen Taylor's KanjiPad, and includes Perl bindings.

Download Website Updated 08 Feb 2001 Guava

Pop 38.30
Vit 2.57

The Guava tools are a set of Perl scripts for HTML pre-processing. You can create multi-page documents with tables of contents, or use templates to give a consistent look to a set of pages. All output is passed through the C preprocessor, so you can use directives such as #include, #define, and #if. There are also built-in macros for producing dates, cross references, etc.
