871 projects tagged "Text Processing"

Vilistextum (updated 11 Apr 2014)

Pop 76.99
Vit 4.94

Vilistextum is a small and fast HTML to text converter. It is quite fault-tolerant and deals well with badly formed or otherwise quirky HTML. It has full support for different character sets (e.g. Unicode). It can optimize output for ebook reading, collapse multiple blank lines, and create footnotes out of links. A GUI frontend using kaptain is included.
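
To make two of the features above concrete (footnotes from links, collapsing blank lines), here is a minimal Python sketch using only the standard library. It is not Vilistextum's code or algorithm; the class name `TextWithFootnotes`, the function `html_to_text`, and the choice of block-level tags are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser


class TextWithFootnotes(HTMLParser):
    """Collect visible text and turn each <a href> into a numbered footnote."""

    # Hypothetical, incomplete set of tags treated as paragraph breaks.
    BLOCK_TAGS = {"p", "div", "br", "h1", "h2", "h3", "li", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # text fragments in document order
        self.links = []      # footnote targets, in order of appearance
        self.pending = None  # footnote number of the currently open link

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
                self.pending = len(self.links)

    def handle_endtag(self, tag):
        # Emit the footnote marker right after the link text.
        if tag == "a" and self.pending is not None:
            self.parts.append(f"[{self.pending}]")
            self.pending = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextWithFootnotes()
    parser.feed(html)
    parser.close()  # flush text that follows an unclosed tag
    text = "".join(parser.parts)
    text = re.sub(r"[ \t]+", " ", text)               # squeeze runs of spaces
    text = re.sub(r"\n\s*\n+", "\n\n", text).strip()  # collapse blank lines
    notes = "\n".join(f"[{i}] {url}" for i, url in enumerate(parser.links, 1))
    return text + ("\n\n" + notes if notes else "")
```

`html.parser` tolerates unclosed tags, so input such as `<p>Bye<p>unclosed` still yields two paragraphs. A real converter would also need to handle character-set detection, tables, and lists, which this sketch ignores.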

Highlight (updated 09 Apr 2014)

Pop 1,038.59
Vit 267.33

Highlight is a universal converter from source code to HTML, XHTML, RTF, TeX, LaTeX, SVG, BBCode, and terminal escape sequences. (X)HTML and SVG output are formatted by Cascading Style Sheets. It supports more than 170 programming languages, and includes 80 highlighting color themes. The configuration files are Lua scripts with plug-in support. The converter includes some features to provide a consistent layout of the output code.

Verbiste (updated 07 Apr 2014)

Pop 338.55
Vit 128.50

Verbiste is a French conjugation system implemented as a C++ library, a GNOME applet, and two command-line tools. It can conjugate verbs and analyze conjugated verbs to determine their mood, tense, and person. The knowledge base contains over 6700 verbs.

PCRE (updated 04 Apr 2014)

Pop 854.76
Vit 145.62

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5, with just a few differences. PCRE is used by many programs, including Exim, Postfix, and PHP.

gjots (updated 01 Apr 2014)

Pop 465.03
Vit 98.76

gjots lets you organize text notes in a convenient, hierarchical way. It can be used for notes, jottings, bits and pieces, recipes, and even PINs and passwords, using encryption. It can also be used to "mind-map" larger compositions such as manuals, Web pages, and articles. It is a bit like the KDE program "kjots", but uses the GTK library and supports a hierarchy of folders. Files can be output to HTML with an automatic table of contents or to DocBook XML. Encryption is supported with ccrypt(1), gpg(1), and openssl(1), so that musings can be kept private.

doclifter (updated 19 Mar 2014)

Pop 338.76
Vit 59.30

doclifter helps lift documents with nroff markup to DocBook XML. Lifting documents from presentation level to semantic level is hard, and a really good job requires human polishing. This tool aims to do everything that can be mechanized, and to preserve, in XML comments, any troff-level information that might have structural implications. TBL tables are translated into DocBook table markup, PIC into SVG, and EQN into MathML (relying on pic2svg and GNU eqn for the last two).

Java Serialization to XML (updated 06 Mar 2014; no download link)

Pop 598.67
Vit 93.29

JSX serializes Java objects to XML. You can persist objects, evolve them, and send them over the network and between applications. Your object data becomes human-readable and human-writable. You can test it, search it, profile it, audit it, and edit it with ordinary text and XML tools. JSX handles all POJOs and also all classes that require Java's own object serialization.

DocBook Doclet (updated 28 Feb 2014)

Pop 756.12
Vit 103.02

DocBook Doclet is a Javadoc doclet that creates DocBook XML and UML class diagrams from Javadoc.

loook (updated 23 Feb 2014)

Pop 312.03
Vit 19.49

Loook searches for text strings in OpenOffice.org (and StarOffice 6.0 or later) files. AND, OR, and phrase searches are supported. It doesn't create an index, but searching should be fast enough unless you have a very large number of files.
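
As a rough illustration of index-free searching with AND, OR, and phrase modes, the Python sketch below scans each file on every query. It is not Loook's implementation. It assumes each document is a ZIP archive whose text lives in a `content.xml` member, and the function names (`document_text`, `matches`, `search`) are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile


def document_text(path):
    """Return the text of content.xml with all markup dropped (crude)."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # Joining with spaces keeps paragraphs apart, but may also split a word
    # that is formatted in pieces; acceptable for a sketch.
    return " ".join(root.itertext())


def matches(text, terms, mode="and"):
    """Case-insensitive AND / OR / phrase test against plain text."""
    haystack = " ".join(text.lower().split())  # normalize whitespace
    needles = [term.lower() for term in terms]
    if mode == "phrase":
        return " ".join(needles) in haystack
    hits = (needle in haystack for needle in needles)
    return all(hits) if mode == "and" else any(hits)


def search(paths, terms, mode="and"):
    """Scan every file on each query; no index is built."""
    return [p for p in paths if matches(document_text(p), terms, mode)]
```

For example, `search(paths, ["brown", "fox"], mode="phrase")` returns only the files that contain the two words next to each other, in that order. Because every query re-reads every archive, the cost grows linearly with the number of files, which is the trade-off the description above points to.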

iText (updated 17 Feb 2014; no download link)

Pop 759.04
Vit 71.76

iText is a library that contains classes to generate and manipulate documents in the Portable Document Format (PDF). Document manipulation includes splitting, merging, and filling out forms (AcroForms, static and dynamic XFA forms).
