1159 projects tagged "Text Processing"

TCPDF
Updated 21 Apr 2014
Pop 1,612.47
Vit 2,659.65

TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports all ISO page formats and custom page formats, custom margins and units of measure, UTF-8 Unicode, RTL languages, HTML, barcodes, TrueTypeUnicode, TrueType, OpenType, Type1, and CID-0 fonts, images, graphic functions, clipping, bookmarks, JavaScript, forms, page compression, digital signatures, and encryption.

Asymptote
Updated 20 Apr 2014
Pop 725.50
Vit 451.06

Asymptote is a powerful descriptive 2D and 3D vector graphics language for technical drawing, inspired by MetaPost but with an improved C++-like syntax. It provides for figures the same high-quality level of typesetting that LaTeX does for scientific text. Asymptote is a programming language as opposed to just a graphics program. It can exploit the best features of script (command-driven) and graphical user interface (GUI) methods. High-level graphics commands are implemented in the language itself, allowing them to be easily tailored to specific applications.

OpenGrok
Updated 10 Apr 2014
Pop 374.41
Vit 50.38

OpenGrok is a fast and usable source code search and cross-reference engine. It helps you search, cross-reference, and navigate your source tree. It understands various program file formats and the histories kept by version control systems such as Mercurial, Bazaar, Git, ClearCase, Perforce, SCCS, RCS, CVS, and Subversion. In other words, it lets you grok (profoundly understand) the source.

Highlight
Updated 09 Apr 2014
Pop 930.19
Vit 197.79

Highlight is a universal converter from source code to HTML, XHTML, RTF, TeX, LaTeX, SVG, BBCode, and terminal escape sequences. (X)HTML and SVG output are formatted by Cascading Style Sheets. It supports more than 170 programming languages, and includes 80 highlighting color themes. The configuration files are Lua scripts with plug-in support. The converter includes some features to provide a consistent layout of the output code.

Docx to Text Converter (docx2txt)
Updated 07 Apr 2014
Pop 206.11
Vit 32.86

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, preserving some formatting and document information (which MS text conversion drops) and applying appropriate character conversions for a good (ASCII) text experience. It is a platform-independent solution consisting of (core) Perl and (wrapper) Unix/Windows shell scripts, plus a configuration file that controls the appearance of the output text to a fair extent. It can conveniently be used to build a Web-based docx document conversion service. Some Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers like CakeCmd that can deal with corrupt Zip archives, this tool can in many cases extract text from corrupt docx documents that the MS word processor fails to even open.

Vrapper
Updated 06 Apr 2014
Pop 305.80
Vit 54.45

Vrapper is an Eclipse plugin which acts as a wrapper for Eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Unlike other plugins which embed Vim in Eclipse, Vrapper imitates the behavior of Vim while still using whatever editor you have opened in the workbench. The goal is to have the comfort and ease which comes with the different modes, complex commands, and count/operator/motion combinations which are the key features behind editing with Vim, while preserving the powerful features of the different Eclipse text editors, like code generation and refactoring.

TXR
Updated 05 Apr 2014
Pop 720.53
Vit 84.11

TXR is a new data munging language. TXR's special pattern language provides template-based matching of entire documents or large sections of documents. It also contains a language for functional and imperative programming. It is written in C and takes the form of a utility that is portable to Unix-like platforms and Windows.

Terrier
Updated 04 Apr 2014
Pop 213.33
Vit 40.27

Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.

PCRE
Updated 04 Apr 2014
Pop 875.21
Vit 119.73

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5, with just a few differences. PCRE is used by many programs, including Exim, Postfix, and PHP.

Template Data Interface (TDI)
Updated 02 Apr 2014 (no download, no website)
Pop 90.62
Vit 9.98

Template Data Interface (TDI, /ʹtedɪ/) is a markup templating system written in Python with (optional but recommended) speedup code written in C. Unlike most templating systems, TDI does not invent its own language to provide functionality. Instead, you simply mark the nodes you want to manipulate within the template document. The template is parsed, and the marked nodes are presented to your Python code, where they can be modified in any way you want.

Project Spotlight

Tsung

A distributed multi-protocol load testing tool.

Webalizer Xtended

A Web server log analysis program, forked from Webalizer.