840 projects tagged "Text Processing"

TCPDF (updated 10 Apr 2014)

Pop 1,551.17
Vit 570.28

TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports all ISO page formats and custom page formats, custom margins and units of measure, UTF-8 Unicode, RTL languages, HTML, barcodes, TrueTypeUnicode, TrueType, OpenType, Type1, and CID-0 fonts, images, graphic functions, clipping, bookmarks, JavaScript, forms, page compression, digital signatures, and encryption.

AutoLaTeX (updated 08 Apr 2014)

Pop 641.64
Vit 143.82

AutoLaTeX is a tool for managing small to large LaTeX documents. It detects which files are used to build the document (included TeX files, BibTeX, figures, etc.), and launches the various tools (latex, bibtex, makeindex) when the source files have been changed. It provides translation rules which automatically generate figures in EPS, PNG, or PDF formats from different types of sources (dia, xfig, svg, astah, source code, etc.). AutoLaTeX also provides graphical user interfaces, a plugin for the editors Gedit and Sublime Text, and a standalone Gtk application.

Docx to Text Converter (docx2txt) (updated 07 Apr 2014)

Pop 190.37
Vit 42.02

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, preserving some formatting and document information (which MS text conversion drops) along with appropriate character conversions for a good (ASCII) text experience. It is a platform-independent solution consisting of (core) Perl and (wrapper) Unix/Windows shell scripts and a configuration file to control the output text appearance to a fair extent. It can conveniently be used to build a Web-based docx document conversion service. Some Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers like CakeCmd that can deal with corrupt Zip archives, this tool can in many cases extract text from corrupt docx documents that the MS word processor fails to even open.
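
As a rough illustration of the kind of extraction described above (this is not docx2txt's own Perl code), a .docx file is a Zip archive whose main text is stored in the part `word/document.xml`. The Python sketch below pulls the paragraph text out of that part. The function name `docx_to_text` is hypothetical, and the sketch ignores formatting, lists, hyperlinks, and character conversions, all of which the real tool handles.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the plain text of a .docx file, one line per paragraph.

    `source` is a file path or a binary file-like object.
    Hypothetical helper for illustration; not part of docx2txt.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    lines = []
    # Each <w:p> element is a paragraph; its text is split across <w:t> runs.
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        lines.append("".join(runs))
    return "\n".join(lines)
```

Calling `docx_to_text("report.docx")` (the file name is a placeholder) would return the document body as a single string. A corrupt archive raises `zipfile.BadZipFile` here, which is the case where a more tolerant unzipper is needed.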

PCRE (updated 04 Apr 2014)

Pop 854.76
Vit 145.62

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5, with just a few differences. PCRE is used by many programs, including Exim, Postfix, and PHP.
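
To give a feel for the Perl-style syntax involved, here is a small sketch in Python. Python's built-in `re` module is not PCRE, but the constructs used below (character classes, a lazy quantifier, a lookahead, and the `(?P<name>...)` named-group form) are ones PCRE accepts as well. The function name `parse_pairs` and the `key=value;` input format are made up for the example.

```python
import re

# Perl-style constructs: \w and \d classes, a lazy quantifier (+?),
# named groups written (?P<name>...), and a lookahead (?=...).
PATTERN = re.compile(r"(?P<key>\w+?)=(?P<value>\d+)(?=;|$)")


def parse_pairs(text):
    """Return {key: int(value)} for each key=digits pair that is
    followed by ';' or by the end of the string."""
    return {m.group("key"): int(m.group("value")) for m in PATTERN.finditer(text)}
```

With this pattern, `parse_pairs("width=80;height=24")` yields both pairs, while a value such as `8bit` is skipped because the lookahead requires the digits to be followed by `;` or the end of the input.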

Template Data Interface (TDI) (updated 02 Apr 2014; no download, no website)

Pop 87.77
Vit 11.80

Template Data Interface (TDI, /ʹtedɪ/) is a markup templating system written in Python with (optional but recommended) speedup code written in C. Unlike most templating systems, TDI does not invent its own language to provide functionality. Instead, you simply mark the nodes you want to manipulate within the template document. The template is parsed, and the marked nodes are presented to your Python code, where they can be modified in any way you want.
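
The idea of marking nodes and handing them to Python code can be sketched with the standard library alone. The following is not TDI's API: the `mark` attribute, the `render_<name>` method convention, and the `Model` class are all assumptions chosen for this illustration, and the sketch requires the template to be well-formed XML.

```python
import xml.etree.ElementTree as ET

MARK = "mark"  # hypothetical marker attribute; TDI's real syntax may differ


class Model:
    """Hypothetical model with one method per marked node name."""

    def render_title(self, node):
        node.text = "Text Processing"

    def render_item(self, node):
        node.text = (node.text or "").upper()


def render(template, model):
    """Parse the template, pass every marked node to the model,
    and serialize the modified tree back to a string."""
    root = ET.fromstring(template)
    for node in root.iter():
        # Remove the marker so it does not appear in the output.
        name = node.attrib.pop(MARK, None)
        if name is not None:
            getattr(model, "render_" + name)(node)
    return ET.tostring(root, encoding="unicode")
```

The template itself contains no logic, only markers: for example, `<h1 mark="title">placeholder</h1>` is passed to `Model.render_title`, and unmarked nodes are left untouched.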

Barcode Writer in Pure Postscript (updated 02 Apr 2014)

Pop 1,035.70
Vit 28.23

Barcode Writer in Pure Postscript implements the printing of many barcode formats entirely within PostScript, so that the process of converting the input string into the printed output is performed by the printer or print system. The project supports all major barcode formats including: EAN-13 (JAN-13), EAN-8 (JAN-8), UPC-A, UPC-E, EAN-5 & EAN-2 (EAN/UPC add-ons), ISBN (including legacy ISBN), ISMN (including legacy ISMN), ISSN, Code 128 (A, B & C), GS1-128, SSCC-18 (EAN-18, NVE), EAN-14, Code 39, Code 39 Extended, Code 93, Code 93 Extended, Code 32 (Italian Pharmacode), Pharmazentralnummer (PZN), Interleaved 2 of 5, ITF-14 (UPC SCS), GS1 DataBar (Omnidirectional, Stacked, Stacked Omnidirectional, Limited, Expanded, Expanded Stacked), Code 2 of 5 (Industrial, IATA, Matrix, Datalogic & COOP), Code 11 (USD-8), BC412, Codabar (NW-7), Pharmacode (including two-track), MSI, Plessey, Telepen, Channel Code, PosiCode, PDF417, Data Matrix (ECC200), QR Code (including Micro QR Code), and more.
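
One small step in turning an input string into a barcode is computing the check digit. As an illustration for EAN-13 only, the Python sketch below applies the standard weighting (1 and 3 alternating from the left over the first twelve digits). The project does this work in PostScript; the function name `ean13_check_digit` is hypothetical and is not taken from it.

```python
def ean13_check_digit(first12):
    """Return the EAN-13 check digit for a string of 12 decimal digits."""
    if len(first12) != 12 or any(c not in "0123456789" for c in first12):
        raise ValueError("expected exactly 12 decimal digits")
    # Digits at even 0-based positions weigh 1, those at odd positions weigh 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10
```

For the input `400638133393` the weighted sum is 89, so the check digit is 1 and the full code is `4006381333931`.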

doclifter (updated 19 Mar 2014)

Pop 338.76
Vit 59.30

doclifter helps with lifting documents with nroff markup to XML-DocBook. Lifting documents from presentation level to semantic level is hard, and a really good job requires human polishing. This tool aims to do everything that can be mechanized, and to preserve any troff-level information that might have structural implications in XML comments. TBL tables are translated into DocBook table markup, PIC into SVG, and EQN into MathML (relying on pic2svg and GNU eqn for the last two).

psx (updated 15 Mar 2014)

Pop 200.85
Vit 40.11

PSX is a PHP framework for creating RESTful APIs. It helps you to build clean URLs serving Web standard formats like JSON, XML, Atom, and RSS. It includes a handler system that abstracts away SQL queries from domain logic, a routing system that executes the correct controller method based on the location of the controller and the method annotation, and a flexible data system that converts database records into formats like JSON, XML, Atom, and RSS. A lightweight DI container handles dependencies. The controller supports request and response filters that can modify the HTTP request or response, and filters are provided for Basic and OAuth authentication.

htmLawed (updated 11 Mar 2014)

Pop 243.01
Vit 40.62

htmLawed is a PHP script that makes input text more secure, HTML standards-compliant, and generally suitable, from the viewpoint of a Web-page administrator, for use in the body of HTML 4 or XHTML 1 or 1.1 documents. It is a customizable HTML/XHTML filter, processor, purifier, and sanitizer. It can ensure that HTML tags are balanced and properly nested, neutralize code that may be used for cross-site scripting (XSS) attacks, and limit the allowed HTML elements, tags, attributes, or URL protocols.

jcpp (updated 14 Jan 2014; no download)

Pop 142.68
Vit 16.63

JCPP is a complete, compliant, standalone, pure Java implementation of the C preprocessor. It is intended to be of use to people writing C-style compilers in Java using tools like sablecc, antlr, JLex, CUP, and so forth. It has been used to successfully preprocess much of the source code of the GNU C library.
