61 projects tagged "Text Processing"

Download Website Updated 09 Apr 2014 Highlight

Pop 1,038.59
Vit 267.33

Highlight is a universal converter from source code to HTML, XHTML, RTF, TeX, LaTeX, SVG, BBCode, and terminal escape sequences. (X)HTML and SVG output are formatted by Cascading Style Sheets. It supports more than 170 programming languages, and includes 80 highlighting color themes. The configuration files are Lua scripts with plug-in support. The converter includes some features to provide a consistent layout of the output code.

Download Website Updated 07 Apr 2014 Docx to Text Converter (docx2txt)

Pop 190.37
Vit 42.02

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, preserving some formatting and document information (which MS text conversion drops) along with appropriate character conversions for a good (ASCII) text experience. It is a platform-independent solution consisting of (core) Perl and (wrapper) Unix/Windows shell scripts, plus a configuration file that controls the appearance of the output text to a fair extent. It can conveniently be used to build a Web-based docx document conversion service. Some Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers like CakeCmd that can deal with corrupt Zip archives, this tool can in many cases extract text from corrupt docx documents that the MS word processor fails to even open.
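
To illustrate the kind of extraction described above, here is a minimal Python sketch. It is not docx2txt's implementation (docx2txt is written in Perl) and it preserves none of the formatting that docx2txt keeps. It relies only on the fact that a .docx file is a Zip archive whose main body lives in word/document.xml; the function name docx_to_text is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_to_text(source):
    """Return plain text from a .docx given as a path or file-like object.

    Hypothetical helper for illustration only: one output line per
    paragraph (w:p), built from that paragraph's text runs (w:t).
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W + "p"):
        # Concatenate every text node inside this paragraph
        paragraphs.append("".join(node.text or "" for node in para.iter(W + "t")))
    return "\n".join(paragraphs)
```

A call such as docx_to_text("report.docx") would return the document body as a single string; the file name is only an example.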

Download Website Updated 06 Apr 2014 Vrapper

Pop 302.75
Vit 68.34

Vrapper is an Eclipse plugin which acts as a wrapper for Eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Unlike other plugins which embed Vim in Eclipse, Vrapper imitates the behavior of Vim while still using whatever editor you have opened in the workbench. The goal is to have the comfort and ease which comes with the different modes, complex commands, and count/operator/motion combinations which are the key features behind editing with Vim, while preserving the powerful features of the different Eclipse text editors, like code generation and refactoring.

Download Website Updated 22 Mar 2014 GNU Parallel

Pop 896.63
Vit 66.63

GNU parallel is a shell tool for executing jobs in parallel locally or using remote computers. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. If you use xargs today you will find GNU parallel very easy to use, as GNU parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU parallel as input for other programs.
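
The idea of running one job per input line while keeping results in input order can be sketched in Python. This is an illustration of the concept only, not GNU parallel itself or its command-line interface; run_in_parallel and count_words are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_parallel(job, inputs, max_workers=4):
    """Apply job to every input concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which job happens to finish first.
        return list(pool.map(job, inputs))

def count_words(line):
    # Hypothetical job: count the whitespace-separated words on one line
    return len(line.split())

lines = ["a list of files", "a list of hosts", "urls"]
results = run_in_parallel(count_words, lines)
```

Because the results keep the order of the inputs, they can be passed on to a later processing step just as sequential output would be.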

Download Website Updated 15 Mar 2014 psx

Pop 200.85
Vit 40.11

PSX is a PHP framework for creating RESTful APIs. It helps you to build clean URLs serving Web standard formats like JSON, XML, Atom, and RSS. It includes a handler system that abstracts away SQL queries from domain logic, a routing system that executes the correct controller method based on the location of the controller and the method annotation, and a flexible data system that converts database records into formats like JSON, XML, Atom, and RSS. A lightweight DI container handles dependencies. The controller supports request and response filters that can modify the HTTP request or response, and filters are provided for Basic and OAuth authentication.

Download Website Updated 04 Mar 2014 Sanzang

Pop 313.22
Vit 10.56

Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is very suitable for working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Any user can develop his or her own translation rules, and these rules are simply stored in a text file and applied at runtime.
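
As a rough illustration of rules kept in a text file and applied at runtime, the Python sketch below performs greedy longest-match substitution from a small rule table. The "source|target" line format, the function names, and the sample rules are all assumptions made for this example; they are not taken from Sanzang's documentation and may differ from how Sanzang actually stores or applies its rules.

```python
def load_rules(text):
    """Parse hypothetical 'source|target' rules, one rule per line."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split("|", 1)
        rules[source] = target
    return rules

def translate(text, rules):
    """Scan left to right, always replacing the longest matching source term."""
    max_len = max((len(term) for term in rules), default=0)
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                out.append(" " + rules[chunk] + " ")
                i += size
                break
        else:
            # No rule matched at this position: copy the character unchanged
            out.append(text[i])
            i += 1
    # Collapse the padding added around each translated term
    return " ".join("".join(out).split())

# Illustrative rules only; real tables would be loaded from a text file
rules = load_rules("如是|thus\n我聞|I have heard\n我|I")
translated = translate("如是我聞", rules)
```

Trying longer terms first means the two-character rule wins over the one-character rule that shares its first character.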

Download Website Updated 28 Feb 2014 DocBook Doclet

Pop 756.12
Vit 103.02

DocBook Doclet is a Javadoc doclet that creates DocBook XML and UML class diagrams from Javadoc.

No download Website Updated 08 Oct 2013 pyexpander

Pop 53.90
Vit 5.21

pyexpander is a macro processor based on Python. Instead of simple text replacement, it offers evaluation of arbitrary Python expressions and execution of Python code. Its syntax is simple: all expander commands start with a dollar sign ("$") followed by word characters, parameters, Python code enclosed in brackets, or a combination of these. The full power of the Python programming language can be used, including all operators, functions, and modules. Any Python expression can be used to insert text. It also provides a Python library that you can use to develop other macro tools based on pyexpander.
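
The difference between plain text replacement and expression evaluation can be shown with a toy expander. This sketch does not use pyexpander's library or reproduce its full syntax; it assumes a "$(expression)" form based on the description above, handles only expressions without nested parentheses, and the name expand is hypothetical.

```python
import re

# Matches "$(" ... ")" where the expression contains no parentheses itself
_EXPR = re.compile(r"\$\(([^()]*)\)")

def expand(template, variables=None):
    """Replace each $(expr) with the string value of the Python expression.

    Toy illustration only. eval() runs arbitrary code, so templates must
    come from a trusted source.
    """
    namespace = dict(variables or {})

    def substitute(match):
        return str(eval(match.group(1), namespace))

    return _EXPR.sub(substitute, template)

line = expand("width is $(width), doubled is $(width * 2)", {"width": 4})
```

Text outside the "$(...)" markers passes through unchanged, so ordinary documents remain valid templates.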

Download Website Updated 15 Sep 2013 Less

Pop 425.44
Vit 22.80

Less is a program similar to more, i.e. a terminal-based program for viewing text files and the output from other programs. Less offers many features that more lacks. For instance, it allows backward as well as forward movement through a file.

Download No website Updated 05 Sep 2013 Multibyte Keyword Generator

Pop 33.62
Vit 1.44

Multi-byte Keyword Generator extracts meta keywords from multi-byte text. It is an enhanced version of the "Automatic Keyword Generator" class originally written by Ver Pangonilo. This version provides better word segmentation, the ability to handle multi-byte strings, and support for text in multiple languages.

Project Spotlight

Tor-ramdisk

A micro Linux distribution for securely hosting a Tor server.

Project Spotlight

UMR

An Unreal .umx and .uax class object reader and extractor.