46 projects tagged "Text Processing"

Download Website Updated 07 Apr 2014 Docx to Text Converter (docx2txt)

Pop 199.64
Vit 36.61

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, preserving some formatting and document information (which MS text conversion drops) along with appropriate character conversions for a good (ASCII) text experience. It is a platform-independent solution consisting of (core) Perl and (wrapper) Unix/Windows shell scripts and a configuration file that controls the appearance of the output text to a fair extent. It can conveniently be used to build a Web-based docx document conversion service. Some Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers like CakeCmd that can deal with corrupt Zip archives, this tool can extract text from corrupt docx documents in many cases where the MS word processor fails to even open them.
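The basic idea behind this kind of conversion can be sketched in a few lines of Python. This is not docx2txt's Perl implementation and it ignores formatting, character conversion, and configuration; it only assumes that a .docx file is a Zip archive whose main text lives in `word/document.xml`. The function names `docx_to_text` and `make_sample_docx` are hypothetical, and the sample document is built in memory purely for illustration.

```python
import io
import zipfile
from xml.etree import ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(data: bytes) -> str:
    """Return the text of a .docx file (given as bytes), one line per paragraph."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> run elements.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        lines.append("".join(runs))
    return "\n".join(lines)


def make_sample_docx() -> bytes:
    """Build a minimal, hypothetical .docx-like archive in memory for the demo."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r>"
        '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()


if __name__ == "__main__":
    print(docx_to_text(make_sample_docx()))
```

A real converter also has to handle tables, lists, hyperlinks, and damaged archives, which this sketch does not attempt.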

Download Website Updated 30 May 2013 John the Ripper

Pop 1,497.35
Vit 26.96

John the Ripper is a fast password cracker, currently available for many flavors of Unix, Windows, DOS, BeOS, and OpenVMS. Its primary purpose is to detect weak Unix passwords. It supports several crypt(3) password hash types commonly found on Unix systems, as well as Windows LM hashes. On top of this, lots of other hashes and ciphers are added in the community-enhanced version (-jumbo), and some are added in John the Ripper Pro.

Download Website Updated 04 Feb 2013 lesspipe.sh

Pop 206.16
Vit 17.50

lesspipe.sh is an input filter for the pager less as described in less's man page. The script runs under a ksh-compatible shell (e.g. bash, zsh) and allows you to use less to view files with binary content, compressed files, archives, and files contained in archives. Viewing files by accessing a device file is implemented to some extent. It supports many formats (both as plain and compressed files using gzip, bzip2, and other pack programs). Syntax highlighting of source code is possible through an included script, "code2color", or an external program (pygmentize).

Download Website Updated 05 Oct 2011 Vee

Pop 26.74
Vit 1.74

Vee is a command-line blog tool that is very portable across Unix systems. It provides both an interactive and a batch interface for maintaining a log of entries. Formatting is done using a modular architecture that allows a high degree of customization. There are minimal flags, and no setup is required.

Download Website Updated 15 Nov 2010 PDFjam

Pop 197.02
Vit 6.82

PDFjam is a small collection of shell scripts that provide a simple interface to some of the functionality of the pdfpages package for pdfLaTeX. Facilities include n-up imposition, page rotation, reflection and trimming, selection of pages, and many more. PDFjam depends on a working installation of (pdf)LaTeX, complete with the pdfpages package. For Mac OS X, some example applications (droplets) are provided for drag-and-drop access.

No download Website Updated 21 Sep 2010 Figtex2eps

Pop 21.26
Vit 62.91

Figtex2eps is a bash script that automates the creation of PostScript (or PDF) images made with Xfig that contain embedded LaTeX symbols and the like.

No download No website Updated 22 Jul 2010 uWiki

Pop 29.53
Vit 1.00

uWiki is a minimalistic wiki engine. All actions are implemented in external scripts. These scripts are wikified, and thus the wiki is extensible by itself. All dynamic access is protected through ACLs. Wiki content and Web content can be mixed in the same directory hierarchy. Markup engines and revision control are pluggable. Currently, asciidoc as the markup engine and git as the revision control backend are provided. Subdirectories can form independent sub-wikis with their own revision control. Features like distributed pages that synchronize between wikis, spam protection, and batch jobs to schedule mirroring of other content (bittorrent, git, rsync, and wget) are planned.

Download Website Updated 09 Nov 2009 AutoGen

Pop 278.54
Vit 13.69

AutoGen is a tool designed for generating program files that contain repetitive text with varied substitutions. Its goal is to simplify the maintenance of programs that contain large amounts of repetitious text. This is especially valuable if there are several blocks of such text that must be kept synchronized. Output is specified with a Scheme-enhanced output template. Input, if required by your template, may come from AutoGen definitions, CGI data, or XML files.

Download Website Updated 17 Oct 2008 CRUSH

Pop 37.87
Vit 1.00

CRUSH (Custom Reporting Utilities for SHell) is a collection of tools for processing delimited-text data from the command line or in shell scripts. It provides utilities for aggregating, merging, filtering, and formatting your data.
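To illustrate what aggregating delimited-text data means in practice, here is a minimal Python sketch of a group-and-sum over tab-separated input. It does not use or reproduce any CRUSH utility or its command-line interface; the function name `aggregate_sum` and the sample columns `region` and `sales` are hypothetical.

```python
import csv
import io
from collections import defaultdict


def aggregate_sum(text, key_field, value_field, delimiter="\t"):
    """Sum value_field per distinct key_field in delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    totals = defaultdict(float)
    for row in reader:
        totals[row[key_field]] += float(row[value_field])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical sample data: one header line, then one record per line.
    sample = "region\tsales\neast\t10\nwest\t5\neast\t2.5\n"
    for key, total in sorted(aggregate_sum(sample, "region", "sales").items()):
        print(f"{key}\t{total}")
```

The same pattern extends to merging and filtering: read rows by field name, transform them, and write them back out with the same delimiter so the result can be piped into the next tool.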

Download Website Updated 21 Aug 2008 smupcheck

Pop 12.61
Vit 1.00

smupcheck, which stands for Smart Update Checker, checks Web sites for updates automatically, even if they don't offer an RSS feed. It is a very basic tool, and does not offer advanced features such as checking password-protected Web sites, highlighting changes, or filtering results.
