958 projects tagged "Text Processing"
TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports all ISO page formats and custom page formats, custom margins and units of measure, UTF-8 Unicode, RTL languages, HTML, barcodes, TrueTypeUnicode, TrueType, OpenType, Type1, and CID-0 fonts, images, graphic functions, clipping, bookmarks, JavaScript, forms, page compression, digital signatures, and encryption.
Highlight is a universal converter from source code to HTML, XHTML, RTF, TeX, LaTeX, SVG, BBCode, and terminal escape sequences. (X)HTML and SVG output is styled with Cascading Style Sheets. It supports more than 170 programming languages and includes 80 highlighting color themes. The configuration files are Lua scripts with plug-in support. The converter also includes features for giving the output code a consistent layout.
John the Ripper is a fast password cracker, currently available for many flavors of Unix, Windows, DOS, BeOS, and OpenVMS. Its primary purpose is to detect weak Unix passwords. It supports several crypt(3) password hash types commonly found on Unix systems, as well as Windows LM hashes. The community-enhanced version (-jumbo) adds many more hashes and ciphers, and John the Ripper Pro adds some others.
Isearch is software for indexing and searching text documents. It supports full-text and field-based search, relevance-ranked results, Boolean queries, and heterogeneous databases. It can parse many kinds of documents "out of the box," including HTML, mail folders, list digests, SGML-style tagged data, and USMARC. It can be extended to support other formats by creating descendant classes in C++ that define the document structure. Customizing it this way is fairly easy if you know some C++, though you will need to download the source code via FTP. A CGI interface is also included for Web-based searching.