36 projects tagged "Text Processing"

NCBI C++ Toolkit (updated 28 Jul 2012)

Pop 48.58
Vit 5.06

The NCBI C++ Toolkit provides portable libraries and applications that support work in genetic science. These include libraries for networking, SQL and BerkeleyDB access, CGI and HTML handling, ASN.1 and XML handling, sequence alignment engines, sequence retrieval engines, BLAST database engines, FLTK and OpenGL graphics toolkits, and basic system utilities.

Figtex2eps (updated 21 Sep 2010, no download)

Pop 20.78
Vit 62.94

Figtex2eps is a bash script that automates the creation of PostScript (or PDF) images with embedded LaTeX symbols and the like from figures made with Xfig.

Mac's CMS (updated 09 Jun 2010, no website)

Pop 84.88
Vit 6.70

Mac's CMS is a flat-file (XML and SQLite) AJAX content management system. It focuses mainly on the "Edit In Place" editing concept. It comes with a built-in blog with moderation support, a user manager section, a roles manager section, and SEO/SEF URLs.

similarity-utils (updated 07 Nov 2008)

Pop 31.03
Vit 2.42

similarity-utils is a set of two programs that give a quantitative measure of how similar two files are, on a scale from 0 to 1. similarity_by_diff counts the difference lines reported by diff(1), while similarity_by_zlib compresses the two files, both separately and together, and compares the resulting sizes.
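
The compression-based idea can be illustrated with a short Python sketch. This is not the code of similarity_by_zlib; the function names and the scoring formula (a normalized compression distance, flipped so that 1 means "most similar") are assumptions chosen for illustration, and the real tool may combine the three compressed sizes differently.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed form of data."""
    return len(zlib.compress(data, 9))


def similarity_by_compression(a: bytes, b: bytes) -> float:
    """Return a score in [0, 1]; higher means the inputs share more content.

    Hypothetical scoring: if b repeats material from a, compressing a + b
    costs little more than compressing the larger input alone.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    # Normalized compression distance: near 0 for near-identical inputs,
    # near 1 for unrelated inputs.
    distance = (size_ab - min(size_a, size_b)) / max(size_a, size_b)
    # Clamp, because compressor overhead can push the raw value
    # slightly outside the [0, 1] range.
    return max(0.0, min(1.0, 1.0 - distance))
```

To compare two files, read them in binary mode and pass the contents to similarity_by_compression. Note that zlib only finds repeats within a 32 KB window, so this sketch becomes less reliable on large inputs.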

yawl (updated 13 May 2008)

Pop 84.99
Vit 3.52

This is a comprehensive "word game" word list for UNIX/Linux. It is a superset of the author's ENABLE list, the "OSW", and various lists researched by the author's colleague, Alan Beale. At 264,093 words, it is the largest list of its kind, suitable for use in all manner of crossword-type board games and word construction games, as well as for a spell checker dictionary. The YAWL package now includes two anagramming utilities (supplied as source code and built by the included Makefile). There is also a shell script that extends the UNIX "strings" system command. This is the word list package recommended for the author's Quackey word game.

sdml2txt (updated 05 Mar 2008)

Pop 24.25
Vit 1.00

sdml2txt is a script that converts SDML to UML sequence diagrams rendered as ASCII text. SDML is an extremely simple UML sequence diagram markup language.

Gnosis Utils (Python) (updated 03 Aug 2007)

Pop 133.99
Vit 4.85

Gnosis Utils contains several Python modules for XML processing, plus other generally useful tools: xml.pickle (serializes objects to/from XML, API compatible with the standard pickle module), xml.objectify (turns arbitrary XML documents into Python objects), xml.validity (enforces XML validity constraints via DTD or Schema), xml.indexer (full text indexing/searching), and many more.

rss4admins (updated 16 May 2007)

Pop 26.27
Vit 1.49

rss4admins generates an RSS feed from your (log) files. It is easy to configure, and supports multiple channels (multiple logs).

Construct (updated 12 Jan 2007)

Pop 50.73
Vit 2.58

Construct is a Python library for declaratively defined data structures, called "constructs". These constructs can both parse data into an object and build an object back into data. Constructs handle fields of either byte or bit granularity, structs, unions, sequences, repeaters, adapters, validators, switching, pointers, on-demand (lazy) parsing, and much more. The library defines a large number of primitive constructs, as well as a large inventory of file formats and network protocols.
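
The "one declaration, two directions" idea can be illustrated without the library itself. The sketch below deliberately does not use Construct's API; Field, Record, and the header layout are hypothetical names built on Python's standard struct module, and they cover only fixed-size fields of byte granularity.

```python
import struct
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    fmt: str  # a single struct format code, e.g. "B" (uint8) or "H" (uint16)


class Record:
    """Hypothetical declarative record: declared once, used to parse and build."""

    def __init__(self, *fields: Field, byte_order: str = ">"):
        self.fields = fields
        # Combine the per-field codes into one struct layout (big-endian by default).
        self._struct = struct.Struct(byte_order + "".join(f.fmt for f in fields))

    def parse(self, data: bytes) -> dict:
        """Turn raw bytes into a dict keyed by field name."""
        values = self._struct.unpack(data)
        return {f.name: v for f, v in zip(self.fields, values)}

    def build(self, obj: dict) -> bytes:
        """Turn a dict back into raw bytes, using the same declaration."""
        return self._struct.pack(*(obj[f.name] for f in self.fields))


# An illustrative 4-byte header layout (not taken from any real format).
header = Record(Field("version", "B"), Field("flags", "B"), Field("length", "H"))
```

Calling header.parse on four bytes yields a dict with the three named fields, and header.build on that dict returns the original bytes. The real library goes much further, with bit-level fields, nesting, and conditional layouts as described above.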

atropine (updated 29 Nov 2005)

Pop 28.72
Vit 1.42

atropine is a screen-scraping library built on top of BeautifulSoup. It helps programmers make assertions about document structure while getting at the data they are interested in.

Project Spotlight

GMLP

A markup language processor.

Project Spotlight

Mutt Folder List

A mutt patch that adds a sidebar showing all mail folders.