Editorial: Preserving the Information Ecosystem
There is a whitepaper, The Christmas Document (http://www.cs.sunysb.edu/~maxim/Xmas/). It is also "lengthly" and "seems to be prett...
A fast closed caption (subtitles) extractor.
A Perl module for lexical analysis/parsing of text files.