oed2dict is a set of scripts that convert the full Oxford English Dictionary v2 (CD-ROM version 3.0) into the jargon and DICT file formats. The results can then be consulted with any DICT-compatible dictionary lookup program. The resulting DICT files are much faster to use than the proprietary program from the original distribution and are significantly smaller, with the whole database weighing only about 190 MiB.
Shasplit takes a large data block, splits it into smaller parts, and puts those parts into an SHA-based content-addressed store. Reassembling the parts is a trivial "cat" invocation. Repeated parts (e.g., from previous split operations) are stored only once, which allows efficient incremental backups of whole LVM snapshots via rsync. Shasplit shows its strengths on encrypted block devices, but might be useful for non-encrypted data, too.
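The split-and-store idea can be illustrated with a minimal Python sketch. This is not Shasplit's actual code or on-disk layout: the function names `split_into_store` and `reassemble`, the fixed part size, the choice of SHA-256, and the flat one-file-per-digest directory are all assumptions made for illustration.

```python
import hashlib
import os


def split_into_store(data: bytes, store_dir: str, part_size: int = 4) -> list:
    """Split data into fixed-size parts and store each part in a file
    named after its SHA-256 hex digest (hypothetical layout).
    Returns the ordered list of digests needed for reassembly."""
    os.makedirs(store_dir, exist_ok=True)
    digests = []
    for offset in range(0, len(data), part_size):
        part = data[offset:offset + part_size]
        digest = hashlib.sha256(part).hexdigest()
        path = os.path.join(store_dir, digest)
        # A part whose digest is already in the store is not written again.
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(part)
        digests.append(digest)
    return digests


def reassemble(digests: list, store_dir: str) -> bytes:
    """Concatenate the stored parts in order, like "cat" over the part files."""
    chunks = []
    for digest in digests:
        with open(os.path.join(store_dir, digest), "rb") as f:
            chunks.append(f.read())
    return b"".join(chunks)
```

Because each file name depends only on the part's content, a second split of mostly unchanged data adds only the parts that differ, so a file-level sync tool has little new to transfer.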
getxbook is a collection of tools to download books from websites. There are tools to download from Google Books' "book preview", Amazon's "look inside the book", and Barnes and Noble's "book viewer". There is an optional GUI written in Tcl/Tk, along with shell scripts that use OCR to create plain text or searchable PDF and DjVu files from the downloaded books.
s6-portable-utils is a set of tiny general-purpose Unix utilities, often performing well-known tasks such as cut and grep, but optimized for simplicity and small size. They were designed for embedded systems and other constrained environments, but work everywhere. Other sets of small utilities are usually system-specific; for instance, the (otherwise excellent) BusyBox project works only on Linux.