Shasplit takes a large data block, splits it into smaller parts, and puts those parts into an SHA-based content-addressed store. Reassembling the parts is a trivial "cat" invocation. Repeated parts (e.g., from previous split operations) are stored only once, which allows efficient incremental backups of whole LVM snapshots via rsync. Shasplit shows its strengths on encrypted block devices, but might be useful for non-encrypted data, too.
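The split-and-store idea can be illustrated with a minimal Python sketch. This is not Shasplit's own code or on-disk layout: the function names, the fixed part size, the flat directory layout, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os


def split_into_store(data, store_dir, part_size=4):
    """Split data into fixed-size parts and store each part under its digest.

    Hypothetical helper: the part size, SHA-256, and flat layout are assumptions.
    Returns the ordered list of digests needed to reassemble the data.
    """
    os.makedirs(store_dir, exist_ok=True)
    digests = []
    for offset in range(0, len(data), part_size):
        part = data[offset:offset + part_size]
        digest = hashlib.sha256(part).hexdigest()
        path = os.path.join(store_dir, digest)
        if not os.path.exists(path):  # a repeated part is written only once
            with open(path, "wb") as handle:
                handle.write(part)
        digests.append(digest)
    return digests


def reassemble(digests, store_dir):
    """Concatenate the stored parts in order, like `cat` over the part files."""
    chunks = []
    for digest in digests:
        with open(os.path.join(store_dir, digest), "rb") as handle:
            chunks.append(handle.read())
    return b"".join(chunks)
```

Because each part is named after its content hash, a second split of mostly unchanged data adds only the parts that differ, which is what keeps a later file-level sync small.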
springclean is a command-line tool for cleaning up log files. It selects files by name (exact match or regex), by age, or by a combination of both. You can preview changes and confirm each action before it runs. Each action can compress files, move them to another directory, or remove them; it can also report how much disk space was freed and write an audit trail to syslog.
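Selection by name and age can be sketched in a few lines of Python. This is an illustration only, not springclean's implementation or interface: the function name `select_files` and its parameters are hypothetical, and age is assumed to mean time since last modification.

```python
import os
import re
import time


def select_files(directory, name_pattern=None, min_age_days=None, now=None):
    """Return paths of regular files matching a name regex and a minimum age.

    Hypothetical helper: either filter may be omitted; when both are given,
    a file must satisfy both. Age is measured from the modification time.
    """
    now = time.time() if now is None else now
    regex = re.compile(name_pattern) if name_pattern else None
    selected = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue
        if regex and not regex.search(entry):
            continue
        if min_age_days is not None:
            age_days = (now - os.path.getmtime(path)) / 86400
            if age_days < min_age_days:
                continue
        selected.append(path)
    return selected
```

Returning the list without acting on it corresponds to a preview step: the caller can show the candidates and ask for confirmation before compressing, moving, or removing anything.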
Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is well suited to working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Users can write their own translation rules, which are stored in a plain text file and applied at runtime.
Xidel is a command line tool to download Web pages and extract data from them. It can download files over HTTP/S connections, follow redirections, links, or extracted values, and process local files. The data can be extracted using XPath 2.0, XQuery 1.0, and JSONiq expressions, CSS 3 selectors, and custom, pattern-matching templates that are like an annotated version of the processed page. The extracted values can then be exported as plain text/XML/HTML/JSON, or assigned to variables to be used in other extract expressions or be exported to the shell. There is also an online CGI service for testing.
Quasi extracts code fragments from text documentation and appends them to new source code files. Unlike other literate programming tools, it does not perform any sort of macro expansion. The strength of Quasi is its simplicity: it forces the programmer to think in terms of properly designed methods.
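The extract-and-append approach, with no macro expansion, can be sketched as follows. The way Quasi marks code fragments is not described here, so this example assumes a hypothetical convention in which any line indented by four spaces is code; the function names are likewise invented for illustration.

```python
def extract_fragments(text, indent="    "):
    """Collect runs of indented lines as code fragments, in document order.

    Assumed convention (not Quasi's documented syntax): a line starting with
    `indent` is code, and any other line ends the current fragment.
    """
    fragments, current = [], []
    for line in text.splitlines():
        if line.startswith(indent):
            current.append(line[len(indent):])
        elif current:
            fragments.append("\n".join(current))
            current = []
    if current:
        fragments.append("\n".join(current))
    return fragments


def append_to_source(fragments, path):
    """Append each fragment verbatim to the source file at `path`."""
    with open(path, "a") as handle:
        for fragment in fragments:
            handle.write(fragment + "\n")
```

Since fragments are copied verbatim in the order they appear, the documentation has to present code in an order the compiler or interpreter accepts, which is the discipline the description refers to.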