Algraeph is a tool for manual alignment of linguistic graphs, such as phrase structure trees or dependency structures, where each node corresponds to a subsequence of the analyzed input sentence. It allows you to express the similarity between two graphs by aligning their nodes and attaching relation labels to these alignments. Graphs are read from one or more graphbanks (or treebanks) in the GraphML or Alpino formats. Alignment relations are user-defined and are stored in a simple XML format, which can be used for further processing. The resulting parallel graph corpus is a useful data set for many tasks in computational linguistics and natural language processing.
xml-test checks whether one XML document is contained in another. It is handy when testing an application's output against a reference document whose element order may differ (GData and Atom are examples of specifications where element order is unimportant). Its notion of containment is relaxed: element order is ignored, whitespace is trimmed, comments are ignored, specific elements can be ignored by passing XPath-like paths on the command line, and text nodes (element and attribute content) can be ignored by passing '-notext' on the command line.
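The relaxed containment check described above can be sketched in a few lines. This is only an illustration under stated assumptions, not xml-test's implementation: the function names `contains` and `text_of` and the `check_text` flag (standing in for '-notext') are hypothetical, ignoring elements by path is left out, and children are matched greedily, which can miss a valid match in contrived cases.

```python
import xml.etree.ElementTree as ET


def text_of(value):
    """Normalise text: None and surrounding whitespace both become ''."""
    return (value or "").strip()


def contains(haystack, needle, check_text=True):
    """Return True if element `needle` is contained in element `haystack`.

    Containment here means: same tag, every attribute of `needle` present on
    `haystack`, and every child of `needle` matched by a distinct child of
    `haystack`, in any order. ElementTree's default parser already drops
    comments, so they never take part in the comparison.
    """
    if haystack.tag != needle.tag:
        return False
    for name, value in needle.attrib.items():
        if name not in haystack.attrib:
            return False
        if check_text and text_of(haystack.attrib[name]) != text_of(value):
            return False
    if check_text and text_of(haystack.text) != text_of(needle.text):
        return False
    unused = list(haystack)  # children of haystack not yet claimed by a match
    for child in needle:
        for candidate in unused:
            if contains(candidate, child, check_text):
                unused.remove(candidate)
                break
        else:
            return False  # no remaining child of haystack matches this one
    return True
```

Calling `contains(ET.fromstring(actual), ET.fromstring(expected))` then succeeds when `expected` is a reordered subset of `actual`.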
libhyphenate is a C++ library that implements Frank Liang's hyphenation algorithm, better known as the TeX hyphenation algorithm. It is similar to the libhnj implementation but, in contrast to libhnj, works reliably and is well documented. It has been tested with English and German, and it works fully with UTF-8 text.
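For readers unfamiliar with Liang's algorithm, a minimal sketch follows. It is written in Python for brevity and does not reflect libhyphenate's C++ API. The function names, the `left_min` and `right_min` defaults, and the tiny `TOY_PATTERNS` list are assumptions made for illustration; a real pattern file contains thousands of entries. Each pattern carries digits between its letters, the highest digit wins at every position, and odd values mark permitted break points.

```python
import re

# Toy pattern set, for illustration only (real pattern files are far larger).
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
                "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and its digit list."""
    letters = re.sub(r"\d", "", pattern)
    points = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            points[pos] = int(ch)  # digit sits in the gap before letter `pos`
        else:
            pos += 1
    return letters, points


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return the pieces of `word` between the break points found."""
    table = dict(parse_pattern(p) for p in patterns)
    work = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(work) + 1)   # scores[i] is the gap before work[i]
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            points = table.get(work[start:end])
            if points:
                for offset, value in enumerate(points):
                    index = start + offset
                    scores[index] = max(scores[index], value)
    pieces, last = [], 0
    # The gap before word[k] is scores[k + 1] because of the leading dot.
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces
```

With the toy patterns above, `hyphenate("hyphenation", TOY_PATTERNS)` returns `['hy', 'phen', 'ation']`.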
XHTML Hyphenator is a filter for XHTML documents that inserts soft hyphens at the proper hyphenation points. It hyphenates all words in text nodes that are children of nodes in the XHTML namespace and not part of the header (in short, the text part of an XHTML document and nothing else). The hyphenation patterns are chosen according to the xml:lang attribute governing the text. Soft hyphens are displayed correctly by most modern browsers and ignored entirely by Mozilla Firefox, so inserting them will not break anything, and they improve the appearance of the text considerably.
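The tree walk described above might look roughly like the following sketch. It is not the tool's actual code: `hyphenate_tree`, `soften`, and the caller-supplied `split_word(word, lang)` callback are hypothetical names, and the callback stands in for whatever pattern-based hyphenator is used for each language.

```python
import re
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
SOFT_HYPHEN = "\u00ad"


def soften(text, lang, split_word):
    """Insert soft hyphens into every word of `text` (None stays None)."""
    if not text:
        return text
    return re.sub(
        r"\w+",
        lambda m: SOFT_HYPHEN.join(split_word(m.group(0), lang)),
        text,
    )


def hyphenate_tree(elem, split_word, lang=None):
    """Hyphenate text under `elem`, skipping the XHTML header entirely."""
    lang = elem.get(XML_LANG, lang)  # xml:lang is inherited by descendants
    if elem.tag == XHTML + "head":
        return
    in_xhtml = elem.tag.startswith(XHTML)
    if in_xhtml:
        elem.text = soften(elem.text, lang, split_word)
    for child in elem:
        hyphenate_tree(child, split_word, lang)
        # A child's tail is text of the parent, so the parent's language
        # and namespace govern it.
        if in_xhtml:
            child.tail = soften(child.tail, lang, split_word)
```

Passing the language to the callback keeps the walk independent of how patterns are stored or loaded.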
urlwatch is a script that helps you watch URLs and get notified (via email) of any changes. The change notification includes the URL that has changed and a unified diff of what has changed. The script works out of a single directory, so there is no need to install anything; state files are kept in the same folder. Parts of a page that change on every request can be stripped through a filter hook function. It is typically run as a cron job.
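The core loop of such a watcher can be sketched as follows. This is an illustration, not urlwatch's own code: the names `fetch` and `check`, the `filter_hook(url, content)` signature, and the SHA-1 based state file name are all assumptions, and sending the email is left out.

```python
import difflib
import hashlib
import os
import urllib.request


def fetch(url):
    """Download a page and decode it as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def check(url, content, state_dir, filter_hook=None):
    """Compare `content` with the saved state for `url`.

    Returns a unified diff as a string, or '' when nothing changed or when
    the URL is seen for the first time. The state file is always updated.
    """
    if filter_hook is not None:
        content = filter_hook(url, content)  # strip always-changing parts
    name = hashlib.sha1(url.encode("utf-8")).hexdigest()
    state_file = os.path.join(state_dir, name)
    old = None
    if os.path.exists(state_file):
        with open(state_file, encoding="utf-8") as handle:
            old = handle.read()
    with open(state_file, "w", encoding="utf-8") as handle:
        handle.write(content)
    if old is None or old == content:
        return ""
    diff = difflib.unified_diff(
        old.splitlines(),
        content.splitlines(),
        fromfile=url + " (old)",
        tofile=url + " (new)",
        lineterm="",
    )
    return "\n".join(diff)
```

A cron job would call `check(url, fetch(url), state_dir, hook)` for each watched URL and mail any non-empty result.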
xtee (cross-tee/expanded tee) is a program for building complex pipelines. It resembles the tee command, except that instead of copying stdin to stdout, it copies an input file to stdout and stdin to an output file. You can use xtee for building things like a bidirectional HTTP filter (using netcat and sed).
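The crossed data flow can be illustrated with a short sketch. It is not xtee's implementation or command-line interface: the function `xtee` and its parameters are hypothetical, and the chunked copying used here waits for full buffers, so interactive traffic would need smaller or unbuffered reads.

```python
import shutil
import sys
import threading


def copy_stream(source, sink):
    """Copy bytes from `source` to `sink` until EOF, then flush."""
    shutil.copyfileobj(source, sink)
    sink.flush()


def xtee(input_path, output_path, stdin=None, stdout=None):
    """Copy input_path to stdout and stdin to output_path, concurrently."""
    if stdin is None:
        stdin = sys.stdin.buffer
    if stdout is None:
        stdout = sys.stdout.buffer

    def file_to_stdout():
        with open(input_path, "rb") as source:
            copy_stream(source, stdout)

    def stdin_to_file():
        with open(output_path, "wb") as sink:
            copy_stream(stdin, sink)

    # Two threads, because both directions must make progress independently.
    workers = [
        threading.Thread(target=file_to_stdout),
        threading.Thread(target=stdin_to_file),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

When both paths are named pipes (FIFOs), two such processes can be joined into a loop, which is what makes a bidirectional filter possible.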