Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is well suited to working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Any user can develop their own translation rules, which are stored in a plain text file and applied at runtime.
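To illustrate the idea of translation rules kept in a text file and applied at runtime, here is a minimal Python sketch. It is not Sanzang's implementation: the `source|target` line format, the function names `parse_rules` and `translate`, and the longest-match-first strategy are all assumptions made for the example.

```python
def parse_rules(text, sep="|"):
    """Parse 'source|target' lines (a hypothetical format) into rule pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank or malformed lines
        source, target = line.split(sep, 1)
        if source:
            rules.append((source, target))
    # Longer source terms go first so they win over their own substrings.
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def translate(text, rules):
    """Scan left to right, replacing the longest matching source term."""
    out = []
    i = 0
    while i < len(text):
        for source, target in rules:
            if text.startswith(source, i):
                out.append(target)
                i += len(source)
                break
        else:
            # No rule matched here, so copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Illustrative rule table; in practice this text would be read from a file.
RULES = parse_rules("我|I\n如是我聞|thus have I heard")
```

With these rules, `translate("如是我聞", RULES)` uses the four-character rule rather than the single-character one, because longer terms are tried first.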
Oxygen XML Developer is an Oxygen distribution specially tuned for XML development, providing XML editing, XML conversion, XML Schema development, XSLT/XQuery/XPath execution and debugging, SOAP and WSDL testing, native XML and relational database support, and XML instance generation.
Speedpad is a small and portable ncurses-powered tool to test, train, and increase typing speed on arbitrary text input. It is designed for intermediate-to-advanced level typists and assumes that you have already learned how to touch type. It does not use lessons, single words, or other synthetic stuff. It supports tab expansion, auto indentation, and syntax to train on code. It features a reference speed robot and supports CPS, CPM, WPM, PPM, and CPH/KPH metrics. It shows detailed statistics about speed and helps find and eliminate frequent typos. Stats are dumped to standard output in a machine-readable format after completion, and can be piped into gnuplot.
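The speed metrics named above are related by simple arithmetic. The Python sketch below shows one way to derive them from a character count and an elapsed time. It is not taken from Speedpad: the function name `typing_metrics` is hypothetical, and the conversion of five characters per word is the common typing-test convention, assumed here rather than confirmed for Speedpad. PPM is left out because its definition is not given.

```python
def typing_metrics(chars_typed, seconds, chars_per_word=5):
    """Return CPS, CPM, WPM, and CPH for a typing session.

    chars_per_word=5 is the usual typing-test convention (an assumption).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cps = chars_typed / seconds
    return {
        "cps": cps,                        # characters per second
        "cpm": cps * 60,                   # characters per minute
        "wpm": cps * 60 / chars_per_word,  # words per minute
        "cph": cps * 3600,                 # characters per hour
    }
```

For example, 300 characters typed in 60 seconds works out to 5 CPS, 300 CPM, and 60 WPM under the five-characters-per-word convention.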
csvgroupby is a small utility program that allows you to obtain aggregated statistical information from comma-separated files containing tabular data. It is similar to the SQL GROUP BY clause. It currently supports the COUNT, MAX, MIN, SUM, and AVG operators. It performs as many processing jobs as possible in a single run through the data file, which means that large data sets can be efficiently processed.
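The single-pass approach can be sketched in Python as follows. This is an illustration of the technique, not csvgroupby's code or command-line interface: the function name `group_by`, the column names, and the sample data are made up for the example. Each row updates running COUNT, SUM, MIN, and MAX values for its group, and AVG is derived once at the end, so the file is read only once and only one small record per group is held in memory.

```python
import csv
import io


def group_by(rows, key_col, value_col):
    """Aggregate value_col per distinct key_col in one pass over rows."""
    stats = {}
    for row in rows:
        key = row[key_col]
        value = float(row[value_col])
        s = stats.get(key)
        if s is None:
            # First row of this group initializes every running aggregate.
            stats[key] = {"count": 1, "sum": value, "min": value, "max": value}
        else:
            s["count"] += 1
            s["sum"] += value
            s["min"] = min(s["min"], value)
            s["max"] = max(s["max"], value)
    # AVG needs no second pass over the data: it follows from SUM and COUNT.
    for s in stats.values():
        s["avg"] = s["sum"] / s["count"]
    return stats


# Made-up sample data standing in for a CSV file on disk.
SAMPLE = "city,temp\nOslo,4\nRome,18\nOslo,6\n"
result = group_by(csv.DictReader(io.StringIO(SAMPLE)), "city", "temp")
```

Because `csv.DictReader` yields rows lazily, the same function works on a large file opened with `open(path, newline="")` without loading it into memory.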
Winnow efficiently trains and operates any number of unique Bayesian (Naive Bayes) classifiers on large sets of content. It has very high performance and works with very small and unbalanced training sets. It has been used to power an innovative Web feed reader that uses smart tags, which learn and find the content you want to see, from more sources than you can follow with traditional feed readers. It works particularly well with Ruby and Ruby on Rails.
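For readers unfamiliar with Naive Bayes classification, the following Python sketch shows the general technique on word lists. It is a textbook multinomial Naive Bayes with add-one (Laplace) smoothing, not Winnow's algorithm or API: the class name, method names, and training data are hypothetical.

```python
import math
from collections import Counter


class NaiveBayes:
    """Textbook multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # every word seen during training

    def train(self, label, words):
        self.doc_counts[label] += 1
        self.word_counts.setdefault(label, Counter()).update(words)
        self.vocab.update(words)

    def classify(self, words):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: one tiny document per tag.
nb = NaiveBayes()
nb.train("tech", ["ruby", "rails", "code"])
nb.train("food", ["recipe", "pasta", "sauce"])
```

Working in log space avoids underflow when many word probabilities are multiplied, and the add-one smoothing keeps unseen words from zeroing out a label's score.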
nyu is a combination of modern academic approaches to parsing formal grammars from PEGs (parsing expression grammars) and expression grammars that represents the new state of the art in parser generators. nyu grammars are written in a powerful language based on PEGs but with modifications to allow both the AST and the parser to be specified intuitively in a single grammar. nyu outputs parsers that take advantage of the chilon::parser meta-programming library for C++. The generated parsers are almost as concise and readable as the input grammars, yet perform as well as hand-written C code. nyu ASTs are built using tuples, variant types, and lists, and allow self-referential parsers and AST nodes to be manipulated. Advanced features such as hashed containers and grammar inheritance are also possible and well tested. nyu is currently powerful enough to deal with complex grammars and bootstraps its own parser.