Sanzang is a compact, simple, cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is well suited to ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Users can develop their own translation rules, which are stored in a plain text file and applied at runtime.
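To illustrate how rules kept in a text file can be applied at runtime, here is a minimal Python sketch. It is not Sanzang's own code: the `source|target` line format, the function names `load_rules` and `translate`, and the longest-match strategy are all assumptions made for this example.

```python
def load_rules(lines, sep="|"):
    """Read 'source|target' rules, one per line (format is an assumption)."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank or malformed lines
        source, target = line.split(sep, 1)
        rules[source] = target
    return rules


def translate(text, rules):
    """Replace the longest matching source term at each position."""
    max_len = max((len(source) for source in rules), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so that multi-character
        # terms win over their single-character prefixes.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            out.append(text[i])  # no rule applies: pass the character through
            i += 1
    return " ".join(out)


rules = load_rules(["三藏|Tripitaka", "法師|Dharma Master", "三|three"])
print(translate("三藏法師", rules))
```

Because the rules live in an ordinary mapping loaded from text, changing the translation only requires editing the rule file, not the program.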
csvgroupby is a small utility program that allows you to obtain aggregated statistical information from comma-separated files containing tabular data. It is similar to the SQL GROUP BY clause. It currently supports the COUNT, MAX, MIN, SUM, and AVG operators. It performs as many processing jobs as possible in a single run through the data file, which means that large data sets can be efficiently processed.
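The single-pass aggregation described above can be sketched in Python as follows. This is an illustration of the idea only, not csvgroupby's implementation; the function name `group_by` and the sample column names are hypothetical.

```python
import csv
import io


def group_by(rows, key_col, value_col):
    """Compute COUNT, SUM, MIN, MAX, and AVG per key in one pass over rows."""
    stats = {}
    for row in rows:
        key = row[key_col]
        value = float(row[value_col])
        entry = stats.get(key)
        if entry is None:
            stats[key] = {"count": 1, "sum": value, "min": value, "max": value}
        else:
            entry["count"] += 1
            entry["sum"] += value
            entry["min"] = min(entry["min"], value)
            entry["max"] = max(entry["max"], value)
    # AVG is derived from the running SUM and COUNT once the pass is done.
    for entry in stats.values():
        entry["avg"] = entry["sum"] / entry["count"]
    return stats


data = "city,amount\nOslo,10\nRome,4\nOslo,30\n"
result = group_by(csv.DictReader(io.StringIO(data)), "city", "amount")
print(result["Oslo"])
```

Only one small record per group is held in memory, so the input size is limited by the number of distinct keys rather than by the number of rows.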
libcsv_parser++ is a C++ library for parsing text files to extract records and fields. The records can be delimited with any set of characters. It makes the following assumptions: the record terminator is a single character; the field terminator is a single character; and fields, if enclosed, are enclosed by a single character. The parser can handle documents where fields are always enclosed, never enclosed, or optionally enclosed. When all fields are strictly enclosed, the parser assumes that any enclosure character inside a field is escaped with a preceding backslash. The software could be ported to Windows with very little effort.
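The parsing assumptions listed above (single-character terminators, an optional single-character enclosure, and backslash-escaped enclosure characters) can be sketched in Python as follows. This is not the library's C++ API; `parse_records` and its parameters are hypothetical names chosen for the example.

```python
def parse_records(text, field_sep=",", record_sep="\n", enclosure='"'):
    """Split text into records and fields using single-character delimiters."""
    records, fields, buf = [], [], []
    in_enclosure = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_enclosure:
            # A backslash directly before the enclosure character escapes it.
            if ch == "\\" and i + 1 < len(text) and text[i + 1] == enclosure:
                buf.append(enclosure)
                i += 2
                continue
            if ch == enclosure:
                in_enclosure = False
            else:
                buf.append(ch)
        elif ch == enclosure and not buf:
            in_enclosure = True  # enclosure opens only at the start of a field
        elif ch == field_sep:
            fields.append("".join(buf))
            buf = []
        elif ch == record_sep:
            fields.append("".join(buf))
            buf = []
            records.append(fields)
            fields = []
        else:
            buf.append(ch)
        i += 1
    if buf or fields:  # final record without a trailing terminator
        fields.append("".join(buf))
        records.append(fields)
    return records


sample = 'id,"name, full",note\n1,"Ann \\"Al\\" Lee",ok\n'
print(parse_records(sample))
```

Delimiters inside an enclosed field are treated as ordinary data, which is why the enclosed field `name, full` survives as one field.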
SWEC is a program that automates testing of dynamic Web sites. It parses each HTML file it finds for links, and if those links are within the specified site, it checks the linked pages as well. In addition to locating links, it scans each page for known errors and reports them. It also reports pages that cannot be read (for example, those that return a 404, 500, or similar error).
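The crawl-and-check loop described above can be sketched in Python as follows. This is an illustration, not SWEC's code: the `check_site` function, the injected `fetch` callable (which returns a status code and a body), and the two error markers are assumptions made so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_site(start_url, fetch, error_markers=("Traceback", "Fatal error")):
    """Visit every same-site page reachable from start_url and list problems."""
    site = urlparse(start_url).netloc
    queue, seen, problems = [start_url], {start_url}, []
    while queue:
        url = queue.pop(0)
        status, body = fetch(url)
        if status >= 400:
            problems.append((url, f"HTTP {status}"))
            continue
        for marker in error_markers:  # hypothetical "known error" strings
            if marker in body:
                problems.append((url, f"found '{marker}'"))
        collector = LinkCollector()
        collector.feed(body)
        for href in collector.links:
            target = urljoin(url, href).split("#")[0]
            # Follow only links that stay within the site being tested.
            if urlparse(target).netloc == site and target not in seen:
                seen.add(target)
                queue.append(target)
    return problems


pages = {
    "http://example.test/": (200, '<a href="/a">a</a> <a href="http://other.test/x">x</a>'),
    "http://example.test/a": (200, 'Fatal error: oops <a href="/missing">m</a>'),
}
print(check_site("http://example.test/", lambda url: pages.get(url, (404, ""))))
```

Passing `fetch` in as a parameter keeps the crawler logic separate from the HTTP client, so the same loop can be exercised against canned pages or a live site.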
libalinga-java is a Java native interface (JNI) to libalinga. It provides C++ and Java wrappers, as well as Java classes. It also provides the control files used to generate them from the libalinga interface with SWIG. The major and minor versions of this JNI track those of libalinga, but its bugfix releases are independent of libalinga's.
libalinga is a C++ implementation of a multi-stream codec for the ALingA (Aligned Linguistic Annotation) format. It makes use of libogg++. Each ALingA stream holds at least one stream of annotation data, which is in the LingA format. It may also interleave the signal stream(s) against which the LingA streams are aligned, or it may simply reference such streams. It also provides metadata about the underlying manifold for the signals and the annotations. The metadata is ordered for runtime parsing of the number and type of signal and LingA codecs to enable decoding of the multiple logical streams in one pass.
Augeas is a configuration API and editing tool. It parses common configuration files like /etc/hosts or /etc/grub.conf in their native formats and transforms them into a tree. Configuration changes are made by manipulating this tree and saving it back into native configuration files.
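The parse, edit, and save cycle described above can be sketched in Python for a hosts-style file. This does not use the Augeas API or its lens language; `parse_hosts`, `render_hosts`, and the node layout are illustrative assumptions, and the sketch normalizes whitespace and expects every entry to have an address and a hostname.

```python
def parse_hosts(text):
    """Turn hosts-file text into a list of nodes (a very shallow tree)."""
    tree = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            tree.append({"raw": line})  # keep comments and blank lines as-is
            continue
        ipaddr, canonical, *aliases = stripped.split()
        tree.append({"ipaddr": ipaddr, "canonical": canonical, "alias": aliases})
    return tree


def render_hosts(tree):
    """Write the nodes back out in the file's native format."""
    lines = []
    for node in tree:
        if "raw" in node:
            lines.append(node["raw"])
        else:
            lines.append(" ".join([node["ipaddr"], node["canonical"], *node["alias"]]))
    return "\n".join(lines) + "\n"


text = "# local\n127.0.0.1 localhost\n192.168.0.5 db.example.test db\n"
tree = parse_hosts(text)
tree[2]["alias"].append("database")  # the change is made on the tree
print(render_hosts(tree), end="")
```

The edit touches only the tree, and unchanged lines, including the comment, are written back exactly as they were read.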