Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It's written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is very suitable for working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Any user can develop his or her own translation rules, and these rules are simply stored in a text file and applied at runtime.
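The idea of user-written rules kept in a text file and applied at runtime can be sketched in a few lines of Python. This is only an illustration of the general technique, not Sanzang's code: the "source|target" line format, the longest-match-first strategy, and the names parse_rules and translate are all assumptions made for this example.

```python
def parse_rules(text):
    """Parse 'source|target' lines (an assumed format) into rule pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, target = line.split("|", 1)
        if source:  # an empty source would never consume any input
            rules.append((source, target))
    # Longest sources first, so multi-character terms win over their parts.
    rules.sort(key=lambda pair: len(pair[0]), reverse=True)
    return rules


def translate(text, rules):
    """Replace each rule's source with its target, scanning left to right."""
    out = []
    i = 0
    while i < len(text):
        for source, target in rules:
            if text.startswith(source, i):
                out.append(target)
                i += len(source)
                break
        else:
            # No rule matched here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

For instance, with the rules "般若|[prajna]", "心|[heart]", and "若|[if]", the two-character rule is tried first, so "般若" is translated as a unit rather than character by character.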
SiSU (Structured information, Serialized Units) is a lightweight-markup-based framework for structuring and publishing text, with support for granular search. From a plain-text file with minimal markup, it produces plain text, HTML, XHTML, XML, ODF, LaTeX, and PDF output, and populates an SQL database at the object/paragraph level for granular searches. Prepare documents using your text editor of choice, then use SiSU to generate the desired output formats. SiSU is controlled from the command line.
glark offers grep-like searching of text files, with very powerful, complex regular expressions (e.g., "/foo\w+/ and /bar[^\d]*baz$/ within 4 lines of each other"). It also highlights matches, displays context (preceding and succeeding lines), performs case-insensitive matching, and automatically excludes non-text files. It supports most options from the GNU version of grep.
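A proximity search like the one quoted above can be approximated as follows. This is a minimal sketch rather than glark's implementation: the function name within is hypothetical, and it assumes "within N lines" means the two matching lines are at most N line numbers apart.

```python
import re


def within(lines, pattern_a, pattern_b, distance):
    """Return (a_index, b_index) pairs of 0-based line indices where a line
    matching pattern_a and a line matching pattern_b are at most
    `distance` lines apart (an assumed reading of "within N lines")."""
    a = re.compile(pattern_a)
    b = re.compile(pattern_b)
    hits_a = [i for i, line in enumerate(lines) if a.search(line)]
    hits_b = [i for i, line in enumerate(lines) if b.search(line)]
    return [(i, j) for i in hits_a for j in hits_b if abs(i - j) <= distance]
```

Calling within(lines, r"foo\w+", r"bar[^\d]*baz$", 4) on a list of lines returns the index pairs that satisfy both expressions; an empty list means no two matches are close enough.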
acoc is a regular-expression based colour formatter for programs that display output on the command line. It works as a wrapper around the target program, executing it and capturing the stdout stream. Optionally, stderr can be redirected to stdout, so that it, too, can be manipulated. acoc then matches the patterns in its rules against the output and applies colours to the matches.
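The wrapper approach described here (run the program, capture its output, colour whatever the rules match) can be sketched in Python. This is not acoc's code or rule syntax: the (pattern, colour) rule tuples, the function names, and the choice to let earlier rules win over overlapping later ones are assumptions made for illustration.

```python
import re
import subprocess

RED = "\033[31m"    # ANSI escape sequences for foreground colours
GREEN = "\033[32m"
RESET = "\033[0m"


def colourise(line, rules):
    """Wrap each rule match in its colour; earlier rules win on overlap."""
    spans = []  # (start, end, colour), kept non-overlapping
    for pattern, colour in rules:
        for m in re.finditer(pattern, line):
            if m.start() == m.end():
                continue  # ignore empty matches
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), colour))
    out, pos = [], 0
    for start, end, colour in sorted(spans):
        out.append(line[pos:start])
        out.append(colour + line[start:end] + RESET)
        pos = end
    out.append(line[pos:])
    return "".join(out)


def run_coloured(argv, rules, merge_stderr=False):
    """Run argv, capture stdout (optionally stderr too), print it coloured."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT if merge_stderr else None,
        text=True,
    )
    for line in proc.stdout.splitlines():
        print(colourise(line, rules))
    return proc.returncode
```

Matches are collected against the original line before any escape sequences are inserted, so a later rule cannot accidentally match the digits inside a colour code added by an earlier one. A call such as run_coloured(["make"], [(r"error", RED), (r"\d+", GREEN)], merge_stderr=True) would colour both streams of the wrapped command.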
ZenWeb is a system for building entire Web sites, not just pages. It allows you to focus on the content and the structure of the Web site, while leaving page construction, markup, layout, and navigation as secondary concerns. It provides tools for complete Web site design and creation, simple paragraph-to-HTML generation with embellishments, and a rich set of tools for page and Web site creation, modification, and customization.
newfile generates "starting-out" files using a full-featured template preprocessor. It can also generate trees of files, for example, a FreeBSD port or a project using automake and autoconf. Users can add their own template files and directories to those supplied with the package. It includes templates for making "empty" files for Ruby, make, shell, C, C++, C and C++ headers, and more.