C-ICAP Classify is a module that classifies (labels) Web pages, images, and soon video based on their content. Labels are placed in HTTP headers, and any PICS-Label META tags are also exported into HTTP headers. This allows very flexible filters to be built from user-defined rules using the ACLs of the ICAP-enabled proxy. Because it is not a URL filter, deploying it with sslBump or similar proxy technologies makes it very difficult to bypass. Text classification is done using Fast Hyperspace (based on Hyperspace from CRM114) and/or Fast Naive Bayes. Image and video classification (the latter when implemented) uses Haar feature detection from the OpenCV library.
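As a rough illustration of the naive Bayes idea mentioned above, the following sketch trains a tiny word-count model and turns the winning label into a header pair. It is not the module's own code: the function names, the whitespace tokenizer, and the `X-Content-Label` header name are all assumptions made for this example.

```python
import math
from collections import Counter


def train(samples):
    """Build per-label word counts and document counts from (label, text) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for label, text in samples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability for the text."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from the log prior of the label.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def label_header(text, word_counts, doc_counts):
    """Return a (name, value) header pair; the header name is hypothetical."""
    return ("X-Content-Label", classify(text, word_counts, doc_counts))
```

A proxy ACL could then match on the header value, which is the kind of rule-based filtering the description refers to.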
Lazygal is another static Web gallery generator. It is command-line based, uses a reusable engine, and is lazy, meaning that it regenerates only the parts that have to be regenerated. It supports many interesting features such as subgalleries, EXIF information, theming, and custom folder metadata. The included themes are pure XHTML and CSS.
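One common way to implement this kind of lazy regeneration is to compare modification times of a source file and its generated output. The sketch below shows that idea only; whether Lazygal uses exactly this check is an assumption, and `needs_rebuild` is a hypothetical helper name.

```python
import os


def needs_rebuild(source_path, target_path):
    """Return True when the target is missing or older than its source."""
    if not os.path.exists(target_path):
        # Nothing has been generated yet, so the output must be built.
        return True
    # Rebuild only when the source changed after the output was written.
    return os.path.getmtime(source_path) > os.path.getmtime(target_path)
```

A generator would call this for each photo and thumbnail pair and skip the ones that return False.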
wview is an application that controls a supported weather station to retrieve archive records and current conditions. Archive records may optionally be stored in a relational database (MySQL or PostgreSQL). At a user-defined interval, wview will use the archive history and current conditions to generate weather images (buckets and graphs) and Web pages based on configurable HTML templates. It supports serial and USB data loggers, as well as connectivity with a terminal server or serial server via TCP sockets.
Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.
GNU Wget is a utility for noninteractive download of files from the Web. It supports the HTTP and FTP protocols, as well as retrieval through HTTP proxies. It can follow HTML links, download many pages, and convert the links for local viewing. It can also mirror FTP hierarchies, or fetch only those files that have changed. Wget has been designed for robustness over slow network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved.
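The retry behaviour described above can be sketched as a loop that resumes from the number of bytes already received. This is an illustration of the general technique, not Wget's code: `download_with_retries`, the `fetch(offset)` callback, and the `max_tries` cap are hypothetical names, and the cap itself is an assumption added so the loop cannot run forever.

```python
def download_with_retries(fetch, max_tries=20):
    """Collect a file by calling fetch(offset) until it reports completion.

    fetch(offset) must return (chunk, done) or raise ConnectionError
    when the simulated network fails.
    """
    data = bytearray()
    failures = 0
    while True:
        try:
            # Resume from the bytes already received instead of restarting.
            chunk, done = fetch(len(data))
        except ConnectionError:
            failures += 1
            if failures >= max_tries:
                raise
            continue
        data.extend(chunk)
        if done:
            return bytes(data)
```

Passing the current length as the offset is what lets a failed transfer continue where it stopped instead of starting over.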