webStraktor is a programmable World Wide Web data extraction client. It features a scripting language to facilitate the collection, extraction, and storage of information available on the Web, including images. The scripting language uses elements of regular expression and XPath syntax. The standard webStraktor output format is XML-based and can be encoded in ASCII, UTF-8, or ISO-8859-1 (Latin1). It adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through proxy servers. Exhaustive logging and tracing information is provided.
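To illustrate what adhering to the Robots Exclusion Protocol involves, here is a minimal sketch using Python's standard urllib.robotparser module. It is not webStraktor's implementation or scripting language; the robots.txt content, the agent name "example-extractor", and the example.com URLs are assumptions made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real client would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_robots_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a checker without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


checker = build_robots_checker(ROBOTS_TXT)

# "example-extractor" is an assumed user-agent name, not one used by webStraktor.
allowed = checker.can_fetch("example-extractor", "https://example.com/index.html")
blocked = checker.can_fetch("example-extractor", "https://example.com/private/data.html")

print(allowed, blocked)
```

A polite extraction client would run a check like this before each request and skip any URL for which can_fetch returns False.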
Fileevent is a rules-based utility that matches files using simple patterns and macros and performs actions on them. These actions are typically used to transfer or rename files in preparation for further processing. The utility is particularly useful in batch processing environments where files to load or process might arrive on an ad hoc basis. Fileevent allows them to be transferred elsewhere, retrieved from elsewhere, or renamed.
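The general idea of rules that pair a file pattern with an action can be sketched as follows. This is not Fileevent's rule syntax or code; the Rule class, the plan_actions function, and the {name} and {date} macros are hypothetical names chosen for the illustration, and the sketch only plans actions rather than touching the file system.

```python
from dataclasses import dataclass
from datetime import date
from fnmatch import fnmatchcase


@dataclass
class Rule:
    pattern: str  # glob-style pattern, e.g. "*.csv"
    action: str   # label for the action, e.g. "move" or "rename"
    target: str   # template using the assumed macros {name} and {date}


def plan_actions(filenames, rules, today):
    """Return (filename, action, target) for the first rule matching each file."""
    plans = []
    for name in filenames:
        for rule in rules:
            if fnmatchcase(name, rule.pattern):
                target = rule.target.format(name=name, date=today.strftime("%Y%m%d"))
                plans.append((name, rule.action, target))
                break  # first matching rule wins; unmatched files are ignored
    return plans


rules = [
    Rule("*.csv", "move", "incoming/{date}/{name}"),
    Rule("report_*", "rename", "{name}.done"),
]
plans = plan_actions(
    ["sales.csv", "report_q1", "notes.txt"], rules, date(2024, 1, 15)
)
for plan in plans:
    print(plan)
```

Separating the planning step from the execution step keeps the matching logic easy to test; a real tool would then carry out each planned action, for example with shutil.move.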
BaseX is a lightweight, high-performance, and scalable XML database system and XPath/XQuery processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into large XML data instances. It is platform independent and works out of the box.
LanguageTool is a style and grammar checker that currently supports English, Polish, German, French, Dutch, and other languages to varying degrees. It scans the words and their part-of-speech tags for occurrences of error patterns, which are defined in an XML file. More powerful error rules can be written in Java.
GroupServer is a Web-based mailing list manager designed for large sites. It provides email interaction like a traditional mailing list manager but also supports reading, searching, and posting of messages and files via the Web. Users have forum-style profiles and can manage their email addresses and other settings using the same Web interface. It supports features such as Atom feeds, a basic CMS, statistics, multiple verified addresses per user, and bounce detection, and can be heavily customized.
Xidel is a command line tool to download Web pages and extract data from them. It can download files over HTTP/S connections, follow redirections, links, or extracted values, and process local files. The data can be extracted using XPath 2.0, XQuery 1.0, and JSONiq expressions, CSS 3 selectors, and custom, pattern-matching templates that are like an annotated version of the processed page. The extracted values can then be exported as plain text/XML/HTML/JSON, or assigned to variables to be used in other extract expressions or be exported to the shell. There is also an online CGI service for testing.
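The extract-then-export workflow described above can be approximated with Python's standard library. This is not Xidel and does not use its command line options; xml.etree.ElementTree supports only a limited XPath subset (not XPath 2.0, XQuery, or JSONiq), and the PAGE markup and the extract_links function are assumptions made up for the illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page; real-world HTML usually needs a lenient parser.
PAGE = """<html><body>
<ul>
<li class="item"><a href="/a">First</a></li>
<li class="item"><a href="/b">Second</a></li>
<li class="other"><a href="/c">Third</a></li>
</ul>
</body></html>"""


def extract_links(markup):
    """Select links inside list items of class "item" via an XPath-style query."""
    root = ET.fromstring(markup)
    anchors = root.findall(".//li[@class='item']/a")
    return [{"text": a.text, "href": a.get("href")} for a in anchors]


links = extract_links(PAGE)

# Export the extracted values as JSON, one of the output formats named above.
print(json.dumps(links, indent=2))
```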
Collax Business Server is an all-in-one Linux server for small- and medium-sized businesses. It delivers all the important network services within a heterogeneous business environment for communication, infrastructure, compliance, groupware, and storage, all in a reliable and secure way which is easy to manage. It also provides essential security functions such as firewalling and virus and spam filtering, to protect against hacker attacks, viruses, and unsolicited email messages.
Collax Groupware Suite is a complete collaboration, email, and messaging server with Outlook MAPI support. It offers enterprise email server functions, anti-spam and anti-virus filters, GUI management, a file server for SMB, NFS, FTP, and Apple shares, a backup/restore server, an IM server, and a fax and SMS server. The groupware offers AJAX Web mail, calendar, team calendar, contacts, and tasks, and supports ActiveSync for mobile devices. It is free for private or commercial use by up to five users.