Nutch is highly scalable Web search software built on top of Apache Hadoop and Lucene Java. Key features include a Web crawler, an indexer, crawl management tools, parsers for HTML, PDF, DOC, and several other document formats, and an extensible architecture that lets you plug in additional functionality such as document parsers, custom scoring algorithms, protocols, and more.
Apache Rivet is a system for creating dynamic Web content via a programming language integrated with Apache Web Server. It is designed to be fast, powerful, and extensible, to consume few system resources, to be easy to learn, and to provide the user with a platform that can also be used for programming tasks outside the Web (GUIs, system administration, text processing, database manipulation, XML, and so on). It is similar to PHP, except that it uses Tcl, and it provides both HTML/Tcl pages and pure Tcl pages to help the programmer separate logic from presentation when necessary.
Synapse is an ESB engine and XML router built completely on open standards. It is a mediation framework for XML messages and Web services that allows messages flowing through, into, or out of an organization to be mediated, including aspects such as logging, service lookup, performance mediation, versioning, failover, monitoring, fault management, and tracing.
ApacheDS is an LDAP and X.500 experimentation platform. Its backend subsystem and frontend are separable and independently embeddable. It provides a server-side JNDI LDAP provider that interacts directly with the backend storage. It is powered by SEDA (Staged Event-Driven Architecture), which can handle high levels of concurrency.
Aranea is a hierarchical Model-View-Controller Web framework that provides a common, simple approach to building Web application components, reusing custom or general GUI logic, and extending the framework. The framework enforces programming with object-oriented techniques and POJOs, and provides a JSP tag library that facilitates the programming of Web GUIs without writing HTML. In addition to being a full-fledged Web framework in its own right, it provides a powerful and simple component system that allows the framework to be tailored by configuring the reusable modules and adding modules only for the missing features.
AtMail is a webmail client. The project aims to provide an elegant client for existing IMAP mail servers, with less bloat and a focus on an intuitive, simple user interface. Features include complete webmail functionality, address book support, video mail, an AJAX interface, drag-and-drop, and more.
Auth MemCookie is an Apache v2 authentication and authorization module based on a cookie authentication mechanism. The module doesn't perform authentication itself; instead, for each URL it protects, it verifies whether the cookie used for authentication is valid. The module also checks whether the authenticated user is authorized to access the URL. Authentication is done externally through an authentication form page, and all authentication information the module needs is stored in memcached.
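The cookie check described above can be illustrated with a minimal Python sketch. This is not the module's actual code or configuration: the names Session, check_request, and required_group are hypothetical, the session layout is assumed, and a plain dictionary stands in for memcached.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session record written by the external login page."""
    user: str
    groups: set = field(default_factory=set)


def check_request(cookie_value, required_group, store):
    """Return an HTTP-like status for one request to a protected URL.

    `store` stands in for memcached: any mapping from cookie value to Session.
    """
    # Look up the session keyed by the cookie; a missing cookie never matches.
    session = store.get(cookie_value) if cookie_value else None
    if session is None:
        return 401  # missing or unknown cookie: not authenticated
    if required_group is not None and required_group not in session.groups:
        return 403  # authenticated, but not authorized for this URL
    return 200


# Assumption: the login form, which lives outside the module, stores a session
# under the cookie value after a successful login.
store = {"abc123": Session(user="alice", groups={"staff"})}
```

With this store, a request carrying the cookie abc123 passes a check for the staff group, is refused for a group the session lacks, and any request without a known cookie is treated as unauthenticated.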