Nutch is highly scalable Web search software built on top of Apache Hadoop and Lucene Java. Key features include a Web crawler, an indexer, crawl management tools, parsers for HTML, PDF, DOC, and several other document formats, and an extensible architecture that lets you plug in additional functionality such as document parsers, custom scoring algorithms, protocols, and more.
The mission of the Apache Portable Runtime (APR) project is to create and maintain software libraries that provide a predictable and consistent interface to underlying platform-specific implementations. The primary goal is to provide an API to which software developers may code and be assured of predictable if not identical behaviour regardless of the platform on which their software is built, relieving them of the need to code special-case conditions to work around or take advantage of platform-specific deficiencies or features.
Apache Qpid is a messaging broker that implements the latest AMQP specification, providing transaction management, queuing, distribution, security, management, clustering, federation, heterogeneous multi-platform support, and much more. It is extremely fast and aims to be 100% AMQP compliant.
Daisy is an enterprise content management solution, bridging the gap between classic Web site content management and the Wiki style of information management and discovery. It is ideally suited for intranet knowledge bases, product and/or project documentation, and management of content-rich Web sites. It consists of a repository server with powerful querying and versioning capabilities, and a Wiki-like front-end Web user interface with in-browser rich-text authoring.