Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.
Apache PhotArk is a photo gallery application including a content repository for the images, a display piece, an access control layer, and upload capabilities. The idea is to have a rigid design for the content repository with a very flexible display piece. The images in the content repository will be protected with granular access control.
The mission of the Apache Portable Runtime (APR) project is to create and maintain software libraries that provide a predictable and consistent interface to underlying platform-specific implementations. The primary goal is to provide an API to which software developers may code and be assured of predictable if not identical behaviour regardless of the platform on which their software is built, relieving them of the need to code special-case conditions to work around or take advantage of platform-specific deficiencies or features.
Apache Qpid is a messaging broker that implements the latest AMQP specification, providing transaction management, queuing, distribution, security, management, clustering, federation, and heterogeneous multi-platform support. It is designed for high throughput and aims to be 100% AMQP compliant.
Apache Rivet is a system for creating dynamic Web content via a programming language integrated with the Apache Web Server. It is designed to be fast, powerful, and extensible; to consume few system resources; to be easy to learn; and to provide the user with a platform that can also be used for other programming tasks outside the web (GUIs, system administration tasks, text processing, database manipulation, XML, etc.). It is similar to PHP, except that it uses Tcl, and it provides both HTML/Tcl pages and pure Tcl pages to help the programmer separate logic and presentation when necessary.
Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
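As a rough illustration of the REST-like HTTP API mentioned above, the following Python sketch builds a query URL for a Solr search request. The host, port, and the core name "demo" are assumptions made for the example, as is the helper name build_select_url; the "select" handler and the q, rows, and wt parameters are standard Solr query parameters, but the exact path depends on the Solr version and deployment.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10, fmt="json"):
    """Build a URL for a Solr search request (hypothetical helper).

    base_url and core are deployment-specific assumptions; older
    single-core installations may not have a core segment in the path.
    """
    params = {
        "q": query,    # the search query, e.g. "title:apache"
        "rows": rows,  # maximum number of results to return
        "wt": fmt,     # response writer: "json" or "xml"
    }
    # urlencode percent-escapes the values so the query is URL-safe.
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local Solr instance with a core named "demo".
    print(build_select_url("http://localhost:8983/solr", "demo", "title:apache", rows=5))
```

The resulting URL can be fetched with any HTTP client, which is what makes the API usable from virtually any programming language.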
Apache SpamAssassin is an extensible email filter that is used to identify spam. Once identified, the mail can then be optionally tagged as spam for later filtering. It provides a command line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules allowing Apache SpamAssassin to be used in a wide variety of email systems.
Synapse is an ESB engine and XML router built completely on open standards. It is a mediation framework for XML messages and Web services that allows messages flowing through, into, or out of an organization to be mediated, including aspects such as logging, service lookup, performance mediation, versioning, failover, monitoring, fault management, and tracing.
Apache Traffic Server (TS or ATS) is a modular, high-performance reverse proxy server, generally comparable to Squid. It was created by Inktomi and distributed as a commercial product called the Inktomi Traffic Server before Yahoo! acquired Inktomi. Traffic Server has been actively used inside Yahoo! for over four years, serving billions of requests every day. It became an open source project in fall 2009, and in April 2010 Apache Traffic Server was promoted to a top-level project of the ASF.
UIMA SDK is a software architecture and framework for supporting the development, integration, and deployment of search and analysis technologies. It can be used to analyze large volumes of unstructured information (text, audio, video, images, etc.) to discover, organize, and deliver relevant knowledge to the client or application end user.