Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
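As a rough illustration of the HTTP API mentioned above, the sketch below only builds a query URL for Solr's standard select handler and does not contact a server. The host, the core name `mycore`, and the helper `build_select_url` are assumptions made for this example; `q`, `rows`, and `wt` are standard Solr query parameters, and the `/solr/<core>/select` layout may differ depending on the Solr version and deployment.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10, fmt="json"):
    """Build a URL for Solr's select handler (hypothetical helper)."""
    # q = query string, rows = max results returned, wt = response format
    params = {"q": query, "rows": rows, "wt": fmt}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local install and core name; adjust for a real deployment.
url = build_select_url("http://localhost:8983/solr", "mycore", "title:lucene")
print(url)
```

Because the result is an ordinary HTTP GET URL, any HTTP client in any language can issue the same request.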
Apache SpamAssassin is an extensible email filter used to identify spam. Once identified, the mail can optionally be tagged as spam for later filtering. It provides a command-line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules that allow Apache SpamAssassin to be used in a wide variety of email systems.
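The "tag now, filter later" idea can be sketched as a downstream check on a message that has already passed through SpamAssassin. This minimal example assumes the filter is configured to add its usual `X-Spam-Flag: YES` header to messages it classifies as spam; the helper `is_tagged_spam` and the sample message are made up for illustration, and a real setup may use different headers.

```python
from email import message_from_string


def is_tagged_spam(raw_message: str) -> bool:
    """Return True if the message carries an X-Spam-Flag: YES header."""
    msg = message_from_string(raw_message)
    # A missing header is treated as "not tagged".
    return msg.get("X-Spam-Flag", "").strip().upper() == "YES"


# Hypothetical message as it might look after tagging.
sample = (
    "From: sender@example.com\n"
    "To: user@example.com\n"
    "Subject: Cheap watches\n"
    "X-Spam-Flag: YES\n"
    "\n"
    "Buy now!\n"
)

print(is_tagged_spam(sample))
```

A delivery agent or mail client rule could use a check like this to move tagged messages into a separate folder.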
Synapse is an ESB engine and XML router built completely on open standards. It is a mediation framework for XML messages and Web services that allows messages flowing through, into, or out of an organization to be mediated, including aspects such as logging, service lookup, performance mediation, versioning, failover, monitoring, fault management, and tracing.
The Apache Traffic Server (TS or ATS) is a modular, high-performance reverse proxy server, generally comparable to Squid. It was created by Inktomi and distributed as a commercial product called the Inktomi Traffic Server before Yahoo! acquired Inktomi. Traffic Server has been actively used inside Yahoo! for over four years, serving billions of requests every day. As of fall 2009, Traffic Server is an open source project, and in April 2010 it was promoted to a top-level project of the ASF.
UIMA SDK is a software architecture and framework for supporting the development, integration, and deployment of search and analysis technologies. It can be used to analyze large volumes of unstructured information (text, audio, video, images, etc.) to discover, organize, and deliver relevant knowledge to the client or application end user.
Apache UIMA DUCC (Distributed UIMA Cluster Computing) is a cluster management system providing tooling, management, and scheduling facilities that automate the scale-out of applications written using the UIMA framework. Core UIMA provides a generalized framework for applications that process unstructured information such as human language, but does not provide a scale-out mechanism. UIMA-AS extends UIMA with a scale-out mechanism for distributing UIMA pipelines over a cluster of computing resources, but does not provide job or cluster management of those resources. DUCC extends UIMA-AS by defining a formal job model that closely maps to a standard UIMA pipeline. Around this job model, DUCC provides cluster management services that automate the scale-out of UIMA pipelines over computing clusters.
Apache XML Graphics Commons is a library that consists of several reusable components used by Apache Batik and Apache FOP. Many of these components can easily be used separately outside the domains of SVG and XSL-FO. You will find components such as a PDF library, an RTF library, Graphics2D implementations that let you generate PDF and PostScript files, and much more.
Apache uimaFIT provides Java annotations that describe UIMA components directly in Java code, without the need for traditional UIMA XML descriptors. This greatly simplifies refactoring a component definition (e.g., changing a configuration parameter name). It also provides convenient factory methods that make it easy to instantiate UIMA components without XML descriptor files. This makes it ideal for testing UIMA components, because a component can be instantiated and invoked without first creating a descriptor file.
ApacheDS is an LDAP and X.500 experimentation platform. Its backend subsystem and frontend are separable and independently embeddable. It provides a server-side JNDI LDAP provider that interacts directly with the backend storage. It is powered by SEDA (Staged Event-Driven Architecture), which is designed to handle high levels of concurrency.