PDFTextStream is a PDF text and metadata extraction library available for Java and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8, 9, and X), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of documents encrypted using 40-bit, 128-bit, 256-bit, and variable bit length ciphers, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.
hiberlite provides C++ object-relational mapping for SQLite 3. Its design and API are inspired by Boost.Serialization, which means there is almost no API to learn. In contrast to most serialization libraries with SQL serializers, C++ objects mapped with hiberlite behave similarly to the active record pattern: you are not forced to follow the "read all your data/modify some small part/write everything back" path. It is for people who need reliable data storage, ACID transactions, and simple random access to their data files, and don't like coding in SQL.
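The contrast with the "read everything/write everything back" path can be illustrated without hiberlite itself. The following is a minimal sketch in Python using the standard sqlite3 module, not hiberlite's C++ API; the `person` table, its columns, and the `bump_age` helper are hypothetical. It shows the kind of random access the description refers to: one record is loaded, changed, and stored back inside a transaction, while the other rows are never read.

```python
import sqlite3

# Hypothetical schema for illustration only; this is not how hiberlite lays out its tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person (name, age) VALUES (?, ?)",
    [("Ann", 30), ("Bob", 41), ("Cid", 25)],
)
conn.commit()


def bump_age(conn, person_id):
    """Load one row, modify it, and store it back in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT age FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise KeyError(person_id)
        new_age = row[0] + 1
        conn.execute(
            "UPDATE person SET age = ? WHERE id = ?", (new_age, person_id)
        )
    return new_age


bump_age(conn, 2)  # touches only the row with id 2
ages = [r[0] for r in conn.execute("SELECT age FROM person ORDER BY id")]
```

An object-relational mapper hides the SQL shown here behind ordinary object access, which is the convenience the description promises to people who would rather not write SQL by hand.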
Berkeley DB XML is a native XML database engine for use within your product. Made available as a C++ library with language bindings for Java, Perl, Python, PHP, and Tcl, it integrates directly into your application (it is not a standalone database server). It provides XQuery access into a database of document containers. XML documents are stored and indexed in their native format using Berkeley DB as the transactional database engine.
Monotone::AutomateStdio is a Perl library module for accessing Monotone's automate stdio interface. Monotone is a distributed, change-set based SCM system. It has a mode where commands can be sent to it via STDIN and output read from it via STDOUT. Monotone::AutomateStdio makes use of this facility to provide the Perl programmer with a programmatic interface to Monotone.
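To give an idea of the framing such a wrapper handles, here is a minimal Python sketch that encodes a command for the automate stdio input stream. It is not part of Monotone::AutomateStdio (which is a Perl module) and does not reflect its API. It assumes the length-prefixed input format described in the Monotone manual, in which a command is written as `l`, followed by one `<size>:<string>` item per word, followed by `e`. The function name `encode_command` is hypothetical, and decoding of the output stream is left out.

```python
def encode_command(*words):
    """Encode a command and its arguments as one stdio command packet.

    Assumed format: b'l' + b'<byte length>:<bytes>' for each word + b'e'.
    """
    if not words:
        raise ValueError("a command needs at least one word")
    parts = [b"l"]
    for word in words:
        data = word.encode("utf-8")
        # The size prefix counts bytes, not characters.
        parts.append(str(len(data)).encode("ascii") + b":" + data)
    parts.append(b"e")
    return b"".join(parts)


packet = encode_command("leaves")
```

A wrapper library would write such a packet to the STDIN of a running `mtn automate stdio` process and then parse the reply from its STDOUT, sparing the caller from dealing with the framing directly.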
JavaGit is a Java API that provides access to git repositories. The goal is to bring the power of git to the Java world as an API that is intuitive for developers new to git and developers who are veteran git users. It is engineered to provide the developer with access to the raw git commands through a command API as well as an object API designed to represent the .git repository, the working tree, and other familiar git concepts. JavaGit uses the git binaries installed on the host machine to provide git functionality, and has been designed to easily accommodate additional methods of access to git repositories.
PottyMouth transforms completely unstructured and untrusted text to valid, nice-looking, completely safe XHTML. PottyMouth is designed to handle input text from non-technical, potentially careless, or malicious users. It produces HTML that is completely safe, programmatically and visually, to include on any Web page. You don't need to make your users read any instructions before they start typing. They don't even need to know that PottyMouth is being used.
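The underlying idea, that no markup typed by a user reaches the page unescaped, can be sketched with Python's standard library alone. This is not PottyMouth's implementation or its API, neither of which is shown here; the function name `text_to_safe_html` is hypothetical. The sketch escapes every character that has special meaning in HTML and then wraps blank-line-separated blocks in paragraph tags.

```python
import html
import re


def text_to_safe_html(text):
    """Escape untrusted text and wrap each blank-line-separated block in <p>."""
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        block = block.strip()
        if not block:
            continue
        # Escapes &, <, > and, with quote=True, both quote characters.
        escaped = html.escape(block, quote=True)
        # Single newlines inside a block become explicit line breaks.
        paragraphs.append("<p>" + escaped.replace("\n", "<br />\n") + "</p>")
    return "\n".join(paragraphs)


result = text_to_safe_html("Hello <script>alert(1)</script>\n\nSecond & last")
```

Because escaping happens before any tags are added, the only markup in the output is markup the function itself generated.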
Nuxeo Core is an embeddable document management core, based on Nuxeo Runtime. It provides all the necessary low-level services to define, store, manage, audit, request, and search content. It is the "kernel" of Nuxeo 5 and can also be embedded in third-party applications to provide advanced content management features. It can run on any Java 5 platform, can be easily extended using plug-ins, and implements JCA (Java EE Connector Architecture) to be easily plugged into existing applications or information systems.