Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems being faced. Written in C++, the framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of the goals of Tulip is to facilitate the reuse of components, and it allows developers to focus on programming their application. This development pipeline makes the framework efficient for research prototyping as well as the development of end-user applications. The framework also provides a complete software for visual analysis of relational data having attributes.
PDFTextStream is a PDF text and metadata extraction library available for Java and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8, 9, and X), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of documents encrypted using 40-bit, 128-bit, 256-bit, and variable bit length ciphers, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.
Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It's written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/Powerpoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
QOF (Query Object Framework), provides a set of C Language utilities for performing generic structured complex queries on a set of data held by a set of C/C++ objects. This framework is unique in that it does not require SQL or any database at all to perform the query. Thus, it allows programmers to add query support to their applications without having to hook into an SQL database.
Search::Xapian is a Perl XS frontend to the Xapian C++ search library. It is a fairly complete wrapper: most features of the Xapian library are made available for use from Perl. Xapian is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. It's fast and scalable to hundreds of millions of documents.
Berkeley DB XML is a native XML database engine for use within your product. Made available as a C++ library with language bindings for Java, Perl, Python, PHP, and Tcl, it integrates directly into your application (it is not a standalone database server). It provides XQuery access into a database of document containers. XML documents are stored and indexed in their native format using Berkeley DB as the transactional database engine.
ExifTool is a platform-independent Perl library plus a command line application for reading, writing, and editing meta-information in image, audio, and video files. It supports many different types of metadata including EXIF, GPS, IPTC, XMP, JFIF, GeoTIFF, ICC Profile, Photoshop IRB, FlashPix, AFCP, and ID3, as well as the maker notes of many digital cameras.
K.E.T.T.L.E (Kettle ETTL Environment) is a meta- data driven ETL (Extraction, Transformation, Transportation, and Loading) tool. This means that no code has to be written to perform complex data transformations. It is possible to create plugins to do custom transformations or access proprietary data sources. Kettle supports most databases on the market, and has native support for slowly changing dimensions on most database platforms.