66 projects tagged "Information Management"
The eobjects.org MetaModel is a project that provides a reusable, SQL-compliant domain model for databases. The model contains classes that represent the structure of a database (schemas, tables, columns, relationships) and interaction with the database (queries) in a SQL/LINQ-like way. In short, it is a model for modelling data in databases and other datastores. With MetaModel you can query different datastores, such as databases, CSV files, Excel spreadsheets, MS Access files, and XML files, using the same approach and the same domain model.
PDFTextStream is a PDF text and metadata extraction library available for Java and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8, 9, and X), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of documents encrypted using 40-bit, 128-bit, 256-bit, and variable bit length ciphers, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.
cpdetector is a small yet clever framework for codepage detection that integrates different strategies. It can be used as a library by third-party software that accesses textual data over a network. It also includes a best-practice implementation in the form of a command-line tool that sorts and transforms large collections of documents based on their codepage. Available strategies include jchardet (exclusion, frequency analysis, and guessing), detection of the HTML charset property, and detection of the XML encoding declaration.
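To make the last strategy concrete, here is a minimal sketch of reading the encoding named in an XML declaration. It is not cpdetector's API: the class `XmlEncodingSniffer` and its `detect` method are hypothetical names chosen for illustration. The sketch assumes an ASCII-compatible encoding and no byte order mark, so UTF-16 and UTF-32 input is not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of cpdetector: reads the encoding named in an XML declaration. */
public class XmlEncodingSniffer {

    // Matches <?xml ... encoding="NAME" at the very start of the input.
    private static final Pattern DECLARATION = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the declared encoding name, or empty if no declaration is found. */
    public static Optional<String> detect(byte[] head) {
        // Assumption: the declaration is written in ASCII-compatible bytes.
        // Decoding as ISO-8859-1 never fails, so it is safe for peeking at the prefix.
        int length = Math.min(head.length, 256);
        String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = DECLARATION.matcher(prefix);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(doc).orElse("unknown"));
    }
}
```

A declared encoding is only a claim made by the document, which is presumably why a framework like this combines it with other strategies such as frequency analysis.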
StelsCSV is a JDBC driver for executing SQL statements and other JDBC operations on text files (comma-separated, tab-separated, fixed-length, etc.). Using this driver, users can easily create a simple database consisting of plain text files. The driver can be used for writing data import programs and migration tools. It supports most keywords of ANSI SQL92, table joins, INSERT, UPDATE, and DELETE statements, data types, and aggregate, conversion, string, and user-defined SQL functions.
Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.
The Java Application Monitor (JAMon) is a free, simple, high-performance, thread-safe Java API that allows developers to easily monitor production applications. JAMon can be used to determine application performance bottlenecks, user/application interactions, and application scalability. JAMon gathers summary statistics such as hits, execution times (total, average, minimum, maximum, standard deviation), and simultaneous application requests. JAMon statistics are displayed in the sortable JAMon report.
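The summary statistics listed above can be illustrated with a small, self-contained accumulator. This is a sketch only and not JAMon's API: the class `TimingStats` and its methods are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption rather than a statement about how JAMon computes it.

```java
/** Hypothetical accumulator, not JAMon's API: tracks hits, total, average, min, max, and standard deviation. */
public class TimingStats {
    private long hits;
    private double total;
    private double sumOfSquares;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    // synchronized keeps the counters consistent when several threads report timings.
    public synchronized void add(double millis) {
        hits++;
        total += millis;
        sumOfSquares += millis * millis;
        min = Math.min(min, millis);
        max = Math.max(max, millis);
    }

    public synchronized long hits() { return hits; }
    public synchronized double total() { return total; }
    public synchronized double average() { return hits == 0 ? 0.0 : total / hits; }
    public synchronized double min() { return hits == 0 ? 0.0 : min; }
    public synchronized double max() { return hits == 0 ? 0.0 : max; }

    // Sample standard deviation (assumption); returns 0 when fewer than two values exist.
    public synchronized double stdDev() {
        if (hits < 2) return 0.0;
        double mean = total / hits;
        double variance = (sumOfSquares - hits * mean * mean) / (hits - 1);
        return Math.sqrt(Math.max(variance, 0.0));
    }

    public static void main(String[] args) {
        TimingStats stats = new TimingStats();
        for (double t : new double[] {10, 20, 30}) {
            stats.add(t);
        }
        System.out.println("hits=" + stats.hits() + " avg=" + stats.average()
                + " min=" + stats.min() + " max=" + stats.max() + " sd=" + stats.stdDev());
    }
}
```

Keeping only running sums means memory use stays constant no matter how many timings are recorded, which suits long-running production monitoring.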
EntityFS is an object-oriented file system API for Java. It has a rich set of powerful file and directory manipulation tools that make it much easier to work with file system entities from Java. The file and directory interfaces are implementation-independent, and there is support for building file systems on disk, in memory, or in Zip or Jar files. File systems can also be configured to support capabilities such as file data compression or metadata.
Berkeley DB XML is a native XML database engine for use within your product. Made available as a C++ library with language bindings for Java, Perl, Python, PHP, and Tcl, it integrates directly into your application (it is not a standalone database server). It provides XQuery access into a database of document containers. XML documents are stored and indexed in their native format using Berkeley DB as the transactional database engine.
[fleXive] is a Java EE 5 content repository aiming to support upcoming industry standards like CMIS. It strives to provide a holistic approach by offering a comprehensive set of tools and building blocks for building content-centric Web applications around a [fleXive] content repository. It speeds up development by easing many tedious and repetitive programming tasks and helping to keep your application(s) flexible during the development cycle and in production. It concentrates on enterprise-scale content modeling, storage, and retrieval, and includes comprehensive JSF support for displaying and manipulating this content in (Web) applications. Key features include persistence, security, versioning, multi-language support, and scripting.
[fleXive] CMS is a Java EE content management system based on JavaServer Faces 1.2. It combines the power of JSF XHTML templating with that of the Java EE 5 content repository, [fleXive]. Some highlights include dynamic JSF templating (Facelets), easy integration of custom logic with EJB or JSF beans, a modular structure, Maven support, generic data structures, and WebDAV and CMIS support. It incorporates all core [fleXive] features like security, versioning, multi-language support, and scripting.