DataCleaner is a data quality analysis tool that allows you to perform data profiling, validation, and minor ETL-like tasks. These activities help you administer and monitor your data quality to ensure that your data is useful and applicable to your business situation. It can be used for master data management (MDM) methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.
The Gaudí Database Visual Editor is a Java application that allows you to visually design the tables of a database using a JDBC 2.0 (or higher) driver. It saves generated diagrams in XML format. It also generates Java code that binds an object to a database table and XML code for generating GUIs.
The Informa RSS Library provides a convenient Java API for handling news channels and metadata about them. Different feed syntax formats (such as RSS 0.91, 1.0 [RDF], 2.0, and Atom 0.3) are supported. There is also basic support for channel information descriptions (OPML). A full-text engine (Lucene) can be used for indexing and searching the news items. Two backends for storing the data are currently provided: In-Memory and Hibernate (which allows you to persist news items into almost any JDBC-compliant database).
K.E.T.T.L.E (Kettle ETTL Environment) is a metadata-driven ETL (Extraction, Transformation, Transportation, and Loading) tool. This means that no code has to be written to perform complex data transformations. It is possible to create plugins to do custom transformations or access proprietary data sources. Kettle supports most databases on the market, and has native support for slowly changing dimensions on most database platforms.
Kraken is an application for managing knowledge objects, which can be documents, remote or locally cached Web pages, personal information, todo list items, appointments, and so on. It is especially useful for researchers or students to manage their information. Users can annotate these knowledge objects with metadata, perform complex queries, and present the results as HTML pages. Kraken uses RDF as its native format, allowing its data to be easily read by external applications.
The OMCSNet-WordNet project aims to improve the quality of the OMCSNet dataset by using automated processes to map WordNet synonym sets to OMCSNet concepts and import additional semantic linkage data from WordNet. It is based on OMCSNet 1.2, a semantic network and inference toolkit written in Python/Java. OMCSNet currently contains over 280,000 separate pieces of common sense information extracted from the raw OMCS dataset. This project is also based on WordNet, an online lexical reference system that in recent years has become a popular tool for AI researchers.
OpenMKS is a tool for searching and navigating multimedia collections of images, video, 3D models, and textual descriptions. It allows you to semantically integrate content from various sources and then expose the result using standards so it can be explored by users through their Web browsers or by other software applications over the Web as a servlet or Web Service.