Talend Open Studio for Data Quality helps you profile your data. The ergonomic interface allows you to define metrics (indicators) and collect statistics on your data in a few clicks. It comes with a set of regular expressions that help you identify bad data. You can also create your own regular expressions and use them in data profiling analyses. Each indicator has many options that change its behavior so that it gives you more pertinent information. Data quality options on indicators alert you when your data quality is not what you expected.
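As a rough illustration of what a regular-expression indicator with a quality threshold might compute, here is a minimal Python sketch. It is not Talend's implementation or API: the function name `pattern_indicator`, the `min_match_ratio` option, and the simplified e-mail pattern are all assumptions made for this example.

```python
import re


def pattern_indicator(values, pattern, min_match_ratio=1.0):
    """Count non-null values that fully match `pattern` and flag a low match ratio.

    Hypothetical helper for illustration only; not part of any Talend API.
    """
    regex = re.compile(pattern)
    non_null = [v for v in values if v is not None]
    matched = sum(1 for v in non_null if regex.fullmatch(v))
    total = len(non_null)
    # With no data to judge, treat the column as passing.
    ratio = matched / total if total else 1.0
    return {
        "total": total,
        "matched": matched,
        "unmatched": total - matched,
        "ratio": ratio,
        # The alert fires when fewer values match than expected.
        "alert": ratio < min_match_ratio,
    }


# Sample column with one malformed address and one null value.
emails = ["ann@example.com", "bob@example", None, "carol@example.org"]

# Deliberately simplified e-mail pattern (an assumption, not a full validator).
result = pattern_indicator(emails, r"[^@\s]+@[^@\s]+\.[^@\s]+", min_match_ratio=0.9)
```

Here two of the three non-null values match the pattern, so the match ratio falls below the assumed 0.9 threshold and the `alert` flag is set.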
With MetaModel, you use a type-safe SQL-like API for querying any datastore. It is a data access framework providing a common interface for exploration and querying of different types of datastores. It isn't a data mapping framework. Instead, it emphasizes abstraction of metadata and the ability to add data sources at runtime, making MetaModel great for generic data processing applications, but less so for applications modeled around a particular domain.
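The idea of one query interface over several kinds of datastores, with sources added at runtime, can be sketched in a few lines of Python. This is only a conceptual illustration and does not reproduce MetaModel's actual API: the classes `CsvStore` and `ListStore` and the function `select` are hypothetical names invented for this example.

```python
import csv
import io


class CsvStore:
    """Hypothetical datastore backed by CSV text."""

    def __init__(self, text):
        self._rows = list(csv.DictReader(io.StringIO(text)))

    def columns(self):
        # Metadata is discovered from the data, not from a domain model.
        return list(self._rows[0].keys()) if self._rows else []

    def rows(self):
        return iter(self._rows)


class ListStore:
    """Hypothetical datastore backed by in-memory dictionaries."""

    def __init__(self, records):
        self._rows = list(records)

    def columns(self):
        return sorted({key for row in self._rows for key in row})

    def rows(self):
        return iter(self._rows)


def select(store, columns, where=None):
    """Run the same projection and filter against any store exposing columns() and rows()."""
    unknown = [c for c in columns if c not in store.columns()]
    if unknown:
        raise KeyError(f"unknown columns: {unknown}")
    return [
        tuple(row.get(c) for c in columns)
        for row in store.rows()
        if where is None or where(row)
    ]


# Datastores are registered at runtime under arbitrary names.
stores = {}
stores["people_csv"] = CsvStore("name,city\nAnn,Oslo\nBob,Rome\n")
stores["people_mem"] = ListStore([{"name": "Cy", "city": "Oslo"}])

# The same query runs unchanged against every registered store.
results = {
    name: select(store, ["name"], where=lambda row: row["city"] == "Oslo")
    for name, store in stores.items()
}
```

Because `select` relies only on discovered metadata (`columns()`) rather than on domain classes, it fits generic data processing better than an application built around one fixed domain model, which is the trade-off the description above points to.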
DataCleaner is a data quality analysis tool that allows you to perform data profiling, validation, and minor ETL-like tasks. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation. It can be used for master data management (MDM) methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.
Haystack is a powerful tool designed to let every individual manage all of their information in the way that makes the most sense to them. Haystack removes the arbitrary barriers created by applications that handle only certain information "types" and record only a fixed set of relationships defined by the developer, so users can define whichever arrangements of, connections between, and views of information they find most effective. Such personalization of information management will dramatically improve users' ability to find what they need when they need it.
Evergreen is an integrated library system originally developed by the Georgia PINES consortium for use as their automation system, and now includes contributions from around the world. It was designed from scratch for large-scale deployment in very large public library and state-wide consortium environments with tens of millions of records and hundreds of libraries, but can also scale down to the smallest of single-branch libraries.
Wikidbase is a powerful and highly flexible combination of two structural extremes of data management systems: a wiki and a database. As such, it has all of the flexibility of a wiki (e.g., any kind of unstructured data can be stored) and the structural data capabilities of a database (e.g., data may be modelled similarly to database fields, tables, and relations, such that structural reports can be made of that data). The functionality is combined in such a way that this general-purpose data management system may be shaped easily, without the need of a database expert, into a custom data management application such as a contact relation management (CRM) system, a knowledge base, a shared calendar system, a project management system, etc.
Red-Piranha is a search system that can actually learn what you are looking for. It can be used as a Web page, from the command line, or as an XML Web service, so it will work with most languages, including Java, Perl, C#/.NET, and PHP. It includes learning abilities for the Desktop/Internet search functionality. All feedback from the user is stored in (editable) XML and RDF, and is used by the system to improve the quality of searches.