Haystack is a tool designed to let every individual manage all of their information in the way that makes the most sense to them. By removing the arbitrary barriers created by applications that handle only certain information "types" and record only a fixed set of relationships defined by the developer, it lets users define whichever arrangements of, connections between, and views of information they find most effective. Such personalization of information management is intended to improve users' ability to find what they need when they need it.
Red-Piranha is a search system that can learn what you are looking for. It can be used through a Web page, the command line, or an XML Web service, so it works with most languages, including Java, Perl, C#/.NET, and PHP. Its learning abilities apply to the Desktop/Internet search functionality. All feedback from the user is stored in (editable) XML and RDF, and is used by the system to improve the quality of searches.
Nabu is a simple framework that extracts chunks of various types of information from plain text documents (written with reStructuredText conventions and parsed with docutils) and stores this information, along with the document itself, in a remote database for later retrieval. Processing and extraction are handled on a server; a small, simple client pushes the files to the server for processing and storage. The client requires only Python to work. The presentation layer is left unspecified: you can use whichever Web application framework you like.
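The extract-and-store flow described for Nabu can be sketched roughly as follows. This is a minimal illustration, not Nabu's actual code: it uses a regular expression in place of docutils to pick out reStructuredText field-list entries, and an in-memory SQLite database in place of the remote database. The names `extract_fields`, `store`, and the table layout are hypothetical.

```python
import re
import sqlite3

# Matches reStructuredText field-list lines such as ":Author: Jane Doe".
# Assumption: a regex stands in for the docutils parser here.
FIELD_RE = re.compile(r"^:(?P<name>[^:\n]+):[ \t]*(?P<value>.+)$", re.MULTILINE)

def extract_fields(source):
    """Return (kind, value) pairs for every field-list entry in the text."""
    return [(m.group("name").strip().lower(), m.group("value").strip())
            for m in FIELD_RE.finditer(source)]

def store(conn, doc_id, source):
    """Store the document and its extracted chunks; return the chunk count."""
    conn.execute("CREATE TABLE IF NOT EXISTS documents "
                 "(id TEXT PRIMARY KEY, source TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS chunks "
                 "(doc_id TEXT, kind TEXT, value TEXT)")
    conn.execute("INSERT OR REPLACE INTO documents VALUES (?, ?)",
                 (doc_id, source))
    # Re-pushing a document replaces its previously extracted chunks.
    conn.execute("DELETE FROM chunks WHERE doc_id = ?", (doc_id,))
    fields = extract_fields(source)
    conn.executemany("INSERT INTO chunks VALUES (?, ?, ?)",
                     [(doc_id, kind, value) for kind, value in fields])
    conn.commit()
    return len(fields)

SAMPLE = """\
Trip notes
==========

:Author: Jane Doe
:Date: 2006-03-14

Some body text.
"""

conn = sqlite3.connect(":memory:")  # stand-in for the remote database
n = store(conn, "trip-notes", SAMPLE)
rows = conn.execute("SELECT kind, value FROM chunks ORDER BY kind").fetchall()
```

Keeping the original document next to the extracted chunks mirrors the description above, where both are stored for later retrieval by whatever presentation layer is chosen.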
Evergreen is an integrated library system originally developed by the Georgia PINES consortium for use as its automation system; it now includes contributions from around the world. It was designed from scratch for large-scale deployment in very large public library and state-wide consortium environments with tens of millions of records and hundreds of libraries, but it can also scale down to the smallest single-branch libraries.
Wikidbase is a flexible combination of two structural extremes of data management systems: a wiki and a database. As such, it has the flexibility of a wiki (e.g. any kind of unstructured data can be stored) and the structural data capabilities of a database (e.g. data may be modelled similarly to database fields, tables, and relations, so that structured reports can be made from that data). The functionality is combined in such a way that this general-purpose data management system can be shaped easily, without the need for a database expert, into a custom data management application such as a contact relation management (CRM) system, a knowledge base, a shared calendar system, or a project management system.
JumpBox for Alfresco CMS is a JumpBox virtual appliance for the Alfresco content management system. Alfresco is an enterprise content management (ECM) system that provides features for document management, collaboration, records management, knowledge management, Web content management, and imaging. Alfresco aims to provide an ECM system that surpasses commercial ECMs in terms of features, functionality, and benefits to the user community.
DataCleaner is a data quality analysis tool that allows you to perform data profiling, validation, and minor ETL-like tasks. These activities help you administer and monitor your data quality to ensure that your data is useful and applicable to your business situation. It can be used for master data management (MDM) methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.
Talend Open Studio for Data Quality helps you profile your data. The ergonomic interface allows you to define metrics (indicators) and collect statistics on your data in a few clicks. It comes with a set of regular expressions that help you identify bad data. You can also create your own regular expressions and use them in data profiling analyses. Each indicator has many options that change its behavior so that it gives you more pertinent information. Data quality options on indicators alert you when your data quality is not what you expected.
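The idea of using regular expressions as profiling indicators can be illustrated with a short sketch. This is generic Python, not Talend's implementation or API; the pattern, the `profile_column` helper, and the sample data are all hypothetical and chosen only to show how a pattern-match indicator might count good, bad, and empty values.

```python
import re
from collections import Counter

# Hypothetical pattern for illustration; not one of the expressions
# shipped with any particular tool.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def profile_column(values, pattern):
    """Count values that match the pattern, fail to match, or are empty.

    Returns the counts and the list of non-matching ("bad") values.
    """
    counts = Counter(match=0, no_match=0, empty=0)
    bad = []
    for value in values:
        if value is None or value.strip() == "":
            counts["empty"] += 1
        elif pattern.fullmatch(value.strip()):
            counts["match"] += 1
        else:
            counts["no_match"] += 1
            bad.append(value)
    return dict(counts), bad

emails = ["ann@example.com", "bob@example", "", "carol@example.org", None]
stats, bad_rows = profile_column(emails, EMAIL_RE)

# A simple threshold check, standing in for an alert on an indicator.
alert = stats["no_match"] > 0
```

Here `bad_rows` collects the values that fail the pattern, so they can be inspected rather than only counted.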