Talend Open Studio for Data Quality helps you profile your data. The ergonomic interface lets you define metrics (indicators) and collect statistics on your data in a few clicks. It comes with a set of regular expressions that help you identify bad data. You can also create your own regular expressions and use them in data profiling analyses. Each indicator has many options that change its behavior so that it gives you more pertinent information. Data quality options on indicators alert you when your data quality is not what you expected.
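As a rough illustration of what a regular-expression indicator with a quality threshold does, here is a minimal Python sketch. It is not Talend's API: the function `pattern_match_indicator`, the `min_match_ratio` option, the sample values, and the email pattern are all assumptions made up for this example.

```python
import re

def pattern_match_indicator(values, pattern, min_match_ratio=1.0):
    """Count values that fully match `pattern`; flag an alert below the threshold.

    Hypothetical helper for illustration only, not part of any Talend product.
    """
    regex = re.compile(pattern)
    non_null = [v for v in values if v is not None]
    matching = sum(1 for v in non_null if regex.fullmatch(v))
    # With no non-null values there is nothing to fail, so report a ratio of 1.0.
    ratio = matching / len(non_null) if non_null else 1.0
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "match_count": matching,
        "no_match_count": len(non_null) - matching,
        "match_ratio": ratio,
        "alert": ratio < min_match_ratio,
    }

# Sample column and a deliberately loose email pattern (both assumptions).
emails = ["ann@example.com", "bob@example", None, "carol@example.org"]
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"

report = pattern_match_indicator(emails, EMAIL_PATTERN, min_match_ratio=0.9)
print(report)
```

Here the threshold plays the role of a data quality option: the indicator still reports its counts, and the `alert` flag is raised only when the share of matching values falls below what was expected.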
With MetaModel, you use a type-safe SQL-like API for querying any datastore. It is a data access framework providing a common interface for exploration and querying of different types of datastores. It isn't a data mapping framework. Instead, it emphasizes abstraction of metadata and the ability to add data sources at runtime, making MetaModel great for generic data processing applications, but less so for applications modeled around a particular domain.
mod_musicindex is an Apache module aimed at being a C alternative to the Perl module Apache::MP3. It presents directories containing MP3, Ogg Vorbis, FLAC, or MP4/AAC files as clean listings, and supports sorting them on various fields, streaming or downloading them, constructing playlists, and searching. It also provides features such as RSS and Podcast feeds, multiple CSS support, and archive downloads.
DataCleaner is a data quality analysis tool that allows you to perform data profiling, validation, and minor ETL-like tasks. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation. It can be used for master data management (MDM) methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.
dbacl is a digramic Bayesian text classifier. Given some text, it calculates the posterior probabilities that the input resembles one of any number of previously learned document collections. It can be used to sort incoming email into arbitrary categories such as spam, work, and play, or simply to distinguish an English text from a French text. It fully supports international character sets, and uses sophisticated statistical models based on the Maximum Entropy Principle.
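To make the idea of posterior probabilities over learned collections concrete, here is a minimal Python sketch of a Bayesian classifier over word pairs (digrams). It is not dbacl's implementation: it assumes a plain naive Bayes model with add-one smoothing and equal prior probabilities instead of the maximum entropy models mentioned above, and the function names and training sentences are invented for the example.

```python
import math
from collections import Counter

def digrams(text):
    """Split text into lowercase words and return consecutive word pairs."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

def train(collections):
    """Map each category name to a Counter of digram frequencies."""
    return {
        name: Counter(pair for doc in docs for pair in digrams(doc))
        for name, docs in collections.items()
    }

def posteriors(text, models, alpha=1.0):
    """Return P(category | text), assuming equal priors and add-alpha smoothing."""
    vocab = set().union(*models.values())
    vocab_size = len(vocab) + 1  # one extra slot for digrams never seen in training
    log_scores = {}
    for name, counts in models.items():
        total = sum(counts.values())
        log_scores[name] = sum(
            math.log((counts[pair] + alpha) / (total + alpha * vocab_size))
            for pair in digrams(text)
        )
    # Normalize in log space to avoid underflow on long texts.
    top = max(log_scores.values())
    weights = {name: math.exp(score - top) for name, score in log_scores.items()}
    norm = sum(weights.values())
    return {name: weight / norm for name, weight in weights.items()}

# Tiny invented training collections, for illustration only.
models = train({
    "english": ["the cat sat on the mat", "the dog sat on the rug"],
    "french": ["le chat est sur le tapis", "le chien est sur le lit"],
})
result = posteriors("the cat sat on the rug", models)
print(result)
```

The returned probabilities sum to one across categories, so the same function can rank any number of learned collections, whether they represent languages or mail folders.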
Evergreen is an integrated library system originally developed by the Georgia PINES consortium for use as its automation system, and it now includes contributions from around the world. It was designed from scratch for large-scale deployment in very large public library and state-wide consortium environments with tens of millions of records and hundreds of libraries, but it can also scale down to the smallest of single-branch libraries.
The Gaudí Database Visual Editor is a Java application that allows you to visually design the tables of a database using a JDBC 2.0 (or higher) driver. It saves generated diagrams in XML format. It also generates Java code that binds an object to a database table, as well as XML code for generating GUIs.
Config-Model provides a framework for editing and validating the content of any configuration file or data. With a configuration model (expressed in a data structure), Config-Model provides a user interface and a tool to validate configuration. An optional graphical (Perl/Tk) or curses interface can be used to edit configuration data that will be validated according to the user-provided model. Config-Model includes a model example for fstab and a small fstab demo.
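The idea of validating configuration data against a model expressed as a data structure can be sketched in a few lines of Python. This is not Config-Model's Perl API or its model format: the `FSTAB_MODEL` layout, the element types, the allowed filesystem names, and the `validate` function are assumptions chosen only to illustrate the approach on an fstab-like entry.

```python
# Hypothetical model: each element declares a type and optional constraints.
FSTAB_MODEL = {
    "fs_spec": {"type": "string", "mandatory": True},
    "fs_file": {"type": "string", "mandatory": True},
    "fs_vfstype": {
        "type": "enum",
        "choices": ["ext4", "xfs", "swap", "proc", "auto"],
        "mandatory": True,
    },
    "fs_freq": {"type": "integer", "min": 0, "max": 1},
    "fs_passno": {"type": "integer", "min": 0, "max": 2},
}

def validate(entry, model):
    """Return a list of error messages; an empty list means the entry is valid."""
    errors = [f"unknown element '{key}'" for key in entry if key not in model]
    for name, spec in model.items():
        if name not in entry:
            if spec.get("mandatory"):
                errors.append(f"missing mandatory element '{name}'")
            continue
        value = entry[name]
        if spec["type"] == "string" and not isinstance(value, str):
            errors.append(f"'{name}' must be a string")
        elif spec["type"] == "enum" and value not in spec["choices"]:
            errors.append(f"'{name}' must be one of {spec['choices']}")
        elif spec["type"] == "integer":
            # bool is a subclass of int in Python, so reject it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"'{name}' must be an integer")
            elif not spec.get("min", value) <= value <= spec.get("max", value):
                errors.append(f"'{name}' is out of range")
    return errors

good = {"fs_spec": "/dev/sda1", "fs_file": "/", "fs_vfstype": "ext4", "fs_passno": 1}
bad = {"fs_spec": "/dev/sda2", "fs_vfstype": "ntfz", "fs_passno": 7}

print(validate(good, FSTAB_MODEL))
print(validate(bad, FSTAB_MODEL))
```

Because the rules live in the model rather than in the checking code, the same `validate` function could be pointed at a different model for a different configuration file, which is the separation the description above relies on.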