Release Notes: Quick filtering of datastores was added. Reference data for countries is now provided. Minor UI improvements were made. Support for extension packages was added. A command line interface for executing jobs was added. Number formatting options were added in the "Convert to Number" transformer.
Release Notes: Window management was simplified by making most operations available through the single job builder window. Jobs can now be stopped before they have finished. Bar and line charts have been added to many analyzer results. Preview data now contains paging controls to browse further into the data. The most common database drivers are included by default. Various minor improvements and bugfixes were made.
Release Notes: First-time ease of use was improved by disabling all buttons before source data is selected. When possible, filters can now optimize a job's query. This was implemented for the "Max rows", "Equals", and "Not null" filters. The visualization of execution flow now allows removing column items and filter outcome items, making the graph more comprehensible, especially for very large jobs. A bug was fixed when passing null values to the email standardizer. "Mixed" tokens are now properly presented in the Pattern finder.
Release Notes: Minor bugfixes and improvements were made. Filter outcomes were added to the flow visualization. A bug was fixed in the widget for selecting the tokenizer's tokens. The "Equals" filter can now have multiple values with which to compare. Some minor cosmetic improvements were made.
Release Notes: Data transformations can be used to preprocess, extract, refine, combine, and calculate data items in jobs. Filtering, sampling, and subflow management allow you to define criteria that exclude and include particular items of data. Reporting was enriched with charts, graphs, navigation trees, etc. New DQ functions were added for date gap analysis, phonetic similarity finding, synonym lookups, etc. More options and DQ measures were added for existing data quality functions like the pattern finder, string analyzer, and more. Profiling jobs can be reused, so you can define your processing flow once and run it on any data. Support for MS Excel 2007+ spreadsheets was added.
Release Notes: The MetaModel version was updated to 1.2, which adds support for two new datastores: dBase databases (.dbf files) and MS Access databases (.mdb files). A bug that caused "file not found" errors for text-file dictionaries was fixed. Many of the other underlying libraries have been updated, providing improvements to performance and stability.
Release Notes: Memory use of the Value Distribution Profile was improved. It now does on-disk caching with Berkeley DB when necessary. The app is now a single JAR file that can be served through Java WebStart. The app automatically downloads regexes from the RegexSwap. A bug in matching number columns in dictionaries was fixed. A bug with invalid characters in XML formats was fixed. File suffix case is now ignored so that both .CSV and .csv files can be opened. The number of columns shown in the preview window is automatically restricted if there are too many to show on the screen.
Release Notes: An additional HTML export format has been added to the built-in export formats (usable when exporting Profiler results in the desktop app and when executing the runjob command-line tool). The export format can be chosen directly from the desktop app. Four new measures were added to the String Analysis profile: average characters and maximum/minimum/average white spaces.
Release Notes: The license was changed to LGPL. The profiler and validator can be executed using multiple threads. DataCleaner tasks can be executed from the command line for batch operation. More elaborate status information is given during profiler and validator execution. Date mask matcher and regex matcher profiles were added. A regex is loaded from the online RegexSwap repository. Popular database drivers are automatically downloaded and installed. More file types are supported, such as .dat and .txt. XML file support was improved. Memory improvements were made in the Time analysis profile. Logging when running profiling and validation was improved. An information schema is provided for file-based datastores. Columns in the datastore-tree are lazy-loaded.